rcap107/README.md

Hello! I'm Riccardo.

I am a research engineer working on the skrub library at Inria Saclay.

I am mostly interested in tabular data cleaning and processing, information retrieval, feature selection, and how to optimize machine learning tasks that build on them. I have also worked with word and graph embeddings, graph neural networks, and their application to data curation tasks.

I work mostly with Python and its data science libraries (Polars, Pandas, scikit-learn, skrub, numpy, matplotlib, seaborn...).

My main projects are:

If you want to reach out, you can contact me on LinkedIn or send an email to riccardo[dot]cappuzzo[at]protonmail[dot]me.

Pinned

  1. embdi (Public)

    EmbDI is a table embedding algorithm for data integration problems: it converts tabular data into a graph, then applies word2vec to the graph to obtain embeddings.

    Jupyter Notebook 7 1

  2. retrieve-merge-predict (Public)

    Jupyter Notebook 9 2

  3. YADL (Public)

    Jupyter Notebook 2