🎉First release of skrub 0.1.0 http://skrub-data.org
Couple dataframes and databases to machine learning to facilitate data prep
✨Less data wrangling, more machine learning✨
This is a young project that I am very excited about:
🧵👇
1/8
Absolutely loving these little API diagrams from the scikit-learn MOOC (https://inria.github.io/scikit-learn-mooc/).
It was also already worth the price of admission to see @GaelVaroquaux get handed some irises as a prop to explain the iris dataset! 🤣👏👏👏
New Blog Post! Running Python Parallel Applications with Sub Interpreters. What is a sub interpreter? What does this have to do with No-GIL? What use is this? Can I use it for web apps? This and more questions answered. https://tonybaloney.github.io/posts/sub-interpreter-web-workers.html #python #django #flask #fastapi
🎉 Tool for better documentation!! Release of sphinx-gallery, to automatically integrate narrative 🐍 examples into documentation
https://sphinx-gallery.github.io/stable/index.html
Highlight: a light recommender system to show related examples
An illustration of sphinx-gallery:
https://scikit-learn.org/dev/auto_examples/inspection/plot_linear_model_coefficient_interpretation.html
(from @sklearn 's gallery). Note the links to function docs.
Sphinx-gallery comes with awesome features such as
◼online execution with binder or jupyterlite
◼mini-galleries, e.g. to link an object's docstring to its examples
Sampling bias in practice: conducting a survey on the Paris metro platform...
if you ask people where they get off, you'll get a different distribution depending on where on the platform you stand: people choose their position to be close to the exit at their arrival station.
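The effect can be illustrated with a toy simulation. This is only a sketch under made-up assumptions: the three station names, their exit positions along the platform, and the noise level are all hypothetical, not measurements from the Paris metro.

```python
import random
from collections import Counter

random.seed(0)  # fixed seed so the toy example is reproducible

# Hypothetical line: where the exit sits along the platform at each
# destination station (0.0 = rear of the train, 1.0 = front).
EXIT_POSITION = {"Station A": 0.1, "Station B": 0.5, "Station C": 0.9}


def simulate_riders(n):
    """Return (position on platform, destination) pairs for n riders."""
    stations = list(EXIT_POSITION)
    riders = []
    for _ in range(n):
        destination = random.choice(stations)  # destinations are uniform
        # Assumption: riders wait roughly where their exit will be on arrival.
        position = random.gauss(EXIT_POSITION[destination], 0.15)
        position = min(1.0, max(0.0, position))  # stay on the platform
        riders.append((position, destination))
    return riders


def survey(riders, where, radius=0.1):
    """Share of each destination among riders standing near `where`."""
    nearby = [dest for pos, dest in riders if abs(pos - where) <= radius]
    counts = Counter(nearby)
    total = sum(counts.values())
    return {station: counts[station] / total for station in EXIT_POSITION}


riders = simulate_riders(10_000)
rear = survey(riders, where=0.1)   # surveyor stands at the rear
front = survey(riders, where=0.9)  # surveyor stands at the front
print("rear: ", rear)
print("front:", front)
```

Although every destination is equally likely in the simulated population, the surveyor at the rear mostly meets riders bound for Station A and the one at the front mostly meets riders bound for Station C.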
I’ll be giving the online lecture on "Representation learning on relational data to automate data preparation" on November 15th, 7pm EEST at AIHouse Ukraine.
Join the lecture, learn and support Ukraine
https://aihouse.org.ua/en/ai-for-ukraine/
📑 "healthwashing": verb [ I or T ]
to make people believe that your computer-science grant or paper is about trying to improve health, while it really is an excuse to do maths and maybe you have a few biomedical signals on a thumb drive
⚕️💻
cloudpickle 3.0.0 is out!
https://github.com/cloudpipe/cloudpickle
cloudpickle is a library used by PySpark, Dask, Ray and joblib / loky (among others) to make it possible to call dynamically or locally defined functions, closures and lambdas on remote Python worker processes.
This is typically necessary for running code in parallel on a distributed computing cluster from an interactive development environment such as a Jupyter or Databricks notebook.
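A minimal sketch of the problem cloudpickle addresses: the standard library `pickle` refuses a locally defined closure, while `cloudpickle.dumps` serializes it by value. The helper `make_scaler` is a hypothetical name used only for illustration, and the cloudpickle part is skipped if the package is not installed.

```python
import pickle


def make_scaler(factor):
    # Locally defined closure: it has no importable module-level name,
    # which is what the standard pickle module relies on.
    return lambda x: x * factor


double = make_scaler(2)

# The standard library pickles functions by reference, so this fails.
try:
    pickle.dumps(double)
    stdlib_ok = True
except (pickle.PicklingError, AttributeError):
    stdlib_ok = False
print("stdlib pickle handled the closure:", stdlib_ok)

# cloudpickle is a third-party package; guard the import so the
# sketch still runs without it.
try:
    import cloudpickle
except ImportError:
    cloudpickle = None

if cloudpickle is not None:
    payload = cloudpickle.dumps(double)  # bytes that could be sent to a worker
    restored = pickle.loads(payload)     # plain pickle.loads can read them
    print("restored(21) =", restored(21))
```

In a real cluster the bytes would be sent to a worker process, which also needs cloudpickle installed to load them.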
🔴 19 fixes
😃 74 contributors
📢 Bugfix release - scikit-learn 1.3.1 is out!
More details in the changelog: https://bit.ly/3rpA33J
You can upgrade with pip as usual:
pip install -U scikit-learn
or using the conda-forge builds:
conda install -c conda-forge scikit-learn
Thanks to all the contributors!
#data #Python #software #ML #opensource #pydata #scipy #sklearn #machinelearning
🤖 I am honored to have been appointed to the government-level panel of experts on AI 🇫🇷.
We are tasked with suggesting a national vision and strategy for France.
The panel is made up of experts on different topics (economics, law, computer science) from academia, industry, and non-profits.
https://gael-varoquaux.info/science/comite-de-lintelligence-artificielle-vision-et-strategie-nationale.html