Python

Python is a dynamically typed programming language.

Practical AI #138

Multi-GPU training is hard (without PyTorch Lightning)

William Falcon wants AI practitioners to spend more time on model development, and less time on engineering. PyTorch Lightning is a lightweight PyTorch wrapper for high-performance AI research that lets you train on multiple GPUs, TPUs, CPUs, and even in 16-bit precision without changing your code! In this episode, we dig deep into Lightning, how it works, and what it is enabling. William also discusses the Grid AI platform (built on top of PyTorch Lightning). This platform lets you seamlessly train hundreds of machine learning models on the cloud from your laptop.

Opensource.com

How I teach Python on the Raspberry Pi 400 at the public library

Don Watkins:

Mark Van Doren said, “the art of teaching is the art of assisting discovery.” I saw that play out in this classroom using open source tools. More students need opportunities like this to help them gain a quality education. The Raspberry Pi 400 is a great form factor for teaching and learning.

Such a cool program that’d be easy to reproduce in your local library.

The Changelog #444

Every commit is a gift

Maintainer Week is finally here and we’re excited to make this an annual thing! If Maintainer Week is new to you, check out episode #442 with Josh Simmons and Kara Sowles.

Today we’re talking with Brett Cannon. Brett is Dev Manager of the Python Extension for VS Code, Python Steering Council Member, and core team member for Python. He recently shared a blog post, “The social contract of open source,” so we invited Brett to join us for Maintainer Week to discuss this topic in detail.

Thank a maintainer on us! We’re printing a limited run t-shirt that’s free for maintainers, and all you gotta do is thank them, today!

MongoDB github.com

Mongita is to MongoDB as SQLite is to SQL

Mongita is a lightweight embedded document database that implements a commonly-used subset of the MongoDB/PyMongo interface. Mongita differs from MongoDB in that instead of being a server, Mongita is a self-contained Python library. Mongita can be configured to store its documents either on disk or in memory.

I can’t speak to the implementation, but I love the idea behind this project. Already know and love Mongo? Here’s a way to use it in an embedded fashion with all of the advantages that come with such an architecture…

Coronavirus github.com

A bot to notify you when vaccine appointments are available

Supports checking Hy-Vee, Cosentino’s stores (KC), Ball’s stores (KC), Rapid Test KC, and locations checked by VaccineSpotter (including Walmart, Walgreens, CVS, Costco).

Supports sending notifications to Slack, Discord, Microsoft Teams, Twilio, and Twitter.

Notifications are sent when a location has appointments. No more notifications are sent for that location until it becomes unavailable again.
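
That notify-once rule is a nice little pattern on its own. Here’s a minimal sketch of it, not the bot’s actual code: the `AvailabilityNotifier` class, its `send` callback, and the "Store A" location are all hypothetical names I made up for illustration.

```python
from typing import Callable, Dict


class AvailabilityNotifier:
    """Notify once per availability window for each location (hypothetical helper)."""

    def __init__(self, send: Callable[[str], None]) -> None:
        self._send = send                      # assumed notification callback (Slack, Discord, ...)
        self._available: Dict[str, bool] = {}  # last known state per location

    def update(self, location: str, has_appointments: bool) -> bool:
        """Record the latest check; return True if a notification was sent."""
        was_available = self._available.get(location, False)
        self._available[location] = has_appointments
        # Only the transition from unavailable to available triggers a message.
        if has_appointments and not was_available:
            self._send(f"Appointments available at {location}")
            return True
        return False


sent = []
notifier = AvailabilityNotifier(sent.append)
results = [
    notifier.update("Store A", True),   # newly available: notify
    notifier.update("Store A", True),   # still available: stay quiet
    notifier.update("Store A", False),  # unavailable again: reset
    notifier.update("Store A", True),   # available again: notify
]
```

Tracking only the previous state per location is enough here, because the rule cares about the transition rather than the history.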

Bitcoin cryptomaton.org

How to create a trading bot that buys Bitcoin when Elon Musk tweets about it

Why do something like this? For the fun of it, mostly. Definitely not for this reason:

By creating a crypto trading bot that buys bitcoin every time the Tesla boss tweets about it you can rest assured that you are going to catch a VIP seat on the rocket that will slingshot right past the moon and make its way directly to Mars, where Elon spends most of the summer months due to its cold weather and dry climate.

Lulz aside, I love posts like this because they demonstrate how someone tied together a bunch of disparate things (Twitter API, trading API, regular expressions, etc.) to accomplish a real thing, no matter how silly/foolish that real thing is.

Also check out part 2 where he adds sentiment analysis. (Although, it’s hard for me –a human– to decipher Elon Musk’s tweets, so the results of said analysis are probably no better than flipping a coin.)
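
To give a feel for the glue involved, here’s a rough sketch of the regex-then-trade step. It is not the code from the post: the pattern, `handle_tweet`, the `place_order` callback, and the dollar amount are assumptions for illustration, and no real Twitter or exchange API is called.

```python
import re
from typing import Callable

# Matches "bitcoin" or the ticker "BTC" as whole words, case-insensitively.
BITCOIN_PATTERN = re.compile(r"\b(bitcoin|btc)\b", re.IGNORECASE)


def mentions_bitcoin(tweet_text: str) -> bool:
    """Return True if the tweet text mentions Bitcoin."""
    return BITCOIN_PATTERN.search(tweet_text) is not None


def handle_tweet(
    tweet_text: str,
    place_order: Callable[[str, float], None],  # hypothetical stand-in for a trading API call
    amount_usd: float = 10.0,
) -> bool:
    """Place a buy order when the tweet mentions Bitcoin; return whether we bought."""
    if mentions_bitcoin(tweet_text):
        place_order("BTC", amount_usd)
        return True
    return False


orders = []


def fake_place_order(symbol: str, amount_usd: float) -> None:
    # Records the order instead of touching a real exchange.
    orders.append((symbol, amount_usd))


handle_tweet("Bitcoin is my safe word", fake_place_order)
handle_tweet("Starship launch went great", fake_place_order)
```

Passing the order function in as a callback keeps the matching logic testable without API keys (or real money).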

Nikita Sobolev sobolevn.me

Make tests a part of your app

Here’s a pretty useful idea for library authors and their users: there are better ways to test your code!

I give three examples of how user projects can be self-tested without the end user writing any real test cases. One is a hypothetical about Django, and two are real and working, featuring deal and dry-python/returns. A brief example with deal:

import deal

@deal.pre(lambda a, b: a >= 0 and b >= 0)
@deal.raises(ZeroDivisionError)  # this function can raise if `b=0`, it is ok
def div(a: int, b: int) -> float:
    if a > 50:  # Custom, in real life this would be a bug in our logic:
        raise Exception('Oh no! Bug happened!')
    return a / b

This bug can be automatically found by writing a single line of test code: test_div = deal.cases(div). As easy as it gets! From this article you will learn:

  • How to use property-based testing on the next level
  • How a simple decorator @deal.pre(lambda a, b: a >= 0 and b >= 0) can help you to generate hundreds of test cases with almost no effort
  • What “Monad laws as values” is all about and how dry-python/returns helps its users to build their own monads

I really like this idea! And I would appreciate your feedback on it.
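
If you’re wondering how a contract can turn into test cases, here’s a standard-library-only sketch of the general idea. It is not how deal is implemented: `find_contract_violation`, its input range, and its parameters are hypothetical, and a real tool would generate inputs far more cleverly than uniform random integers.

```python
import random


def find_contract_violation(func, pre, allowed_exceptions, trials=1000, seed=0):
    """Call func on random int pairs that satisfy `pre`.

    Returns the first (args, exception) pair where func raised something
    outside `allowed_exceptions`, or None if no such input was found.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        args = (rng.randint(-100, 100), rng.randint(-100, 100))
        if not pre(*args):
            continue  # the precondition rules this input out
        try:
            func(*args)
        except allowed_exceptions:
            pass  # declared in the contract, so not a bug
        except Exception as exc:
            return args, exc  # undeclared exception: contract violated
    return None


def div(a, b):
    # Same planted bug as the example above, minus the deal decorators.
    if a > 50:
        raise Exception('Oh no! Bug happened!')
    return a / b


violation = find_contract_violation(
    div,
    pre=lambda a, b: a >= 0 and b >= 0,
    allowed_exceptions=(ZeroDivisionError,),
)
```

The precondition filters the inputs and the declared exceptions filter the failures, so whatever is left over is a genuine bug report.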

Python github.com

A PyTorch-based speech toolkit

SpeechBrain is an open-source and all-in-one speech toolkit based on PyTorch.

The goal is to create a single, flexible, and user-friendly toolkit that can be used to easily develop state-of-the-art speech technologies, including systems for speech recognition, speaker recognition, speech enhancement, multi-microphone signal processing and many others.

Currently in beta.

Python github.com

`whereami` uses WiFi signals & ML to locate you (within 2-10 meters)

If you’re adventurous and you want to learn to distinguish between couch #1 and couch #2 (i.e. 2 meters apart), it is the most robust when you switch locations and train in turn. E.g. first in Spot A, then in Spot B then start again with A. Doing this in spot A, then spot B and then immediately using “predict” will yield spot B as an answer usually. No worries, the effect of this temporal overfitting disappears over time. And, in fact, this is only a real concern for the very short distances. Just take a sample after some time in both locations and it should become very robust.

The linked project was “almost entirely copied” from the find project, which was written in Go. It then went on to inspire whereami.js. I bet you can guess what that is.

Dropbox Tech Blog

Our journey from a Python monolith to a managed platform

Dropbox Engineering tells the tale of their new SOA:

The majority of software developers at Dropbox contribute to server-side backend code, and all server side development takes place in our server monorepo. We mostly use Python for our server-side product development, with more than 3 million lines of code belonging to our monolithic Python server.

It works, but we realized the monolith was also holding us back as we grew.

This is an excellent, deep re-telling of their goals, decisions, setbacks, and progress. Here’s the major takeaway, if you don’t have time for a #longread:

The single most important takeaway from this multi-year effort is that well-thought-out code composition, early in a project’s lifetime, is essential. Otherwise, technical debt and code complexity compounds very quickly.

Python github.com

A semantic diffing tool for tree-like structures (JSON, XML, HTML, etc)

Graphtage is a commandline utility and underlying library for semantically comparing and merging tree-like structures, such as JSON, XML, HTML, YAML, plist, and CSS files. Its name is a portmanteau of “graph” and “graftage”—the latter being the horticultural practice of joining two trees together such that they grow as one.
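
To show what “semantic” buys you over a line-based diff, here’s a toy sketch for JSON. It is far simpler than Graphtage and does not use its API: `semantic_diff` and the sample documents are hypothetical, and same-length lists are just compared position by position.

```python
import json


def semantic_diff(old, new, path="$"):
    """Return a list of human-readable differences between two JSON values."""
    if isinstance(old, dict) and isinstance(new, dict):
        changes = []
        # Key order is irrelevant in JSON objects, so compare by key.
        for key in sorted(old.keys() | new.keys()):
            child = f"{path}.{key}"
            if key not in new:
                changes.append(f"removed {child}")
            elif key not in old:
                changes.append(f"added {child}")
            else:
                changes.extend(semantic_diff(old[key], new[key], child))
        return changes
    if isinstance(old, list) and isinstance(new, list) and len(old) == len(new):
        changes = []
        for i, (a, b) in enumerate(zip(old, new)):
            changes.extend(semantic_diff(a, b, f"{path}[{i}]"))
        return changes
    # Scalars, mismatched types, or lists of different lengths.
    return [] if old == new else [f"changed {path}: {old!r} -> {new!r}"]


before = json.loads('{"name": "demo", "tags": ["a", "b"], "version": 1}')
after = json.loads('{"version": 2, "name": "demo", "tags": ["a", "b"], "license": "MIT"}')
changes = semantic_diff(before, after)
```

A text diff would flag the reordered keys as changes; comparing the parsed trees reports only the added `license` key and the bumped `version`.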

Ines Montani github.com

Introducing spaCy 3.0

You may recall spaCy from this episode of Practical AI with its creators. If not, now’s a great time to introduce yourself to the project. 3.0 looks like a fantastic new release of the wildly popular NLP library. The list of new and improved things is too long for me to reproduce here, so go check it out for yourself.

There are also three YouTube videos accompanying the release. That’s evidence of just how much effort and polish went into this.

Python pip.pypa.io

Pip has dropped support for Python 2

This has been in the works for ~2 years now and finally dropped on January 23rd, 2021. It’s amazing how much work it takes to upgrade a community as large and broadly-interested as Python’s.

Getting the de facto tool for installing packages off Python 2 seems like a pretty big moment in that effort to me, but I’m only a casual observer/fan of the language. I’d love to hear from folks who use Python on the daily. Is this a big deal?

Python github.com

Apache Superset – a data visualization and data exploration platform

Superset can query data from any SQL-speaking datastore or data engine (e.g. Presto or Athena) that has a Python DB-API driver and a SQLAlchemy dialect.

This has been around long enough to be picked up by the Apache Foundation, but somehow it’s avoided my radar until today. The visualizations you can achieve with it are impressive, to say the least.
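
If “Python DB-API driver” doesn’t ring a bell, here’s a small sketch of what that interface looks like, using the standard library’s `sqlite3` module (a DB-API 2.0 driver). This is not Superset code, and the `signups` table and its rows are made up for illustration.

```python
import sqlite3

# sqlite3 follows the DB-API 2.0 shape: connect, cursor, execute, fetch.
conn = sqlite3.connect(":memory:")  # in-memory database keeps the example self-contained
cur = conn.cursor()
cur.execute("CREATE TABLE signups (day TEXT, plan TEXT)")
cur.executemany(
    "INSERT INTO signups VALUES (?, ?)",
    [("2021-01-01", "free"), ("2021-01-01", "pro"), ("2021-01-02", "free")],
)
conn.commit()

# The kind of aggregate query a dashboard chart might be built on.
cur.execute("SELECT day, COUNT(*) FROM signups GROUP BY day ORDER BY day")
rows = cur.fetchall()
conn.close()
```

Because drivers share this connect/cursor/execute shape, a tool can sit on top of many different databases, with a SQLAlchemy dialect handling the differences in SQL between them.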

Twitter

Guido van Rossum comes out of retirement, joins Microsoft

Guido van Rossum:

I decided that retirement was boring and have joined the Developer Division at Microsoft. To do what? Too many options to say! But it’ll make using Python better for sure (and not just on Windows :-). There’s lots of open source here. Watch this space.

Late last year Guido left Dropbox to head into retirement. Apparently “retirement was boring.” I’m curious to see how coming out of retirement changes things at the steering level of Python.

We talked with Brett Cannon in the middle of last year about Python’s new governance and core team. I don’t recall their plan accounting for the possibility of their BDFL coming back from retirement. 😱

I’m sure whatever is to come for Python with Guido being back, it’ll be a net positive.

Craig Kerstiens info.crunchydata.com

Building a recommendation engine inside Postgres with Python and Pandas

Craig Kerstiens told me about this on our recent Postgres episode of The Changelog and my jaw about dropped out of my mouth.

… earlier today I was starting to wonder why couldn’t I do more machine learning directly inside [Postgres]. Yeah, there is madlib, but what if I wanted to write my own recommendation engine? So I set out on a total detour of a few hours and lo and behold, I can probably do a lot more of this in Postgres than I realized before. What follows is a quick walkthrough of getting a recommendation engine setup directly inside Postgres.

Craig doesn’t necessarily suggest you put this kind of solution in production, but he doesn’t come out and say don’t do it either. 😉

Líkið Geimfari github.com

A high-performance fake data generator for Python

Mimesis… provides data for a variety of purposes in a variety of languages. The fake data could be used to populate a testing database, create fake API endpoints, create JSON and XML files of arbitrary structure, anonymize data taken from production and etc.

Data generators like Mimesis are fun to use (and I imagine fun to code as well):

>>> from mimesis import Person
>>> person = Person('en')

>>> person.full_name()
'Brande Sears'

>>> person.email(domains=['mimesis.name'])
'roccelline1878@mimesis.name'

>>> person.email(domains=['mimesis.name'], unique=True)
'f272a05d39ec46fdac5be4ac7be45f3f@mimesis.name'

>>> person.telephone(mask='1-4##-8##-5##3')
'1-436-896-5213'