Natural Language Processing

Natural language processing (NLP) is the study of how computers understand, interpret, and generate human language.

Tooling github.com

Search inside YouTube videos using natural language

Use OpenAI’s CLIP neural network to search inside YouTube videos. You can try it by running the notebook on Google Colab.

The README has a bunch of examples of things you might search for and the results you’d get back. (“The Transamerica Pyramid”, anyone?)

The author also has a related project that lets you search Unsplash the same way.
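For a rough feel of how this kind of search works: CLIP maps text and images into a shared embedding space, so a query can be compared against video frames by similarity. Here is a minimal sketch of just the ranking step. The three-number vectors, the timestamps, and the `rank_frames` helper are made up for illustration (real CLIP embeddings have hundreds of dimensions and come from the model itself), and this is not the project's actual code.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_frames(query_embedding, frame_embeddings, top_k=3):
    """Return the top_k (timestamp, score) pairs, best match first.

    frame_embeddings maps a timestamp in seconds to that frame's embedding.
    """
    scored = [
        (cosine_similarity(query_embedding, embedding), timestamp)
        for timestamp, embedding in frame_embeddings.items()
    ]
    scored.sort(reverse=True)
    return [(timestamp, score) for score, timestamp in scored[:top_k]]


# Toy stand-ins for embeddings a model such as CLIP would produce (assumed values).
frames = {
    0.0: [0.9, 0.1, 0.0],
    5.0: [0.1, 0.9, 0.1],
    10.0: [0.2, 0.8, 0.3],
}
query = [0.0, 1.0, 0.2]

best = rank_frames(query, frames, top_k=2)
for timestamp, score in best:
    print(f"{timestamp:>5.1f}s  similarity={score:.3f}")
```

In a real pipeline you would sample frames from the video, embed each one with the image encoder, embed the query with the text encoder, and then rank exactly like this.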

Ines Montani github.com

Introducing spaCy 3.0

You may recall spaCy from this episode of Practical AI with its creators. If not, now’s a great time to introduce yourself to the project. 3.0 looks like a fantastic new release of the wildly popular NLP library. The list of new and improved things is too long for me to reproduce here, so go check it out for yourself.

There are also three YouTube videos accompanying the release. That’s evidence of just how much effort and polish went into this.

TechCrunch

Hugging Face raises $15 million to build their open source NLP library 🤗

Congrats to Clément and the Hugging Face team on this milestone!

The company first built a mobile app that let you chat with an artificial BFF, a sort of chatbot for bored teenagers. More recently, the startup released an open-source library for natural language processing applications. And that library has been massively successful.

The library mentioned is called Transformers, which is billed as ‘state-of-the-art Natural Language Processing for TensorFlow 2.0 and PyTorch.’

If any of this rings a bell, it may be because Practical AI co-host Daniel Whitenack has been a huge supporter of Hugging Face for a long time and mentions them often on the show. We even had Clément on the show back in March of this year.

Practices github.com

Natural Language Processing best practices & examples

The goal of this repository is to build a comprehensive set of tools and examples that leverage recent advances in NLP algorithms, neural architectures, and distributed machine learning systems. The content is based on our past and potential future engagements with customers as well as collaboration with partners, researchers, and the open source community.

Google github.com

Using Google's speech recognition to beat Google's ReCaptcha

A little ingenuity paired with changes to ReCaptcha’s audio challenge allowed this hacker to create a Python ‘robot’ that defeats the ‘not a robot’ test with 90% accuracy. The approach is brilliant:

  1. Navigate to Google’s ReCaptcha Demo site
  2. Navigate to audio challenge for ReCaptcha
  3. Download audio challenge
  4. Submit audio challenge to Speech To Text
  5. Parse response and type answer
  6. Press submit and check if successful

The code is small enough to grok in 5-10 minutes. Love it!

TensorFlow cvcompiler.com

An NLP tool for improving dev resumes

CV Compiler is an online resume analysis tool designed exclusively for software engineers.

The review technology scans for keywords from the world of programming and how they are used in the resume, relative to the best practices in the industry.

CV Compiler was built using Python with libraries NLTK and spaCy for tokenization, lemmatization, and POS-tagging.

The internal analysis engine for large datasets (resumes, job descriptions) was built upon a Seq2Seq model in TensorFlow.
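To make the keyword-scanning idea concrete, here is a minimal sketch using only the Python standard library. It is not CV Compiler's actual code: the `TECH_KEYWORDS` set, the regex tokenizer, and the sample resume text are all assumptions for illustration, and a real pipeline built on NLTK or spaCy would also lemmatize and POS-tag the tokens.

```python
import re
from collections import Counter

# Hypothetical keyword list; a real tool would use a much larger vocabulary.
TECH_KEYWORDS = {"python", "tensorflow", "docker", "kubernetes", "sql"}


def tokenize(text):
    """Lowercase the text and split it into simple word-like tokens.

    Keeps '+' and '#' so that names such as 'c++' or 'c#' stay in one piece.
    """
    return re.findall(r"[a-z][a-z0-9+#]*", text.lower())


def keyword_counts(text, keywords=TECH_KEYWORDS):
    """Count how often each known keyword appears in the text."""
    return Counter(token for token in tokenize(text) if token in keywords)


resume = (
    "Built data pipelines in Python and SQL. "
    "Deployed Python services with Docker."
)
counts = keyword_counts(resume)
for keyword, count in counts.most_common():
    print(f"{keyword}: {count}")
```

Counting is only the first step. Judging how a keyword is used relative to industry best practices, as the tool describes, would need the surrounding context too.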