
Natural Language Processing

Natural language processing (NLP) is the study of how computers process, analyze, and generate human language.

Practical AI #84

COVID-19 Q&A and CORD-19

So many AI developers are coming up with creative, useful COVID-19 applications during this time of crisis. Among those are Timo from Deepset-AI and Tony from Intel. They are working on a question answering system for pandemic-related questions called COVID-QA. In this episode, they describe the system, related annotation of the CORD-19 data set, and ways that you can contribute!

Practical AI #82

Speech recognition to say it just right

Catherine Breslin of Cobalt joins Daniel and Chris to do a deep dive on speech recognition. She also discusses how the technology is integrated into virtual assistants (like Alexa) and is used in other non-assistant contexts (like transcription and captioning). Along the way, she teaches us how to assemble a lexicon, acoustic model, and language model to bring speech recognition to life.
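To make the three components named above concrete, here is a toy sketch of how a lexicon, an acoustic model, and a language model might be combined to pick a word. It is an illustration under stated assumptions, not the approach described in the episode: the lexicon entries, the probabilities, and the function names `acoustic_logprob` and `decode_word` are all hypothetical, and a real recognizer scores audio frames rather than ready-made phoneme labels.

```python
import math

# Lexicon: maps each word to its phoneme sequence (toy, hypothetical entries).
LEXICON = {
    "right": ("R", "AY", "T"),
    "write": ("R", "AY", "T"),
    "it": ("IH", "T"),
    "say": ("S", "EY"),
}

# Bigram language model: log P(word | previous word), made-up values.
BIGRAM_LOGPROB = {
    ("say", "it"): math.log(0.5),
    ("it", "right"): math.log(0.3),
    ("it", "write"): math.log(0.01),
}
UNSEEN_LOGPROB = math.log(1e-4)  # fallback score for unseen word pairs


def acoustic_logprob(heard, expected):
    """Toy acoustic model: reward matching phonemes, penalize mismatches."""
    if len(heard) != len(expected):
        return float("-inf")
    return sum(
        math.log(0.9) if h == e else math.log(0.1)
        for h, e in zip(heard, expected)
    )


def decode_word(heard_phonemes, previous_word):
    """Pick the word with the best acoustic score plus language model score."""
    best_word, best_score = None, float("-inf")
    for word, phonemes in LEXICON.items():
        lm_score = BIGRAM_LOGPROB.get((previous_word, word), UNSEEN_LOGPROB)
        score = acoustic_logprob(heard_phonemes, phonemes) + lm_score
        if score > best_score:
            best_word, best_score = word, score
    return best_word
```

In this sketch "right" and "write" sound identical, so the acoustic score cannot separate them; the language model breaks the tie based on the preceding word.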

Practical AI #78

NLP for the world's 7000+ languages

Expanding AI technology to the local languages of emerging markets presents huge challenges. Good data is scarce or non-existent. Users often have bandwidth or connectivity issues. Existing platforms target only a small number of high-resource languages.

Our own Daniel Whitenack (data scientist at SIL International) and Dan Jeffries (from Pachyderm) discuss how these and related problems will only be solved when AI technology and resources from industry are combined with linguistic expertise from those on the ground working with local language communities. They have illustrated this approach as they work on pushing voice technology into emerging markets.

TechCrunch

Hugging Face raises $15 million to build their open source NLP library 🤗

Congrats to Clément and the Hugging Face team on this milestone!

The company first built a mobile app that let you chat with an artificial BFF, a sort of chatbot for bored teenagers. More recently, the startup released an open-source library for natural language processing applications. And that library has been massively successful.

The library in question is Transformers, which is billed as ‘state-of-the-art Natural Language Processing for TensorFlow 2.0 and PyTorch.’

If any of these things ring a bell to you, it may be because Practical AI co-host Daniel Whitenack has been a huge supporter of Hugging Face for a long time and mentions them often on the show. We even had Clément on the show back in March of this year.

Practical AI #68

Modern NLP with spaCy

spaCy is awesome for NLP! It’s easy to use, has widespread adoption, is open source, and integrates the latest language models. Ines Montani and Matthew Honnibal (core developers of spaCy and co-founders of Explosion) join us to discuss the history of the project, its capabilities, and the latest trends in NLP. We also dig into the practicalities of taking NLP workflows to production. You don’t want to miss this episode!

Google github.com

Using Google's speech recognition to beat Google's ReCaptcha

A little ingenuity paired with changes to ReCaptcha’s audio challenge allowed this hacker to create a Python ‘robot’ that defeats the ‘not a robot’ test with 90% accuracy. The approach is brilliant:

  1. Navigate to Google’s ReCaptcha Demo site
  2. Navigate to audio challenge for ReCaptcha
  3. Download audio challenge
  4. Submit audio challenge to Speech To Text
  5. Parse response and type answer
  6. Press submit and check if successful

The code is small enough to grok in 5-10 minutes. Love it!


TensorFlow cvcompiler.com

An NLP tool for improving dev resumes

CV Compiler is an online resume analysis tool designed exclusively for software engineers.

The review technology scans for keywords from the world of programming and how they are used in the resume, relative to the best practices in the industry.

CV Compiler was built in Python, using the NLTK and spaCy libraries for tokenization, lemmatization, and POS-tagging.

The internal analysis engine for large datasets (resumes, job descriptions) was built upon a Seq2Seq model in TensorFlow.
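As a rough illustration of the keyword-scanning idea described above, here is a minimal sketch that uses only the Python standard library. It is not CV Compiler's implementation, which per the description relies on NLTK, spaCy, and a Seq2Seq model in TensorFlow. The keyword list and the names `tokenize` and `keyword_counts` are hypothetical, and the regex tokenizer is a crude stand-in for a real NLP pipeline.

```python
import re
from collections import Counter

# Hypothetical keyword list; a real tool would use a much larger vocabulary.
TECH_KEYWORDS = {"python", "tensorflow", "docker", "sql"}


def tokenize(text):
    """Lowercase the text and split it into simple word-like tokens."""
    return re.findall(r"[a-z0-9+#]+", text.lower())


def keyword_counts(resume_text, keywords=TECH_KEYWORDS):
    """Count how often each known keyword appears in the resume text."""
    tokens = tokenize(resume_text)
    return Counter(token for token in tokens if token in keywords)
```

Calling `keyword_counts` on a resume string returns a `Counter` keyed by keyword, which could then be compared against counts taken from job descriptions.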
