Assessing difficulty of French podcasts

I was recently looking for French podcasts and had trouble finding ones that were interesting but still understandable. Measuring this seemed like an interesting programming project, so I wrote a script to assess the oral comprehension difficulty of French podcasts.

How the script works

  1. It takes the Spotify URL of a French podcast
  2. It downloads a 30-60 second excerpt of the podcast using the Spotify API
  3. It transcribes the audio using the Google Cloud Speech API (French words only)
  4. It checks what percentage of the words are uncommon, that is, what percentage are not in a list of the 1000 most common French words. Both the list of common words and the transcript are lemmatized so that conjugated verbs and plurals are still identified as the same word; for example, “suis” and “es” both match “être”.
  5. It estimates the talking speed by calculating how many words were spoken per minute (number of French words in transcript / excerpt duration in seconds * 60)
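
To make steps 4 and 5 concrete, here is a minimal sketch of the two metrics in Python. It is not the actual script: the function names, the toy lemma table and the tiny common-word set are all made up for illustration, and a real run would use a proper French lemmatizer and the full 1000-word list.

```python
from typing import Callable, Iterable, Set


def pct_uncommon(words: Iterable[str],
                 common_lemmas: Set[str],
                 lemmatize: Callable[[str], str]) -> float:
    """Percentage of words whose lemma is NOT in the common-word set."""
    lemmas = [lemmatize(w.lower()) for w in words]
    if not lemmas:
        return 0.0
    uncommon = sum(1 for lemma in lemmas if lemma not in common_lemmas)
    return 100 * uncommon / len(lemmas)


def words_per_minute(word_count: int, duration_seconds: float) -> float:
    """Talking speed: number of words / excerpt duration in seconds * 60."""
    return word_count / duration_seconds * 60


# Toy stand-ins (assumptions, not the real data): a hand-written lemma
# table and a four-word "common words" list that is already lemmatized.
TOY_LEMMAS = {"suis": "être", "es": "être"}
TOY_COMMON = {"je", "tu", "un", "être"}


def toy_lemmatize(word: str) -> str:
    # Fall back to the word itself when the toy table has no entry.
    return TOY_LEMMAS.get(word, word)


transcript = ["Je", "suis", "tu", "es", "un", "linguiste"]
uncommon_share = pct_uncommon(transcript, TOY_COMMON, toy_lemmatize)
speed = words_per_minute(word_count=90, duration_seconds=45)
```

Here only “linguiste” falls outside the toy list, so one word in six counts as uncommon, and 90 words in a 45 second excerpt works out to 120 words per minute.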


See the output of the script, run against 28 popular French podcasts, below. Note that the script only transcribes French words, so podcasts that also contain English (e.g., the Duolingo podcast) will have a low number of French words per minute. However, that still reflects a podcast that is easier to understand, so I kept it as is.

To make the podcasts easier to rank, I added a score/rank for each metric. The podcast with the lowest share of uncommon words (or the fewest words per minute) has rank 1 in that metric, and the one with the highest has rank 28. I then combined the ranks for the two metrics into a combined score/rank, which makes it easier to see which podcasts are easier to understand when both metrics are taken into account.
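
As a rough illustration of the ranking, the sketch below ranks each metric from lowest to highest and adds the two ranks together. Adding the ranks and letting tied values share the lowest rank are my assumptions for this example, and the function names are made up; the actual script may combine the ranks differently.

```python
from typing import List, Tuple


def rank_ascending(values: List[float]) -> List[int]:
    """Rank values so the smallest gets rank 1; ties share the lowest rank."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def combined_scores(podcasts: List[Tuple[str, float, float]]) -> List[Tuple[str, int]]:
    """Take (name, pct_uncommon, words_per_min) rows and return
    (name, combined score) pairs, easiest podcast first."""
    pct_ranks = rank_ascending([p[1] for p in podcasts])
    wpm_ranks = rank_ascending([p[2] for p in podcasts])
    scores = [
        (name, pct_rank + wpm_rank)  # assumption: combined score = sum of ranks
        for (name, _, _), pct_rank, wpm_rank in zip(podcasts, pct_ranks, wpm_ranks)
    ]
    # sorted() is stable, so podcasts with equal scores keep their input order.
    return sorted(scores, key=lambda s: s[1])


# Three rows taken from the results table, ranked only against each other.
sample = [
    ("Duolingo French Podcast", 15, 47),
    ("One Thing In A French Day", 21, 126),
    ("Coffee Break French", 30, 77),
]
ranking = combined_scores(sample)
```

Because this sample ranks only three podcasts against each other, the resulting scores are smaller than the ones in the table, which come from ranking the full set.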

Overall this was a fun experiment. I might continue building this out with better logic in the future.

All the code is available here

| Name | Combined score/rank | Pct uncommon words | French words per min | URL |
|---|---|---|---|---|
| Duolingo French Podcast | 3 | 15% | 47 | Link |
| One Thing In A French Day | 13 | 21% | 126 | Link |
| Hondelatte Raconte – Christophe Hondelatte | 19 | 26% | 115 | Link |
| L’Heure du Monde | 19 | 22% | 140 | Link |
| La société de minuit | 20 | 28% | 89 | Link |
| Coffee Break French | 21 | 30% | 77 | Link |
| Pépites d’Histoire | 23 | 26% | 130 | Link |
| Mythes et Légendes | 24 | 25% | 148 | Link |
| La Story | 27 | 30% | 124 | Link |
| Easy French: Learn French through authentic conversations \| Conversations authentiques pour apprendre le français | 27 | 27% | 130 | Link |
| Learn French by Podcast | 28 | 35% | 50 | Link |
| Podcast Français Authentique | 28 | 26% | 167 | Link |
| HVF – Histoires Vraies et Flippantes | 30 | 34% | 109 | Link |
| French Through Stories | 32 | 46% | 75 | Link |
| Ces questions que tout le monde se pose | 32 | 22% | 245 | Link |
| Les Baladeurs | 33 | 27% | 175 | Link |
| Entrez dans l’Histoire | 33 | 25% | 189 | Link |
| Le Précepteur | 36 | 35% | 127 | Link |
| Le Podkatz | 39 | 33% | 144 | Link |
| Canapé Six Places | 41 | 32% | 174 | Link |
| Passe le plaid | 41 | 31% | 178 | Link |
| BURGER RING | 41 | 29% | 191 | Link |
| French with Jeanne | 43 | 37% | 141 | Link |
| Little Talk in Slow French : Learn French through conversations | 43 | 31% | 181 | Link |
| J’ai peur, donc j’y vais | 47 | 31% | 197 | Link |
| Les actus du jour – Hugo Décrypte | 48 | 35% | 177 | Link |
