{"id":10359,"date":"2025-05-27T10:01:25","date_gmt":"2025-05-27T08:01:25","guid":{"rendered":"https:\/\/via-internet.de\/blog\/?p=10359"},"modified":"2025-05-27T10:24:38","modified_gmt":"2025-05-27T08:24:38","slug":"daily-how-to-transcribe-videos","status":"publish","type":"post","link":"https:\/\/via-internet.de\/blog\/2025\/05\/27\/daily-how-to-transcribe-videos\/","title":{"rendered":"Daily: How to transcribe Videos"},"content":{"rendered":"\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"771\" src=\"https:\/\/via-internet.de\/blog\/wp-content\/uploads\/2025\/05\/ChatGPT-Image-27.-Mai-2025-10_22_56-1.png\" alt=\"\" class=\"wp-image-10371\" srcset=\"https:\/\/via-internet.de\/blog\/wp-content\/uploads\/2025\/05\/ChatGPT-Image-27.-Mai-2025-10_22_56-1.png 1024w, https:\/\/via-internet.de\/blog\/wp-content\/uploads\/2025\/05\/ChatGPT-Image-27.-Mai-2025-10_22_56-1-300x226.png 300w, https:\/\/via-internet.de\/blog\/wp-content\/uploads\/2025\/05\/ChatGPT-Image-27.-Mai-2025-10_22_56-1-768x578.png 768w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">Using the Command Line and Python<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Option 1: Transcribe with Whisper (official)<\/h3>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">pip install git+https:\/\/github.com\/openai\/whisper.git<\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Then run Whisper. Without <code>--language<\/code>, it detects the language automatically:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">whisper demo.mp4 --model medium --output_format txt<\/pre>\n\n\n\n<p 
class=\"wp-block-paragraph\">Optional flags:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><code>--output_format srt<\/code> (for subtitles)<\/li>\n\n\n\n<li><code>--language en<\/code> (to skip language detection)<\/li>\n<\/ul>\n\n\n\n<h4 class=\"wp-block-heading\">Create a shell script to transcribe multiple files<\/h4>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">#!\/usr\/bin\/env bash\n\n# Loop through all arguments (filenames)\nfor VIDEO in \"$@\"; do\n    echo \"Transcribing: $VIDEO\"\n    whisper \"$VIDEO\" --model medium --language en --output_format txt\ndone<\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">Option 2: Transcribe with faster-whisper (typically faster on CPU or GPU)<\/h3>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">pip install faster-whisper<\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Run via Python, passing the file name as the first argument:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">import sys\nfrom faster_whisper import WhisperModel\n\n# Path to the video or audio file, taken from the command line\nVIDEO = sys.argv[1]\n\nmodel = WhisperModel(\"medium\", device=\"cpu\")  # or \"cuda\" for GPU\n\n# segments is a generator: transcription runs while the loop below iterates\nsegments, info = model.transcribe(VIDEO)\n\nwith open(\"transcript.txt\", \"w\", encoding=\"utf-8\") as f:\n    for s in segments:\n        # One line per segment: start and end time in seconds, then the text\n        f.write(f\"{s.start:.2f} --> {s.end:.2f}: {s.text.strip()}\\n\")<\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">Optional: Convert to audio (if needed)<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">If transcription fails or is slow, extract audio 
first:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">ffmpeg -i your_video.mp4 -ar 16000 -ac 1 -c:a pcm_s16le audio.wav<\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">This writes a 16 kHz, mono, 16-bit PCM WAV file. Then transcribe <code>audio.wav<\/code> instead.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h3 class=\"wp-block-heading\">Problems with NumPy<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Error when running Whisper: <code>A module that was compiled using NumPy 1.x cannot be run in NumPy 2.2.6 as it may crash.<\/code><\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Reason:<\/h4>\n\n\n\n<p class=\"wp-block-paragraph\">This error comes from an incompatibility between the installed <strong>NumPy 2.x<\/strong> and some Whisper dependencies that were compiled against <strong>NumPy 1.x<\/strong>.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Solution: Downgrade NumPy<\/h4>\n\n\n\n<p class=\"wp-block-paragraph\">You can fix this by <strong>downgrading NumPy to version 1.x<\/strong>:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">pip install \"numpy&lt;2\"<\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Then run Whisper again:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">whisper demo.mp4 --model medium --output_format txt<\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">Hide the warning \"UserWarning: FP16 is not supported on CPU; using FP32 instead\"<\/h3>\n\n\n\n<h3 
class=\"wp-block-heading\">Option 1: Suppress all Python warnings (quick + global)<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">In your terminal or script, set the environment variable for the command:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">PYTHONWARNINGS=\"ignore\" whisper demo.mp4 --model medium<\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Or, in Python code, before you start transcribing:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">import warnings\n\nwarnings.filterwarnings(\"ignore\")<\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">Option 2: Suppress only that specific warning<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">If you&#8217;re calling <code>whisper<\/code> from Python and want to filter only this warning, register the filter before transcribing:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">import warnings\n\nwarnings.filterwarnings(\n    \"ignore\",\n    message=\"FP16 is not supported on CPU; using FP32 instead\"\n)<\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">Option 3: Patch the library (if you&#8217;re comfortable)<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">You can find the line in the <code>whisper<\/code> source code (usually in <code>transcribe.py<\/code>) that issues the warning and comment it out or remove it:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" 
data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\"># warnings.warn(\"FP16 is not supported on CPU; using FP32 instead\")<\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Not recommended unless you maintain the code.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Using Commandline and Python Option 1: Transcribe with Whisper (official) Then run: Optional flags: Create a shell script to transcribe multiple files Option 2: Transcribe with faster-whisper (much faster on CPU or GPU) Run via Python: Optional: Convert to audio (if needed) If transcription fails or is slow, extract audio first: Then transcribe audio.wav instead. Problems with numpy Error when running whisper: A module that was compiled using NumPy 1.x cannot be run in NumPy 2.2.6 as it may crash. Reason: The error you\u2019re seeing comes from an incompatibility between NumPy 2.x and some Whisper dependencies that were compiled against NumPy [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_crdt_document":"","_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[1],"tags":[133,171],"class_list":["post-10359","post","type-post","status-publish","format-standard","hentry","category-allgemein","tag-daily","tag-transcribe-video"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/via-internet.de\/blog\/wp-json\/wp\/v2\/posts\/10359","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/via-internet.de\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/via-internet.de\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/via-internet.de\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/via-internet.de\/blog\/wp-json\/wp\/v2\/comments?post=10359"}],"version-history":[{"count":5,"href":"https:\/\/vi
a-internet.de\/blog\/wp-json\/wp\/v2\/posts\/10359\/revisions"}],"predecessor-version":[{"id":10372,"href":"https:\/\/via-internet.de\/blog\/wp-json\/wp\/v2\/posts\/10359\/revisions\/10372"}],"wp:attachment":[{"href":"https:\/\/via-internet.de\/blog\/wp-json\/wp\/v2\/media?parent=10359"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/via-internet.de\/blog\/wp-json\/wp\/v2\/categories?post=10359"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/via-internet.de\/blog\/wp-json\/wp\/v2\/tags?post=10359"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}