duangsuse::Echo
#py #tool browser cache, images, scraper python a.py 'x\.com' 200 ~/.cache/chromium/Default/Cache/Cache_Data/* python imgdump.py 'www' 100 `ls --sort time --reverse ~/.cache/chromium/Default/Cache/Cache_Data/*` #code import re,struct, os def ls_cache(urlRegex, kbSizeMin,…
#security #linux #reveng https://mastodon.social/@AndresFreundTec/112180083704606941
https://boehs.org/node/everything-i-know-about-the-xz-backdoor
https://twitter.com/Blankwonder/status/1773921956615877110
The person who found the backdoor was just a developer from the pg (PostgreSQL) community
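#tool srt Let me demonstrate how to get subtitles for an entire series, using 《国宝特工》 as the example
>Set up the ASR service plus noise-reduction tooling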
pip install -U openai-whisper demucs srt yt-dlp
>Download every part (P) of the video
yt-dlp -f0 'https://www.bilibili.com/video/BV1Rx411j7aw?p='{1..52}
>ffmpeg: concat all m4a files, seek each to start at 110s, convert to 16 kHz mono WAV
audcat() {  # usage: audcat EXT INPOINT
  for f in $(find *.$1 | sort -n); do printf "file '%s'\ninpoint %s\n" "$f" "$2"; done > ALL.txt
  ffmpeg -f concat -safe 0 -i ALL.txt -ac 1 -ar 16000 -sample_fmt s16 -y ALL.wav
}
audcat m4a 110s
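For anyone who would rather build the concat list outside the shell, here is a minimal Python sketch of the same idea. The names `concat_list` and `numeric_key` are made up for this illustration; the sketch only assumes the ffmpeg concat demuxer's `file` / `inpoint` directives that the shell function already uses.

```python
from pathlib import Path

def concat_list(files, inpoint):
    """Return concat-demuxer text: one `file` entry per input, each followed by `inpoint`."""
    lines = []
    for name in files:
        # A single quote inside a file name has to be written as '\'' in the list file
        quoted = str(name).replace("'", "'\\''")
        lines.append(f"file '{quoted}'")
        lines.append(f"inpoint {inpoint}")
    return "\n".join(lines) + "\n"

def numeric_key(path):
    """Sort key roughly like `sort -n`: leading digits as a number, then the full name."""
    digits = ""
    for ch in Path(path).name:
        if not ch.isdigit():
            break
        digits += ch
    return (int(digits) if digits else 0, str(path))
```

Usage would be something like `Path("ALL.txt").write_text(concat_list(sorted(Path(".").glob("*.m4a"), key=numeric_key), "110s"))`, followed by the same ffmpeg command as above.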
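>Remove the BGM
#demucs --two-stems vocals ALL.wav; mv separated/*/*/vocals.wav ALL.wav
#My machine is too low-spec, so I won't demo this step
>One folder per ~20 MB (24 min per episode; batches of 5 episodes recommended); collect {1..6}/a.srt back using the same approach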
find *.m4a|sort -n|pr -aT -10|nl|xargs -L1 ruby -e'puts "mv "+ARGV.rotate.join(" ")'
mkdir {1..6}
for f in `find * -type d`; do cd $f; audcat m4a 110s; cd ..; done
#find -name *.m4a -exec mv {} $PWD \;
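The batching step can also be sketched in Python. This is only an illustration under assumptions: `batches` and `move_plan` are hypothetical names, and the batch size of 5 follows the recommendation in the text rather than the 10-column `pr` call above. It only computes the plan; the actual move is left to the caller.

```python
def batches(names, size=5):
    """Split names into consecutive groups of `size`; the last group may be shorter."""
    return [names[i:i + size] for i in range(0, len(names), size)]

def move_plan(names, size=5):
    """Return (source, destination) pairs; batch k goes into a folder named str(k), counted from 1."""
    plan = []
    for k, group in enumerate(batches(names, size), start=1):
        for name in group:
            plan.append((name, f"{k}/{name}"))
    return plan
```

With 52 episodes and 5 per batch this gives 11 folders, the last one holding 2 files. Each pair can then be applied with `os.makedirs` and `shutil.move`.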
>Transcribe to a Chinese srt
whisper --model base --language Mandarin ALL.wav
>Drop the lines that are just echoes (repeated captions)
>transform srt file a (using srt.parse), iterate over a[], when a[i] not in a[i-N:i], keep it
for i in {1..11};do echo>>a "# the 5 episodes up to episode $((i*5))";srt-process -f print --input ep$i.srt >>a; done
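import srt, sys
# Number of preceding captions each caption is checked against
N = 2
_, srt_file = sys.argv
with open(srt_file, encoding="utf-8") as f:
    subs = list(srt.parse(f.read()))
# Keep a caption only if its text differs from each of the N captions before it.
# max(0, ...) stops a negative start index from wrapping around to the end of the list.
noDup = lambda cap, i: cap.content not in [c.content for c in subs[max(0, i-N):i]]
filtered_captions = [cap for i, cap in enumerate(subs) if noDup(cap, i)]
# Save the result to a new SRT file
with open("ep" + srt_file, "w", encoding="utf-8") as f:
    f.write(srt.compose(filtered_captions))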
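To see what the window check does without needing an srt file, here is a small sketch on plain strings. `drop_echoes` is a name invented for this example; it applies the same rule (compare each line with the `n` original lines right before it) and is not tied to the `srt` library.

```python
def drop_echoes(lines, n=2):
    """Keep lines[i] only if it differs from each of the n lines right before it."""
    kept = []
    for i, text in enumerate(lines):
        # The window is taken from the original list, whether or not those lines were kept
        window = lines[max(0, i - n):i]
        if text not in window:
            kept.append(text)
    return kept

print(drop_echoes(["a", "a", "b", "a", "a", "c"]))       # ['a', 'b', 'c']
print(drop_echoes(["a", "a", "b", "a", "a", "c"], n=1))  # ['a', 'b', 'a', 'c']
```

A larger `n` removes more echoes but also drops legitimate repeats that happen to fall within the window, so a small value like 2 seems a reasonable starting point.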
https://pypi.org/project/SpeechRecognition/ Cloud services such as Azure work too
Finally, if you can buy an API key, you can call the API directly
curl https://api.openai.com/v1/audio/transcriptions \
-H "Authorization: Bearer $OPENAI_API_KEY" \
-H "Content-Type: multipart/form-data" \
-F file="@/path/to/file/audio.mp3" \
-F "timestamp_granularities[]=segment" \
-F model="whisper-1" \
-F response_format="srt"
https://platform.openai.com/docs/api-reference/audio/createTranscription#audio-createtranscription-response_format
#web GPU (similar to waifu4x https://real-cugan.animesales.xyz/) https://huggingface.co/spaces/Xenova/whisper-web
https://huggingface.co/spaces/aadnk/whisper-webui
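>#china #ai I don't know about purchasing agents, but there are plenty of API resellers, and there are free ones to take advantage of
Claude.ai is reportedly better, but it is blocked as well
A China-region special moment: https://www.bilibili.com/video/BV1ip421U7Qx the Qing dynasty and the Soviet Union all over again
If you have no VISA card (so you cannot sign up for Azure's GPT or ASR/TTS) and only have WeChat, you can use the reseller https://openai-hk.com/?i=25623, minimum top-up 10 yuan (if you have a better recommendation, please DM me.. 😓)
whisper-1: 0.006 USD/min (about 0.0426 CNY), 25 MB limit per file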
curl https://api.openai-hk.com/v1/audio/transcriptions -H "Authorization: Bearer $OPENAI_API_KEY" -H "Content-Type: multipart/form-data" -F file="@ALL.mp3" -F "timestamp_granularities[]=segment" -F model="whisper-1" -F response_format="srt"
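As a rough worked example of the price above, the sketch below estimates the cost of the whole series. It assumes 52 episodes of 24 minutes each (from the steps earlier), ignores the 110 s trimmed from each episode, and takes the exchange rate of 7.1 implied by 0.0426 CNY / 0.006 USD; a reseller's actual price may differ.

```python
USD_PER_MINUTE = 0.006  # whisper-1 price quoted above
CNY_PER_USD = 7.1       # assumption: the rate implied by 0.0426 CNY / 0.006 USD

def transcription_cost(minutes):
    """Return (usd, cny) for transcribing the given number of audio minutes."""
    usd = minutes * USD_PER_MINUTE
    return usd, usd * CNY_PER_USD

usd, cny = transcription_cost(52 * 24)  # 1248 minutes in total
print(f"{usd:.2f} USD, about {cny:.2f} CNY")
```

That comes to roughly 7.5 USD, or a bit over 53 CNY, as an upper bound for the full series.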
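Web version: https://github.com/openai/whisper/discussions/1018
It is the same as the offline version
The main point is that Whisper's speech-recognition price/performance is arguably SOTA right now, and the same top-up also covers GPT4 and midj (0.3 yuan per question, though)
You can use the reseller's API in ChatHub.gg (Alt+J); various client web UIs and local GLM work with it too
I had a look: Taobao has sellers of API keys (searching for "openai" is no longer allowed), GPT-3 at a flat 5 yuan, no account needed. The keys work with the Whisper service too 🥰
An official API key alone can only be permission-restricted, not sold in metered quantities; apart from registration with virtual SIM cards, these are probably shared accounts.. (I was surprised at the time that GPT-4 keys could be sold outright.) GPT-3 is dirt cheap now, so I'll paste the sk here: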
OPENAI_API_KEY= ssk - X6muuhftCItNxCFy6bP7T3BlbkFJlkcCKxV8Js2v3mnoCAtZ
ssk - v4SRWkhYGKu7EoUeIMJuT3BlbkFJFii1vkU3ZCdO7dyoADrG
#learn #web #tool spider
The hard-hitters led by yt-dlp, which support e-hentai.org and the like
https://github.com/yt-dlp/yt-dlp
https://github.com/iawia002/lux?tab=readme-ov-file#supported-sites
https://github.com/KurtBestor/Hitomi-Downloader
https://tttttt.me/dsuse/19360 😅 grab images straight from the cache, ImageAssistant
https://github.com/nilaoda/N_m3u8DL-RE
https://jeanslack.github.io/Videomass/screenshots.html
Hand-picked no-code options
https://github.com/VinciGit00/Scrapegraph-ai/blob/main/examples/openai/json_scraper_openai.py
https://github.com/NaiboWang/EasySpider
https://www.octoparse.com/download
https://www.parsehub.com/quickstart
Plus two that are paid / keyword-only
https://github.com/harismuneer/Ultimate-Social-Scrapers
https://github.com/alirezamika/autoscraper