自動提示LLM-Text與LLM-Vision的應用指南

更新 發佈閱讀 26 分鐘

Quick Links

  • Auto prompt by LLM and LLM-Vision (Trigger more details out inside model)
    • SD-WEB-UI: https://github.com/xlinx/sd-webui-decadetw-auto-prompt-llm
    • ComfyUI: https://github.com/xlinx/ComfyUI-decadetw-auto-prompt-llm
  • Auto msg to ur mobile (LINE | Telegram | Discord)
    • SD-WEB-UI :https://github.com/xlinx/sd-webui-decadetw-auto-messaging-realtime
    • ComfyUI: https://github.com/xlinx/ComfyUI-decadetw-auto-messaging-realtime
  • I'm SD-VJ. (share SD-generating-process in realtime by gpu)
    • SD-WEB-UI: https://github.com/xlinx/sd-webui-decadetw-spout-syphon-im-vj
    • ComfyUI: https://github.com/xlinx/ComfyUI-decadetw-spout-syphon-im-vj
  • CivitAI Info|discuss:
    • https://civitai.com/articles/6988/extornode-using-llm-trigger-more-detail-that-u-never-thought
    • https://civitai.com/articles/6989/extornode-sd-image-auto-msg-to-u-mobile-realtime
    • https://civitai.com/articles/7090/share-sd-img-to-3rd-software-gpu-share-memory-realtime-spout-or-syphon

SD-WEB-UI | ComfyUI | decadetw-Auto-Prompt-LLM-Vision


 

 

    

Update Log

  • [add|20240730] | 🟢 LLM Recursive Prompt
  • [add|20240730] | 🟢 Keep ur prompt ahead each request
  • [add|20240731] | 🟢 LLM Vision
  • [add|20240803] | 🟢 translateFunction
    • When LLM answered, use LLM translate result to your favorite language.ex: Chinese. It's just for your reference, which won't affect SD.
  • [add|20240808] | 🟠 Before and After script | exe-command
  • [add|20240808] | 🟠 release LLM VRAM everytimes

Motivation💡

  • Call LLM : auto prompt for batch generate images
  • Call LLM-Vision: auto prompt for batch generate images
  • Image will get more details that u never though before.
  • prompt detail is important

Usage

LLM-Text

  • batch image generate with LLM
    • a story
  • Using Recursive prompt say a story with image generate
  • Using LLM
    • when generate forever modeexample as follows figure Red-box.just tell LLM who, when or whatLLM will take care details.
    • when a story-board mode (You can generate serial image follow a story by LLM context.)its like comic booka superstar on stageshe is singingpeople give her flowera fashion men is walking.

LLM-Vision 👀

  • batch image generate with LLM-Vision
    • let LLM-Vision see a magazine
    • see series of image
    • see last-one-img for next-image
    • make a serious of image like comic

Before and After script

  • support load script or exe-command Before-LLM and After-LLM
  • javascript fetch POST method (install Yourself )
    • security issue, but u can consider as follows
    • https://github.com/pmcculler/sd-dynamic-javascript
    • https://github.com/ThereforeGames/unprompted
    • https://github.com/adieyal/sd-dynamic-prompts
    • https://en.wikipedia.org/wiki/Server-side_request_forgery
    • and Command Line Arg --allow-code

[🟢] stable-diffusion-webui-AUTOMATIC1111[🟢] stable-diffusion-webui-forge[🟢] ComfyUI1. SD-Prompt ✦1girl2.1 LLM-Text ✦2.2 LLM-Vision ✦a super star on stage.Who is she in image?2.3 LLM-Text-sys-prompt ✦2.4 LLM-Vision-sys-prompt ✦You are an AI prompt word engineer. Use the provided keywords to create a beautiful composition. Only the prompt words are needed, not your feelings. Customize the style, scene, decoration, etc., and be as detailed as possible without endings.You are an AI prompt word engineer. Use the provided image to create a beautiful composition. Only the prompt words are needed, not your feelings. Customize the style, scene, decoration, etc., and be as detailed as possible without endings.3. LLM will answer other detail ✦The superstar, with their hair flowing in the wind, stands on the stage. The lights dance around them, creating a magical moment that fills everyone present with awe. Their eyes shine bright, as if they are ready to take on the world.The superstar stands tall in their sparkling costume, surrounded by fans who chant and cheer their name. The lights shine down on them, making their hair shine like silver. The crowd is electric, every muscle tense, waiting for the superstar to perform4. Main Interface | sd-web-ui | ComfyUIComfyUI Manager | search keyword: auto

Usage

InputOutputLLM-Text: a superstar on stage.LLM-Vision: What's pose in this image?.(okay, its cool.)LLM-Text: a superstar on stage.LLM-Vision: with a zebra image(okie, cooool show dress. At least we don't have half zebra half human.)LLM-Text: a superstar on stage.(okay, its cool.)LLM: a superstar on stage.(Wow... the describe of light is great.)LLM: a superstar on stage.(hnn... funny, it does make sense.)CHALLENGELLM-vision:A Snow White girl walk in forest.(detect ur LLM-Vision Model IQ; if u didnt get white dress and lot of snow.... plz let me know model name)SD model: Flux.1 DLLM model: llava-llama-3.1-8bLLM model: Eris_PrimeV4-Vision-32k-7B-IQ3_XXS FLUX modelhnn...NSFW show. I'm not mean that, but not a wrong answer.(Trigger more details; that u never thought about it.)SD model: Flux.1 DLLM model: llava-llama-3.1-8bLLM model: Eris_PrimeV4-Vision-32k-7B-IQ3_XXSadvanced use | before-after-actionin fact, u can run any u want script | (storyboard) | random read line from txt send into LLMSpecial LLM LoopConnect 1st LLM-Text output to 2nd LLM-Text Input Special LLM Loop - keep each feature assign to different obj not mix it on one.LLM-Text output ask looply : here     [new tool 20240915] Civitai Prompt Grabberquick prompt from civitai. u can pick some prompt from another area model(ex indoor design or building model) with ur 1girl, ex: 1girl(up figure)) + in-door design model-prompt. then u will get full detail in background(bottom figure) : https://civitai.com/models/85691this is good present in FLUX model. trigger more detail in background. Make the photo getting more realistic feelingoption1. just quick append prompt from other model from civitai oroption2. of course u can send it into LLM too. [update] LLM-ask-LLM🌀[support] Cloud Service: Gemini Procloud service: https://generativelanguage.googleapis.com/v1model: gemini-1.5-flash (vision)support text and visionit will get more🌀 and more🌀 and more🌀 like....(bottom to top)

Usage Tips

  • tips1:
    • leave only 1 or fewer keyword(deep inside CLIP encode) for SD-Prompt, others just fitting into LLM
    • SD-Prompt: 1girl, [xxx,]<--(the keyword u use usually, u got usually image)
    • LLM-Prompt: xxx, yyy, zzz, <--(move it to here; trigger more detail that u never though.)
  • tips2:
    • leave only 1 or fewer keyword(deep inside CLIP encode) for SD-Prompt, others just fit into LLM
    • SD-Prompt: 1girl,
    • LLM-Prompt: a superstar on stage. <--(say a story)
  • tips3:
    • action script - Beforerandom/series pick prompt txt file random line fit into LLM-Text [read_random_line.bat]random/series pick image path file fit into LLM-Vision
    • action script - Afteru can call what u want commandex: release LLM VRAM each call: "curl http://localhost:11434/api/generate -d '{"model": "llama2", "keep_alive": 0}'" @Pdonorex: bra bra. Interactive anything.
  • tipsX: Enjoy it, inspire ur idea, and tell everybody how u use this.

Installtion

  • You need install LM Studio or ollama first.
    • LM Studio: Start the LLM service on port 1234. (suggest use this one)
    • ollama: Start service on port 11434 .
  • Pick one language model from under list
    • text base(small ~2G)
    • text&vision base(a little big ~8G)
  • Start web-ui or ComfyUI install extensions or node
    • stable-diffusion-webui | stable-diffusion-webui-forge:go Extensions->Available [official] or Install from URLhttps://github.com/xlinx/sd-webui-decadetw-auto-prompt-llm
    • ComfyUI: using Manager install nodeManager -> Customer Node Manager -> Search keyword: autohttps://github.com/ltdrdata/ComfyUI-Managerhttps://registry.comfy.org/https://ltdrdata.github.io/
  • Open ur favorite UI
    • Lets inactive with LLM. go~
    • trigger more detail by LLM

Suggestion software info list


Suggestion LLM Model

  • LLM-text (normal, chat, assistant)
    • 4B VRAM<2GCHE-72/Qwen1.5-4B-Chat-Q2_K-GGUF/qwen1.5-4b-chat-q2_k.ggufhttps://huggingface.co/CHE-72/Qwen1.5-4B-Chat-Q2_K-GGUF
    • 7B VRAM<8Gccpl17/Llama-3-Taiwan-8B-Instruct-GGUF/Llama-3-Taiwan-8B-Instruct.Q2_K.ggufLewdiculous/L3-8B-Stheno-v3.2-GGUF-IQ-Imatrix/L3-8B-Stheno-v3.2-IQ3_XXS-imat.gguf
    • Google-Gemmahttps://huggingface.co/bartowski/gemma-2-9b-it-GGUFbartowski/gemma-2-9b-it-GGUF/gemma-2-9b-it-IQ2_M.ggufsmall and good for SD-Prompt
  • LLM-vision 👀 (work with SDXL, VRAM >=8G is better )
    • https://huggingface.co/xtuner/llava-phi-3-mini-ggufllava-phi-3-mini-mmproj-f16.gguf (600MB,vision adapter)⭐⭐⭐llava-phi-3-mini-f16.gguf (7G, main model)
    • https://huggingface.co/FiditeNemini/Llama-3.1-Unhinged-Vision-8B-GGUFllava-llama-3.1-8b-mmproj-f16.gguf⭐⭐⭐Llama-3.1-Unhinged-Vision-8B-Q8.0.gguf
    • https://huggingface.co/Lewdiculous/Eris_PrimeV4-Vision-32k-7B-GGUF-IQ-Imatrix#quantization-informationquantization_options = ["Q4_K_M", "Q4_K_S", "IQ4_XS", "Q5_K_M", "Q5_K_S","Q6_K", "Q8_0", "IQ3_M", "IQ3_S", "IQ3_XXS"]⭐⭐⭐⭐⭐for low VRAM super small: IQ3_XXS (2.83G)in fact, it's enough uses.

Using Online LLM Service Setup example

OpenAI ChatGPT

  • In Auto-LLM Setup tab
    • LLM-URL=https://api.openai.com/v1
  • get ur api key from openAI : https://platform.openai.com/api-keys
    • LLM-API-KEY = xxxxxxxxxxxxxxxxxxxxxxx
    • LLM-Model-Name = gpt-3.5-turbo

Google Gemini

X Grok

claude.ai

Hugging face space

Javascript!

security issue, but u can consider as follows.

Buy me a Coca cola ☕

https://buymeacoffee.com/xxoooxx

Colophon

Made for fun. I hope if brings you great joy, and perfect hair forever. Contact me with questions and comments, but not threats, please. And feel free to contribute! Pull requests and ideas in Discussions or Issues will be taken quite seriously! --- https://decade.tw

留言
avatar-img
YS Lin的沙龍
0會員
1內容數
你可能也想看
Thumbnail
1.創建您的 AI:http:// wnr.ai 2.會議摘要: http:// tldv.io 3.您的搜索助手:http:// perplexity.ai 4.輕鬆創建演示文稿: http:// presentations.ai
Thumbnail
1.創建您的 AI:http:// wnr.ai 2.會議摘要: http:// tldv.io 3.您的搜索助手:http:// perplexity.ai 4.輕鬆創建演示文稿: http:// presentations.ai
Thumbnail
本文介紹了text-generation-webui的安裝方法和模型的選擇,包括模型的下載和擺放位置,並提供了相關的連結和建議。
Thumbnail
本文介紹了text-generation-webui的安裝方法和模型的選擇,包括模型的下載和擺放位置,並提供了相關的連結和建議。
Thumbnail
《轉轉生》(Re:INCARNATION)為奈及利亞編舞家庫德斯.奧尼奎庫與 Q 舞團創作的當代舞蹈作品,結合拉各斯街頭節奏、Afrobeat/Afrobeats、以及約魯巴宇宙觀的非線性時間,建構出關於輪迴的「誕生—死亡—重生」儀式結構。本文將從約魯巴哲學概念出發,解析其去殖民的身體政治。
Thumbnail
《轉轉生》(Re:INCARNATION)為奈及利亞編舞家庫德斯.奧尼奎庫與 Q 舞團創作的當代舞蹈作品,結合拉各斯街頭節奏、Afrobeat/Afrobeats、以及約魯巴宇宙觀的非線性時間,建構出關於輪迴的「誕生—死亡—重生」儀式結構。本文將從約魯巴哲學概念出發,解析其去殖民的身體政治。
Thumbnail
本文分析導演巴里・柯斯基(Barrie Kosky)如何運用極簡的舞臺配置,將布萊希特(Bertolt Brecht)的「疏離效果」轉化為視覺奇觀與黑色幽默,探討《三便士歌劇》在當代劇場中的新詮釋,並藉由舞臺、燈光、服裝、音樂等多方面,分析該作如何在保留批判核心的同時,觸及觀眾的觀看位置與人性幽微。
Thumbnail
本文分析導演巴里・柯斯基(Barrie Kosky)如何運用極簡的舞臺配置,將布萊希特(Bertolt Brecht)的「疏離效果」轉化為視覺奇觀與黑色幽默,探討《三便士歌劇》在當代劇場中的新詮釋,並藉由舞臺、燈光、服裝、音樂等多方面,分析該作如何在保留批判核心的同時,觸及觀眾的觀看位置與人性幽微。
Thumbnail
Quick Links Auto prompt by LLM and LLM-Vision (Trigger more details out inside model) SD-WEB-UI: https://github.com/xlinx/sd-webui-decadetw-auto-pro
Thumbnail
Quick Links Auto prompt by LLM and LLM-Vision (Trigger more details out inside model) SD-WEB-UI: https://github.com/xlinx/sd-webui-decadetw-auto-pro
Thumbnail
本文章介紹瞭如何使用AutoGPT, 一種可以自主蒐集資料生成訴求,幫你與ChatGPT在互動中提出一連串的問題,來解決你的問題。對於安裝時的常見問題也進行了解答,並提供了使用的步驟以及目前的解決方式。
Thumbnail
本文章介紹瞭如何使用AutoGPT, 一種可以自主蒐集資料生成訴求,幫你與ChatGPT在互動中提出一連串的問題,來解決你的問題。對於安裝時的常見問題也進行了解答,並提供了使用的步驟以及目前的解決方式。
Thumbnail
這堂課闡述開發ChatGPT所需的重要概念和工具。涵蓋語言模型如何處理文字(Token),LLM的兩種類型(Base LLM和Instruction tuned LLM),系統、助手和用戶的角色定義。並介紹以Prompting簡化AI開發流程,且透過實戰教學說明如何進行分類和預防注入提示
Thumbnail
這堂課闡述開發ChatGPT所需的重要概念和工具。涵蓋語言模型如何處理文字(Token),LLM的兩種類型(Base LLM和Instruction tuned LLM),系統、助手和用戶的角色定義。並介紹以Prompting簡化AI開發流程,且透過實戰教學說明如何進行分類和預防注入提示
Thumbnail
本文介紹了大型語言模型(LLM)中Prompt的原理及實踐,並提供了撰寫Prompt的基本框架邏輯PREP,以及加強Prompt撰寫的幾個方向:加強說明背景、角色描述和呈現風格,加強背景說明,角色描述,呈現風格以及目標受眾(TA)。同時推薦了幾個Prompt相關的參考網站。最後解答了一些快問快答。
Thumbnail
本文介紹了大型語言模型(LLM)中Prompt的原理及實踐,並提供了撰寫Prompt的基本框架邏輯PREP,以及加強Prompt撰寫的幾個方向:加強說明背景、角色描述和呈現風格,加強背景說明,角色描述,呈現風格以及目標受眾(TA)。同時推薦了幾個Prompt相關的參考網站。最後解答了一些快問快答。
Thumbnail
這是一場修復文化與重建精神的儀式,觀眾不需要完全看懂《遊林驚夢:巧遇Hagay》,但你能感受心與土地團聚的渴望,也不急著在此處釐清或定義什麼,但你的在場感受,就是一條線索,關於如何找著自己的路徑、自己的聲音。
Thumbnail
這是一場修復文化與重建精神的儀式,觀眾不需要完全看懂《遊林驚夢:巧遇Hagay》,但你能感受心與土地團聚的渴望,也不急著在此處釐清或定義什麼,但你的在場感受,就是一條線索,關於如何找著自己的路徑、自己的聲音。
Thumbnail
本文詳述如何將大型語言模型(LLM)與程式碼深度整合,運用於3C賣場的客服助理示例,透過接收並解析使用者訊息,提取產品資訊,並與後端產品資料庫整合。接著,將整合資訊回傳給LLM生成最終回應訊息。同時,也探討了中英文理解差距及解決方法,並展示如何利用Python模擬資料庫提取詳細資訊。
Thumbnail
本文詳述如何將大型語言模型(LLM)與程式碼深度整合,運用於3C賣場的客服助理示例,透過接收並解析使用者訊息,提取產品資訊,並與後端產品資料庫整合。接著,將整合資訊回傳給LLM生成最終回應訊息。同時,也探討了中英文理解差距及解決方法,並展示如何利用Python模擬資料庫提取詳細資訊。
Thumbnail
你是不是每次用 ChatGPT 都要重新解釋一次需求?但其實,你可以 提前寫好專屬 Prompt(提示詞),讓 AI 一秒進入狀態,不用每次重頭開始!所以那個 Prompt(提示詞) 該怎麼寫?進來看看吧!
Thumbnail
你是不是每次用 ChatGPT 都要重新解釋一次需求?但其實,你可以 提前寫好專屬 Prompt(提示詞),讓 AI 一秒進入狀態,不用每次重頭開始!所以那個 Prompt(提示詞) 該怎麼寫?進來看看吧!
Thumbnail
背景:從冷門配角到市場主線,算力與電力被重新定價   小P從2008進入股市,每一個時期的投資亮點都不同,記得2009蘋果手機剛上市,當時蘋果只要在媒體上提到哪一間供應鏈,隔天股價就有驚人的表現,當時光學鏡頭非常熱門,因為手機第一次搭上鏡頭可以拍照,也造就傳統相機廠的殞落,如今手機已經全面普及,題
Thumbnail
背景:從冷門配角到市場主線,算力與電力被重新定價   小P從2008進入股市,每一個時期的投資亮點都不同,記得2009蘋果手機剛上市,當時蘋果只要在媒體上提到哪一間供應鏈,隔天股價就有驚人的表現,當時光學鏡頭非常熱門,因為手機第一次搭上鏡頭可以拍照,也造就傳統相機廠的殞落,如今手機已經全面普及,題
追蹤感興趣的內容從 Google News 追蹤更多 vocus 的最新精選內容追蹤 Google News