How can you generate stable, consistent, high-quality, realistic images in the Xiaohongshu style?

AI Money-Making | updated 8 months ago | sevennight

This article draws on a post by TianJun@TianJunSelf.

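I. Approach

To generate stable, consistent, high-quality, realistic images in the Xiaohongshu style, we can start by optimizing the prompts fed to text-to-image models. The key steps and strategies are:

Understand the Xiaohongshu style:

Analyze the image styles popular on Xiaohongshu, including color, composition, and subject matter, and translate them into actionable prompt elements.

Build base prompts:

Based on the characteristics of the Xiaohongshu style, build a set of base prompt terms such as "bright colors", "refined lifestyle", and "natural light".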

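Use token techniques:

Drawing on Alipay's newly patented technique, use a target text token and a target style token to match content with style precisely. For example, treat "Xiaohongshu style" as a style token and combine it with a specific content token such as "coffee time" or "travel diary".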
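The pairing of a content token with a style token can be illustrated with plain string composition. This is only a minimal sketch and not the patented method itself: the `STYLE_TOKENS` table and the `compose_prompt` helper are hypothetical names introduced here, and the keyword list simply reuses the base prompt terms mentioned above.

```python
# Hypothetical lookup table: style token -> base keywords that describe it.
# The entries are illustrative assumptions, not taken from the patent.
STYLE_TOKENS = {
    "xiaohongshu style": ["bright colors", "refined lifestyle", "natural light"],
}


def compose_prompt(content_token: str, style_token: str) -> str:
    """Join a content token with a style token and its base keywords."""
    keywords = STYLE_TOKENS.get(style_token, [])
    return ", ".join([content_token, style_token, *keywords])


print(compose_prompt("coffee time", "xiaohongshu style"))
# coffee time, xiaohongshu style, bright colors, refined lifestyle, natural light
```

Keeping the style keywords in one table means the content token can change per image while the style description stays fixed.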

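Dynamic prompts:

Use dynamic prompts to expand a user's short prompt with additional modifiers, and dynamically adjust each modifier's weight and the time steps at which it is injected, so that the text prompt is refined automatically.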
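One way to picture a dynamic prompt is as a list of modifiers, each carrying a weight and the range of denoising progress during which it applies. The sketch below is an assumption about how such data could be represented, not the format used by any particular tool; `Modifier`, `expand`, and `active_at` are hypothetical names, and the example modifiers are made up.

```python
from dataclasses import dataclass


@dataclass
class Modifier:
    text: str      # the phrase added to the prompt
    weight: float  # relative emphasis of the phrase
    start: float   # fraction of denoising progress where it begins to apply
    end: float     # fraction of denoising progress where it stops applying


def expand(short_prompt: str, modifiers: list[Modifier]) -> list[Modifier]:
    """Keep the user's prompt active for the whole run and append the modifiers."""
    return [Modifier(short_prompt, 1.0, 0.0, 1.0), *modifiers]


def active_at(parts: list[Modifier], progress: float) -> list[tuple[str, float]]:
    """Return (text, weight) pairs for the parts active at the given progress."""
    return [(m.text, m.weight) for m in parts if m.start <= progress <= m.end]


parts = expand(
    "coffee time",
    [
        Modifier("soft natural light", 1.2, 0.0, 0.6),  # shapes early structure
        Modifier("film grain", 0.8, 0.5, 1.0),          # adds late detail
    ],
)
print(active_at(parts, 0.25))
print(active_at(parts, 0.75))
```

In this toy setup the lighting modifier only influences the early steps, while the texture modifier only influences the later ones.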

Keyword weighting:

Use the "(keyword: factor)" syntax to adjust a keyword's weight, emphasizing or de-emphasizing particular style features.

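Using () and []:

Use () and [] to adjust keyword strength: "(keyword)" raises the keyword's strength to 1.1x, while "[keyword]" lowers it to roughly 0.9x.

Keyword blending:

Use the "[keyword1 : keyword2: factor]" syntax to blend two keywords and control the transition from the first to the second.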
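The three syntaxes above are just string patterns, so they are easy to generate programmatically. The helpers below are a minimal sketch; the function names are hypothetical, and whether a given image-generation front end actually interprets this syntax is something to verify for the tool in use.

```python
def weighted(keyword: str, factor: float) -> str:
    """Explicit weight: (keyword:factor)."""
    return f"({keyword}:{factor:g})"


def emphasize(keyword: str, levels: int = 1) -> str:
    """Wrap the keyword in `levels` pairs of parentheses to strengthen it."""
    return "(" * levels + keyword + ")" * levels


def deemphasize(keyword: str, levels: int = 1) -> str:
    """Wrap the keyword in `levels` pairs of square brackets to weaken it."""
    return "[" * levels + keyword + "]" * levels


def blend(first: str, second: str, factor: float) -> str:
    """Transition from `first` to `second`: [first:second:factor]."""
    return f"[{first}:{second}:{factor:g}]"


prompt = ", ".join(
    [
        weighted("natural light", 1.3),
        emphasize("bright colors"),
        deemphasize("background clutter"),
        blend("sunset", "sunrise", 0.5),
    ]
)
print(prompt)
# (natural light:1.3), (bright colors), [background clutter], [sunset:sunrise:0.5]
```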

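Automatic text prompt optimization (PAE):

Use PAE to optimize text prompts automatically, improving the aesthetic quality and semantic consistency of the generated images.

Training data collection:

Use modifiers and style descriptions collected from datasets such as DiffusionDB to train an automated model for prompt expansion and fine-grained refinement.

Two-stage training:

Train with supervised fine-tuning followed by reinforcement learning to generate optimized text prompts, adding a weight and an effective time-step range to each modifier.

Multimodal conditioning:

Combine multiple modalities such as text, sketches, and poses for finer control over image generation.

Few-shot generation:

Explore methods that achieve high-quality generation from a small number of samples, reducing reliance on large-scale training data.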

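II. A high-quality Flux prompt includes:

1. Subject features: hairstyle, hair color, facial features
2. Facial expression and actions
3. Scene characteristics: indoor environment, outdoor environment, overall scene, interior details, scene setting, day or night, weather, light direction
4. Image standards: quality terms (8K, HD, High quality, Clearness, Wallpaper, Absurdres (ultra-high resolution)), style terms (Comic, Watercolor, Realistic, Abstract)
5. Negative prompts: quality (unclear, blurry, watermark), people (six fingers, crooked mouth..)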

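Of the items above:
Subject features, image standards, and negative prompts can be generated once by the LLM at the start;
the remaining items (actions and scene) are generated by the LLM for each image.
In testing, gpt-4o and Claude-3.5 gave the best results.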
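The split between parts generated once and parts generated per image could be organized as follows. This is a minimal sketch under stated assumptions: `FixedParts` and `assemble` are hypothetical names, the example strings are placeholders for what the LLM would return, and the `prompt` / `negative_prompt` keys are an assumed output shape rather than any specific tool's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FixedParts:
    """Parts generated once and reused for every image."""
    subject: str   # hairstyle, hair color, facial features
    quality: str   # quality and style terms
    negative: str  # negative prompt


def assemble(fixed: FixedParts, action: str, scene: str) -> dict:
    """Combine the reusable parts with the per-image action and scene."""
    positive = ", ".join([fixed.subject, action, scene, fixed.quality])
    return {"prompt": positive, "negative_prompt": fixed.negative}


# Placeholder values standing in for LLM output.
fixed = FixedParts(
    subject="long black hair, soft facial features",
    quality="8K, HD, High quality, Realistic",
    negative="blurry, watermark, six fingers",
)

for action, scene in [
    ("smiling, holding a cup", "cafe by the window, daytime, side lighting"),
    ("looking back, walking", "seaside street, sunset, backlight"),
]:
    print(assemble(fixed, action, scene))
```

Reusing `fixed` across images is what keeps the subject and overall quality consistent while the action and scene vary.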

III. The complete prompt for generating Flux prompts is as follows:

system_prompt = """
You are an elite AI image generation prompt engineer for Flux AI, specializing in creating high-quality, trendy Xiaohongshu (小红书) images that embody the '纯欲' (pure desire) aesthetic.

Your task: Generate a detailed Flux prompt based on the given scene, clothing style, and mood. The prompt should result in a visually stunning image that captures Xiaohongshu style and balances innocence with allure.

Key requirements:
1. Output a single, comma-separated string of English phrases.
2. The total character count must not exceed 500 characters.
3. Balance innocence and sensuality without being explicit.
4. Incorporate trendy Xiaohongshu elements and create a dreamy, aspirational atmosphere.
5. Provide vivid, specific details for photorealistic image generation.

Include these elements:
1. Overall scene and atmosphere
2. Detailed environment description
3. Clothing and accessory details
4. Character's pose and body language
5. Facial expression and gaze
6. Lighting and color palette
7. Camera angle and photography style
8. Trendy, share-worthy details

Use specific photography and post-processing terms to enhance image quality and style. Do not include explanations or separate sections.
"""

# scene, style, and mood must already be defined when this f-string is evaluated.
user_prompt = f"""
Generate a Flux AI prompt for a Xiaohongshu-style image with the following elements:
Scene: {scene}
Clothing Style: {style}
Mood: {mood}

Additional requirements:
1. The prompt should result in an image that embodies the '纯欲' (pure desire) aesthetic, balancing innocence and allure.
2. Incorporate trendy elements popular on Xiaohongshu, such as soft color palettes, dreamy lighting, and fashionable details.
3. Ensure the description is detailed enough to create a visually stunning and artistic image.
4. Include specific photography and post-processing terms to enhance the overall quality and style of the generated image.
5. Add elements that make the image stand out and be share-worthy on Xiaohongshu.

Remember to format the prompt as a single, comma-separated string without explanations or sections.
"""
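The system prompt asks for a single comma-separated string of at most 500 characters, but an LLM may not always comply, so it can help to check the reply before sending it to the image model. The helper below is a hedged sketch of such a check; `normalize_flux_prompt` is a hypothetical name, and dropping whole trailing phrases is just one possible way to enforce the limit.

```python
def normalize_flux_prompt(raw: str, max_chars: int = 500) -> str:
    """Flatten an LLM reply into one comma-separated line within max_chars.

    Newlines are treated as separators, repeated whitespace is collapsed,
    empty phrases are dropped, and trailing phrases are discarded once
    adding another one would exceed the character limit.
    """
    phrases = [" ".join(p.split()) for p in raw.replace("\n", ",").split(",")]
    phrases = [p for p in phrases if p]

    kept: list[str] = []
    for phrase in phrases:
        candidate = ", ".join(kept + [phrase])
        if len(candidate) > max_chars:
            break
        kept.append(phrase)
    return ", ".join(kept)


reply = "soft light,\n  pastel  tones, , cafe window"
print(normalize_flux_prompt(reply))
# soft light, pastel tones, cafe window
```

Cutting at phrase boundaries avoids leaving a half-written phrase at the end of the prompt.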