AI News: GLM-4-Voice, Gemini 2.0, Gigapixel 8 and more

1. GLM-4-Voice

GLM-4-Voice is a voice model made by Zhipu. It expresses emotion in its spoken responses, so conversations feel real and natural. It works in both Chinese and English and handles interruptions smoothly. You can find it on GitHub under THUDM/GLM-4-Voice.

https://github.com/THUDM/GLM-4-Voice

2. Gemini 2.0 by Google

Google reportedly plans to release Gemini 2.0 this December. Expectations are sky-high, though some say it might not be groundbreaking. Still, it’ll be exciting to see what’s in store!

https://www.theverge.com/2024/10/25/24279600/google-next-gemini-ai-model-openai-december

3. Gigapixel 8

If you want to make your photos better, check out Gigapixel 8. It’s great for making faces clearer with its Face Recovery Gen 2 feature, which adds more detail to photos.

https://www.topazlabs.com

4. LVSM Model

LVSM (Large View Synthesis Model) generates new views of a scene from just a few input images. This one’s special because it handles perspectives and angles more naturally. The code isn’t publicly available yet, but the model’s flexibility looks promising!

https://haian-jin.github.io/projects/LVSM

5. Bee Agent Framework

The Bee Agent Framework makes building agent workflows much easier. This free, open-source tool helps you create, run, and manage workflows so tasks can be handled automatically.

https://github.com/i-am-bee/bee-agent-framework

6. DRY Sampler

The DRY (“Don’t Repeat Yourself”) Sampler is built into llama.cpp. It penalizes tokens that would extend a sequence already present in the text, so the output repeats itself less and feels new and fresh.

https://github.com/ggerganov/llama.cpp/pull/9702
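
To give a feel for how a DRY-style penalty can work, here is a minimal Python sketch. It is not the llama.cpp implementation: the function name `dry_penalties`, the integer token lists, and the parameter values (`multiplier=0.8`, `base=1.75`, `allowed_length=2`) are illustrative assumptions, and details such as sequence breakers are left out. The idea is that if the end of the text matches an earlier stretch, the token that followed that earlier stretch gets a penalty that grows with the length of the match.

```python
def dry_penalties(context, multiplier=0.8, base=1.75, allowed_length=2):
    """Return {token: penalty} for tokens that would extend a repeated run.

    Simplified sketch (assumed parameters, no sequence breakers).
    `context` is a list of token ids generated so far.
    """
    penalties = {}
    if not context:
        return penalties
    last = len(context) - 1
    for i in range(last):  # every position before the final token
        if context[i] != context[last]:
            continue
        # Count how many tokens ending at i match the tokens ending at `last`.
        n = 0
        while n <= i and context[i - n] == context[last - n]:
            n += 1
        if n >= allowed_length:
            # The token that followed the earlier occurrence would
            # continue the repetition, so it is the one to penalize.
            nxt = context[i + 1]
            penalty = multiplier * base ** (n - allowed_length)
            penalties[nxt] = max(penalties.get(nxt, 0.0), penalty)
    return penalties


def apply_dry(logits, context, **kwargs):
    """Subtract the DRY-style penalties from a {token: logit} mapping."""
    penalties = dry_penalties(context, **kwargs)
    return {tok: score - penalties.get(tok, 0.0) for tok, score in logits.items()}
```

For example, with the context `[1, 2, 3, 4, 1, 2, 3]` the last three tokens repeat the first three, so token `4` (which followed the earlier run) is the only one penalized. Longer matches lead to exponentially larger penalties, which is what discourages long verbatim loops while leaving short, natural repeats alone.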

7. Large Spatial Model (LSM)

With LSM, you can create 3D scenes fast! Using just a few images, it generates 3D models in only 0.1 seconds. It’s speedy and doesn’t even need extra inputs like camera settings or feature points.

https://largespatialmodel.github.io

https://huggingface.co/spaces/kairunwen/LSM

8. Rodin Gen-1

Rodin Gen-1 is a 3D model generator that now integrates Reinforcement Learning from Human Feedback (RLHF), which is said to improve geometry accuracy by 60%. That means more realistic and detailed 3D shapes!

https://www.hyper3d.ai

9. ComfyUI-Disty-Flow

Flow is a custom node for ComfyUI that makes it easier to create and manage image generation workflows. It’s a helpful addition that simplifies the experience.

https://github.com/diStyApps/ComfyUI-disty-Flow

10. Stable Diffusion 3.5 Fine-Tuning Tutorial

This tutorial is perfect for anyone who wants to customize how images are generated. It shows you step by step how to fine-tune Stable Diffusion 3.5 Large, helping you get better results from the model.

https://stabilityai.notion.site/Stable-Diffusion-3-5-Large-Fine-tuning-Tutorial-11a61cdcd1968027a15bdbd7c40be8c6
