Hello everyone! Today’s AI news is packed with exciting updates.
PrintMon Maker
This new service lets you turn text or images into 3D-printable models. It’s not just 3D graphics—it creates STL files (the type 3D printers use). We’ve come a long way! While 3D graphics and printable models might look similar, they are quite different in how they work. Converting between the two is tricky and often requires a lot of fixing.
https://forum.bambulab.com/t/introducing-makerlab-printmon-maker/95537
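PrintMon Maker’s internals aren’t public, but to see why printable models differ from ordinary 3D graphics, it helps to look at what an STL file actually contains: just triangles with normals, no colors, textures, or scene data. Here’s a minimal pure-Python sketch (a toy writer, not PrintMon’s actual pipeline) that serializes a triangle list to ASCII STL:

```python
# Toy illustration of the ASCII STL format: each facet is a unit normal
# plus three vertices. Real services output far denser meshes, but the
# file structure is exactly this.

def facet_normal(a, b, c):
    """Unit normal of triangle (a, b, c) via the cross product."""
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    n = [u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0]]
    length = sum(x * x for x in n) ** 0.5 or 1.0
    return [x / length for x in n]

def write_ascii_stl(triangles, name="model"):
    """Serialize triangles (each three (x, y, z) tuples) to ASCII STL text."""
    lines = [f"solid {name}"]
    for a, b, c in triangles:
        n = facet_normal(a, b, c)
        lines.append(f"  facet normal {n[0]:.6f} {n[1]:.6f} {n[2]:.6f}")
        lines.append("    outer loop")
        for v in (a, b, c):
            lines.append(f"      vertex {v[0]:.6f} {v[1]:.6f} {v[2]:.6f}")
        lines.append("    endloop")
        lines.append("  endfacet")
    lines.append(f"endsolid {name}")
    return "\n".join(lines)

# A single triangle in the XY plane is enough to see the structure.
stl_text = write_ascii_stl([((0, 0, 0), (1, 0, 0), (0, 1, 0))])
```

Notice there is nowhere to put materials or lighting, and a printable mesh must additionally be watertight, which is exactly the kind of cleanup that makes converting graphics assets to print files tricky.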
F5-TTS
F5-TTS is the latest open-source model that turns text into speech. It’s faster than older models and produces more natural, expressive voices. Unlike earlier models, you don’t need a complex setup to use it—pretty straightforward!
https://huggingface.co/papers/2410.06885
https://github.com/SWivid/F5-TTS
https://huggingface.co/SWivid/F5-TTS
SwarmUI 0.9.3
SwarmUI is a user-friendly interface for generating images using AI models like Stable Diffusion and Flux. It’s designed for both beginners and advanced users, and soon, it will support video and audio generation too. It’s still in beta, but it’s already looking promising!
https://github.com/mcmonkeyprojects/SwarmUI
FLORA
FLORA is a platform where you can access many advanced AI models in one editor. It’s not just for still images—you can also create videos, upscale images, and design your own workflows with an intuitive interface.
Ovis1.6-Gemma2-9B
This lightweight yet powerful vision model works well with the Japanese language too. You can ask questions in Japanese, and it responds in Japanese, which is great for Japanese-speaking users.
https://huggingface.co/spaces/AIDC-AI/Ovis1.6-Gemma2-9B
https://github.com/AIDC-AI/Ovis
See2Sound
See2Sound generates spatial audio to match images, animations, or videos. This means the audio changes in direction and depth, making it sound more realistic, especially on a surround sound system.
https://github.com/see2sound/see2sound
https://huggingface.co/spaces/jadechoghari/see-2-sound
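See2Sound’s actual model is in the repo above and handles full 5.1 surround; as a much simpler sketch of what “direction” means in audio, here’s constant-power stereo panning in pure Python (illustrative only, not See2Sound’s method):

```python
import math

# Toy sketch of directional audio via constant-power stereo panning.
# pan = -1.0 is hard left, 0.0 is center, +1.0 is hard right.

def pan_gains(pan):
    """Constant-power pan law: (left, right) gains from pan in [-1, 1]."""
    angle = (pan + 1.0) * math.pi / 4.0  # map [-1, 1] -> [0, pi/2]
    return math.cos(angle), math.sin(angle)

def pan_sample(sample, pan):
    """Apply panning to one mono sample, returning (left, right)."""
    gl, gr = pan_gains(pan)
    return sample * gl, sample * gr

# An object on the left side of the frame sends most energy to the left channel.
left, right = pan_sample(1.0, -0.5)
```

The “constant power” property (left² + right² = 1) keeps perceived loudness steady as a sound source moves across the stereo field—the same basic idea, generalized to more channels and depth cues, is what makes spatial audio feel realistic.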
FLUX.1-Turbo-Alpha
FLUX.1-Turbo-Alpha is an 8-step distillation LoRA for the FLUX.1-dev model. As a LoRA it’s small, and by cutting the number of sampling steps it makes generation much faster while keeping quality high—a win-win!
https://huggingface.co/alimama-creative/FLUX.1-Turbo-Alpha
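The speedup comes from needing fewer denoising steps: each step is a full forward pass through the model. As a rough illustration (a generic linear schedule, not FLUX’s actual sampler), here’s how a timestep schedule shrinks when going from a typical 25-step run to the distilled 8-step one:

```python
# Illustrative only: diffusion samplers walk a schedule of timesteps from
# high noise (t near 1000) down toward 0. A step-distilled LoRA is trained
# so that far fewer steps still yield a clean image. The spacing below is
# a generic linear schedule, not the exact one FLUX uses.

def timestep_schedule(num_steps, t_max=1000):
    """Evenly spaced timesteps from t_max down toward 0."""
    return [round(t_max - i * t_max / num_steps) for i in range(num_steps)]

baseline = timestep_schedule(25)  # a typical full-quality sampling run
turbo = timestep_schedule(8)      # the distilled 8-step run

# Each skipped step is one fewer forward pass, so the denoising loop
# runs roughly 3x faster at 8 steps than at 25.
```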
v0 Latest Update
v0 now edits files up to five times faster by only scanning for the changes that need to be made. It’s a huge time saver!
SimpleTuner v1.1.2
SimpleTuner, the fine-tuning tool for Flux models, has just been updated. It now supports mask loss training, similar to tools like OneTrainer and Kohya. There’s also a new normalization technique, which was introduced in the Dreambooth guide.
https://github.com/bghira/SimpleTuner
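Mask loss training restricts the training loss to the subject region, so the background of your training photos doesn’t get memorized. SimpleTuner’s real implementation operates on latent tensors in PyTorch; this pure-Python toy (my sketch, not its code) just shows the core arithmetic:

```python
# Sketch of the idea behind masked loss: per-pixel squared error is
# weighted by a 0/1 mask, so only masked (subject) pixels contribute,
# and the mean is taken over masked pixels only.

def masked_mse(pred, target, mask):
    """Mean squared error over only the pixels where mask == 1."""
    assert len(pred) == len(target) == len(mask)
    weighted = [(p - t) ** 2 * m for p, t, m in zip(pred, target, mask)]
    denom = sum(mask)
    return sum(weighted) / denom if denom else 0.0

# The huge error at the last (unmasked) pixel is ignored entirely.
loss = masked_mse(pred=[1.0, 2.0, 9.0],
                  target=[1.0, 1.0, 0.0],
                  mask=[1, 1, 0])
```

Dividing by the mask sum rather than the total pixel count matters: otherwise images with small masks would contribute artificially small losses.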
Pyramidflow on ComfyUI
Pyramidflow is a high-performance video generation AI, and it can now be controlled through ComfyUI via a custom node. It does require a fair amount of VRAM (usually 24GB–40GB), though there are ways to bring that usage down.
https://github.com/AIFSH/PyramidFlow-ComfyUI
Creative AI Examples
Lastly, here’s an impressive use of AI that’s been buzzing around X (formerly Twitter). Someone took a photo on their iPhone and turned it into a video with Luma AI’s image-to-video generation technology.
That’s all for today’s AI news!