Google Gemini, a multimodal AI by DeepMind, processes text, audio, images, and more. Gemini scores highly on AI benchmarks, is optimized for varied devices, and has been tested for safety and bias, adhering to responsible AI practices.
Explore our extensive collection of AI research papers and machine learning manuscripts, including work on large language models (LLMs), from top academics and industry experts. Delve into the latest findings, breakthroughs, and peer-reviewed articles of 2023 for a deeper understanding of the evolving AI landscape.
Video ReTalking edits real-world talking head video according to input audio, producing a high-quality
Then transplant it to the real world to solve complex problems
LongLLaMA is a large language model designed to handle very long text contexts, up to 256,000 tokens. It's based on OpenLLaMA and uses a technique called Focused Transformer (FoT) for training. The repository provides a smaller 3B version of LongLLaMA for free use. It can also be used as a replacement for LLaMA models with shorter contexts.
Large Language and Vision Assistant
Use bank data and Ntropy's AI. Parse bank feeds and statements, extract revenue and COGS, and automatically re-create a P&L within milliseconds. Any industry, any geography.
Carbon is a unified API to connect external data to your vector databases. Build better, personalized AI applications.
LAMA utilizes a reinforcement learning framework combined with a motion matching algorithm. Reinforcement learning helps the model make appropriate decisions in various scenarios, while motion matching algorithms ensure that synthesized actions match real human actions. In addition, LAMA also utilizes the motion editing framework of manifold learning to cover various possible changes in interactions and operations.
The LLM platform to help you create AI-powered products that amaze your customers
You can train your digital twin model and generate photos through FaceChain's Python script or the familiar Gradio interface, or you can experience FaceChain directly through ModelScope Studio.
Bruinen helps companies build user-facing apps by making it easy, safe, and reliable to connect to user accounts wherever they exist online.
Replicate – Run open-source machine learning models with a cloud API
Cerelyze - Enabling engineers to rapidly reproduce scientific research
However, due to the unavailability of experts in these locations, the data has to be transferred to an urban healthcare facility (AMD and glaucoma) or a terrestrial station (e.g., SANS) for more precise disease identification.
Eosinophilic Esophagitis (EoE) is a chronic, immune/antigen-mediated esophageal disease, characterized by symptoms related to esophageal dysfunction and histological evidence of eosinophil-dominant inflammation.
Event-based motion deblurring has shown promising results by exploiting low-latency events.
MS3D++ provides a straightforward approach to domain adaptation by generating high-quality pseudo-labels, enabling the adaptation of 3D detectors to a diverse range of lidar types, regardless of their density.
Granger causal inference is a contentious but widespread method used in fields ranging from economics to neuroscience.
Efficient RGB-D semantic segmentation has received considerable attention in mobile robotics, where it plays a vital role in analyzing and recognizing environmental information.
For the at most one change point problem, we propose the use of a conceptor matrix to learn the characteristic dynamics of a specified training window in a time series.
To balance efficiency and effectiveness, the vast majority of existing methods follow the two-pass approach, in which the first pass samples a fixed number of unobserved items by a simple static distribution and then the second pass selects the final negative items using a more sophisticated negative sampling strategy.
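The two-pass approach described above can be illustrated with a minimal sketch. This is not the method of any particular paper: the function name `two_pass_negative_sample`, the uniform first-pass distribution, and the "pick the highest-scored candidate" rule in the second pass are all assumptions chosen for illustration, and `score` stands in for whatever model the recommender actually uses.

```python
import random


def two_pass_negative_sample(observed, num_items, score, num_candidates=5, rng=None):
    """Return one negative item id for a single user (illustrative sketch).

    observed       -- set of item ids the user has interacted with
    num_items      -- catalogue size; items are assumed to be 0..num_items-1
    score          -- hypothetical callable: item id -> current model score
    num_candidates -- fixed size of the first-pass candidate pool
    """
    rng = rng or random.Random()
    unobserved = [i for i in range(num_items) if i not in observed]
    if not unobserved:
        raise ValueError("user has no unobserved items")

    # Pass 1: draw a fixed number of unobserved items from a simple static
    # distribution (uniform here, as an assumption).
    pool_size = min(num_candidates, len(unobserved))
    candidates = rng.sample(unobserved, pool_size)

    # Pass 2: choose the final negative with a more model-aware rule.
    # Assumption: take the candidate the model currently scores highest,
    # i.e. the "hardest" negative in the pool.
    return max(candidates, key=score)
```

A small candidate pool keeps the second pass cheap, since the model is only evaluated on `num_candidates` items rather than the whole catalogue; that is the efficiency side of the trade-off the text mentions.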
Our codec demonstrates the potential of specialized codecs for machine analysis of point clouds, and provides a basis for extension to more complex tasks and datasets in the future.
Deep neural networks are vulnerable to universal adversarial perturbation (UAP), an instance-agnostic perturbation capable of fooling the target model for most samples.
RFL means that a recommender system can only receive feedback on exposed items from users and must update its recommender models incrementally based on this feedback.
The massive successes of large language models (LLMs) encourage the emerging exploration of LLM-augmented Autonomous Agents (LAAs).
Furthermore, we identify the aspects of deductive reasoning ability on which deduction corpora can enhance LMs and those on which they cannot.
Denoising Diffusion Models (DDM) are emerging as the cutting-edge technology in the realm of deep generative modeling, challenging the dominance of Generative Adversarial Networks.
Objective and subjective evaluations show that Phoneme Hallucinator outperforms existing VC methods for both intelligibility and speaker similarity.