innovation marketplace

TECH OFFERS

Discover new technologies by our partners

Leveraging our wide network of partners, we have curated numerous enabling technologies available for licensing and commercialisation across different industries and domains. Our focus also extends to emerging technologies in Singapore and beyond, where we actively seek out new technology offerings that can drive innovation and accelerate business growth.

By harnessing the power of these emerging technologies and embracing new technology advancements, businesses can stay at the forefront of their fields. Explore our technology offers and collaborate with partners of complementary technological capabilities for co-innovation opportunities. Reach out to IPI Singapore to transform your business with the latest technological advancements.

Synthetically-generated Privacy-preserving Data for Machine Learning
Artificial Intelligence/Machine Learning (AI/ML) performance depends on training with good-quality data. However, such data is often difficult to acquire due to ethical concerns, logistical problems, high cost, data bias, and inherently poor data quality. Privacy restrictions and data regulations further compound the problem of data acquisition, restricting many organisations' long-term access to valuable historical data. Ultimately, this results in incomplete or biased data, which degrades the overall performance of trained AI/ML models.
This technology offer is a controlled synthetic data generation engine with differential privacy capability for structured (tabular) data. The engine uses conditional GANs (cGANs) coupled with optional differential privacy to synthesize data with properties similar to real data without the associated privacy risks. It learns the distribution of the input data and selects the column to generate based on this distribution. Gaussian noise is then added to the gradients to protect the privacy of the data. The technology can generate data quickly (10,000 rows by 8 columns in 8 minutes, evaluated on an Nvidia GTX 1080) and is mainly intended to generate synthetic datasets that address data scarcity, data privacy, and data augmentation.
This generative process involves the following features:
- Conditional Generative Adversarial Networks (cGANs) generate synthetic data that mimic real data
- Sensitive data is obfuscated with statistical noise and randomization
- Definable privacy levels allow adjustment between utility and data privacy (differential privacy allows Machine Learning models to be trained on synthetic tabular data and achieve results similar to models trained on real data)
- Quality Assurance (QA) component generates reports to aid the assessment of data quality and risk metrics
- APIs for rapid integration, with full customisability
This technology can be used for the following types of structured data:
- non-time series
- time series
- multi-tables
- free-text fields
It can be applied in the following use-cases:
- Data Augmentation: increase the size of your datasets without spending time procuring new data
- Data Extrapolation: extrapolate known data to generate unavailable or unknown data points
- Bias Correction: de-bias or equalize the distribution of datasets
- Targeted Generation: generate rich data, including infrequent scenarios
This synthetic data generation with differential privacy technology provides accessible privacy by design, adding privacy-preserving techniques before, during or after AI training, together with the following benefits:
- Synthetic data does not require further data sanitization, providing a safe data sandbox environment
- Reduces the need to pay for additional datasets by generating missing data or de-biasing existing datasets
- Overcomes the challenges of data acquisition by enriching real data with synthetic data through controlled generation
- Synthetically generated data become your data assets, with potential for monetization as new revenue streams
- Protects real data by combining made-up data points, making it harder to distinguish what is real even if data is compromised
- Indefinite retention time without associated compliance risks, and full accessibility to rich statistical data to boost AI/ML model resilience and performance
The technology owner is looking to collaborate with technology partners in the field of AI/ML to co-develop new products/services, and for collaborators to test-bed in pilot projects.
Keywords: data, generation, privacy preserving, definable privacy, machine learning, synthetic data
Categories: Infocomm, Security & Privacy, Data Processing
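The gradient-noise step described for this synthetic data engine can be illustrated with a minimal sketch. The listing only states that Gaussian noise is added to the gradients; the per-example clipping step, the function names, and the `max_norm` and `noise_multiplier` parameters below are assumptions borrowed from common differentially private training practice, not details of the offered product.

```python
import math
import random


def clip_gradient(grad, max_norm):
    """Scale a gradient vector so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm <= max_norm or norm == 0.0:
        return list(grad)
    scale = max_norm / norm
    return [g * scale for g in grad]


def private_mean_gradient(per_example_grads, max_norm, noise_multiplier, rng):
    """Clip each example's gradient, sum, add Gaussian noise, then average.

    noise_multiplier is a hypothetical privacy knob: larger values add
    more noise (more privacy, less utility).
    """
    dim = len(per_example_grads[0])
    summed = [0.0] * dim
    for grad in per_example_grads:
        for i, g in enumerate(clip_gradient(grad, max_norm)):
            summed[i] += g
    sigma = noise_multiplier * max_norm  # noise scale tied to the clipping bound
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [v / len(per_example_grads) for v in noisy]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
grads = [[3.0, 4.0], [0.3, 0.4], [-6.0, 8.0]]  # made-up per-example gradients
no_noise = private_mean_gradient(grads, max_norm=1.0, noise_multiplier=0.0, rng=rng)
with_noise = private_mean_gradient(grads, max_norm=1.0, noise_multiplier=1.0, rng=rng)
print(no_noise, with_noise)
```

Clipping bounds how much any single record can influence the update, which is what lets a fixed noise scale translate into an adjustable trade-off between utility and privacy.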
Lixiviant for High Efficiency Extraction of Palladium (Pd) from Electronic Waste
Electronic waste (E-waste) generation is growing exponentially due to the ever-increasing demand for electrical and electronic equipment (EEE) driven by industrial revolution and development. The COVID-19 crisis has further accelerated the shift towards digital transformation, contributing to an upsurge in E-waste generation. To date, industrial practices for extracting palladium (Pd) from electronic waste and mining ores rely on hydrometallurgy techniques using highly corrosive acids, typically aqua regia at elevated temperature. The process poses severe hazards to workers and leads to environmental pollution. Aqua regia's ability to dissolve many different metals also results in low selectivity for Pd. Despite ongoing efforts to develop alternative methods, these often prove impractical for industrial adoption.
The technology provider has developed a proprietary lixiviant capable of extracting palladium up to 4,000 ppm at saturation with high extraction efficiency and selectivity within 12 hours. This lixiviant is easy to use, cost-effective, and significantly less corrosive and hazardous than the reagents used in current industrial practice. Substituting fuming aqua regia with this lixiviant could enhance the protection of workers and environmental safety. Importantly, the proposed technology is highly compatible with existing hydrometallurgy processes, eliminating the need for companies to change their current infrastructure. An E-waste industry partner has successfully conducted a pilot-scale (5-litre) evaluation, validating the effectiveness and applicability of the lixiviant on their Pd-coated samples. The technology provider is actively seeking industry partners interested in test-bedding and licensing this technology.
Key features:
- Low cyanide concentration (< 50 ppm) stabilized in alkaline solution
- Optimal operating temperature of 90°C
- High selectivity (> 86%) and high extraction rate (> 86%) of palladium
- Cost-effective at ≤ USD 2.12/L, extracting up to 4,000 ppm palladium at saturation within 12 hours
- Easy adoption and high compatibility with existing industrial hydrometallurgy systems
- Improved workplace safety and health, better protecting workers and the environment
Potential applications:
- Electronic wastes, such as Pd-coated connectors, Pd-coated wire bonding, etc.
- Pd-coated industrial wastes
- Recovered palladium can be further refined for resale and reuse
In recent years, many countries have required electronics manufacturers to establish producer recycling programs and have banned E-waste disposal in landfills. E-waste contains precious metals, such as palladium, gold and silver, that are highly sought after by E-waste recycling companies due to their scarcity, high value and demand, and that have been actively traded as commodities over recent decades. The extraction of precious metals from E-waste is not only commercially attractive but also aligns with Corporate Social Responsibility and Environmental, Social, and Governance goals for resource recovery and environmental protection. The global E-waste market size was valued at USD 52.6 billion in 2022 and is expected to expand at a compound annual growth rate (CAGR) of 12.1% from 2022 to 2032, to reach USD 160.2 billion (Market.us, 2023).
The proposed technology features a proprietary lixiviant capable of extracting palladium up to 4,000 ppm at saturation with high extraction efficiency (≥ 89%) and high purity (≥ 92%). This cost-effective lixiviant is significantly less hazardous than the reagents used in current industrial practice, thus better protecting workplace safety and health. Notably, the technology is compatible with existing hydrometallurgy processes and has been successfully verified at pilot scale (5-litre) in collaboration with an industry partner.
Keywords: Hydrometallurgy, palladium recovery, palladium extraction, palladium recycling, precious metal recovery, precious metal extraction, precious metal recycling, electronic waste (E-waste) recycling, electronic waste treatment
Categories: Chemicals, Catalysts, Waste Management & Recycling, Industrial Waste Management
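As a quick check on the market figures quoted for this offer, the compound growth arithmetic can be sketched as below. The sketch assumes ten full compounding years between 2022 and 2032; the function names are illustrative only, and the cited report may use a different base-year convention.

```python
def project_value(start_value, annual_rate, years):
    """Value after compounding start_value at annual_rate for the given years."""
    return start_value * (1.0 + annual_rate) ** years


def implied_cagr(start_value, end_value, years):
    """Constant annual growth rate that takes start_value to end_value."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Figures quoted in the listing (USD billion); the 10-year span is an assumption.
projected_2032 = project_value(52.6, 0.121, 10)
implied_rate = implied_cagr(52.6, 160.2, 10)
print(round(projected_2032, 1), round(implied_rate * 100, 2))
```

Under the ten-year assumption, the projection lands a few billion above the quoted USD 160.2 billion and the implied rate slightly below 12.1%; the gap may come from rounding or from how the source counts the forecast period.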
Cost-Effective Protective Coating Enhancing Durability of Electrode Catalyst
Electrolysis has diverse applications across various sectors, such as household and industrial electrolyzed water treatment, soda electrolysis, electrolytic plating, electrodeposition, and hydrogen generation. In electrolysis using insoluble electrodes, the electrocatalyst that acts as the reaction field for the electrode reaction undergoes gradual abrasion. Given the high cost of the precious metals (i.e., platinum group compounds) used as catalysts, protecting the catalyst and reducing the wear rate are crucial for extending electrode lifetime and reducing maintenance cost. Current technologies include multilayer electrodes that have a surface layer of noble metal oxide on the electrocatalyst to reduce catalyst wear. However, this method is more expensive than ordinary insoluble electrodes, and the surface layer cannot be recoated.
To address this challenge, the technology owner has developed a proprietary protective coating that effectively protects the catalyst on the surface of existing insoluble electrodes. This inexpensive coating reduces catalyst consumption and electrode replacement frequency. The electrode can be reused after recoating, which also contributes to a circular economy. The technology owner is seeking R&D collaboration with industrial partners such as electrode manufacturers, coating manufacturers, and companies utilising insoluble electrodes in electrolysis, especially electrolytic plating and metal recovery.
This unique coating, made of special silicone and conductive particles, can be applied to the catalyst surface and cured to reduce catalyst wear.
Key features of this technology include:
- Improved electrode durability: double the replacement interval
- Excellent chemical resistance: capability to withstand harsh liquids such as strong acids and strong alkalis
- Optimal performance: good heat resistance, conductivity, and adhesion to the base material
- Efficient development: shorter development time and lower implementation cost compared to alternative methods such as electrolytic control and diamond coating
- Cost-effective solution: reduce maintenance cost and utilisation loss in the upstream process of electrolysis
- Circular economy contribution: reusable by recoating the electrode
This technology can be used in handling harsh liquids such as strong acids and strong alkalis, addressing the challenge of electrode durability. It is mainly intended for the recovery of metals through electrolysis, especially targeting aqueous solutions containing metal ions. This is particularly useful for processes such as electrolytic plating and etching effluents in semiconductor manufacturing. In the future, the technology owner is also exploring the potential applications of this technology in water electrolysis electrodes and the use of conductive coatings beyond electrodes.
Key benefits:
- Double the lifetime of the electrode using an inexpensive coating
- Can be reused by recoating the electrode
- Reduce the replacement frequency and maintenance cost
- Adaptable to existing coating (painting) facilities without modification
Keywords: Coatings, Electrode Catalyst, Electrolysis, Metal Recovery, electrolytic plating, recoating, reused
Categories: Chemicals, Coatings & Paints, Manufacturing, Chemical Processes, Sustainability, Circular Economy
DNA Test Kit for On-site Diagnostics of Tropical Crop Diseases
Fast crop disease management is important to ensure sustainable production. Many tropical crops suffer from infectious diseases that spread and kill plantations. Previously, new land had to be allocated to replant crops in disease-free areas. This is now more challenging because land conversion implies deforestation. Thus, one way to improve both production and sustainability metrics is to test for infection before moving non-infectious material (i.e. in nurseries). However, as PCR testing in tropical countries is more challenging due to logistics and other factors, on-site testing is the preferred option.
This technology is a unique, portable, self-administered DNA detection kit used directly on-site to test for the DNA of the pathogen (virus, fungus, etc.). Developed in Switzerland, the technology has already been demonstrated in one use case, cocoa testing in West Africa, and is shipped in the country without a cold chain. The average development time needed to create a custom DNA test kit for a specific crop disease is 4 weeks.
The developed kit will encompass sample preparation, isothermal amplification, and detection, with a small device capturing data, timestamp and GPS location. It is simple to use and needs no technical equipment or methods: no spin-columns, no centrifuges, no thermo-cycler, no gel electrophoresis. A non-technician can pick up the skillset in a day. Average hands-on time with the DNA test kit is 5 minutes, and results can be obtained within 1 hour. After sample preparation, the device has been shown to be robust enough to tolerate agitation (e.g. in a moving car in a jungle), minimising waiting time on-site.
The technology can be applied in:
- Integrated Pest Management
- Breeding: identifying DNA markers for beneficial traits of novel varieties
- Commodity Trading: predicting crop harvest quantities, for companies working with tropical crops like cocoa, coffee, tea, rice, banana or cassava
Test data aggregation/intelligence services can also be developed to support the development of novel breeds/varieties, deployment of novel agro-forestry protocols, production of pathogen-free planting material, longer-term yield forecasts from identification of asymptomatic infected farms, and sale of precision chemicals (biostimulants, fertilizers, pesticides) based on test result evidence. The DNA testing technology, in theory, also applies to animal diseases (e.g. in aquaculture). This market area has not been developed by the company but can be explored on a preliminary basis.
Market trends:
- Paradigm shift in the food industry, with customers expecting to see displays of sustainability, welfare, quality and ethics to help inform and validate their food choices
- Global movement to hold food companies accountable for their supply chains has uniquely positioned testing, inspection and certification as a core solution that can be achieved with this technology
- According to the Food and Agriculture Organization (FAO), plant diseases cost the global economy around $220 billion each year
- There is significant upside potential to expand the technology capabilities to other identified growth markets
Key benefits:
- Tests are done on-site in tropical conditions (temperature, humidity) without the strict cold chain that traditional methods like PCR require, allowing for first-mile testing
- Early detection of infections in asymptomatic crops reduces the disease impact on production yield and sustainability metrics (less deforestation)
Categories: Life Sciences, Agriculture & Aquaculture
Generative AI Technology for Business Process Automation and Customer Engagement Improvement
Enterprises are constantly looking for ways to improve operational efficiency and reduce costs. Traditional automation has limitations, especially when it comes to tasks requiring creativity or complex decision-making. Generative AI has emerged as a transformative technology that addresses a variety of pain points faced by enterprises across industries. This technology solution integrates large language models (LLMs) and Generative AI functions with existing infrastructure, enhancing AI's impact by automating the flow of information and standardizing AI usage within the enterprise. This empowers customer support and operations teams to provide quick and accurate responses, significantly improving service delivery and operational efficiency.
This Generative AI technology solution is powered by a combination of technologies and methodologies to ensure a high level of customer engagement, personalization, and efficiency. The key technology components, and how they work together, are:
1. Natural Language Processing (NLP) and Understanding (NLU)
   Functionality: These AI components are the core of the chatbot's ability to understand human language. NLP breaks down and interprets the user's input (text or voice), while NLU comprehends the intent behind the input.
   How It Works: When a customer sends a message, NLP and NLU analyze the text to grasp the query's context and intent. This understanding allows the chatbot to generate an appropriate response.
2. Machine Learning (ML)
   Functionality: ML algorithms enable the chatbot to learn from interactions and improve responses over time. They analyze patterns in data to predict and enhance future conversations.
   How It Works: Through continuous training on customer interactions, the chatbot becomes better at predicting user needs and personalizing responses, thereby improving engagement and satisfaction.
3. Integration APIs
   Functionality: APIs allow the chatbot to interact with external systems and databases, enabling it to retrieve and update information in real time.
   How It Works: When a customer asks a question requiring specific data (e.g., account balance), the chatbot uses APIs to fetch the relevant information from the backend systems and deliver it to the user.
4. Sentiment Analysis
   Functionality: Sentiment analysis technology assesses the emotional tone behind a user's message, helping the chatbot to tailor its responses more empathetically.
   How It Works: By analyzing the sentiment of the user's text, the chatbot can adjust its tone and responses to better align with the user's emotional state, enhancing the quality of engagement.
The technology can be applied across various domains such as customer service, HR recruitment, and internal operations efficiency. Its applications include:
- enhancing customer interaction through WhatsApp and omnichannel chatbots
- supporting staff with AI-driven tools for operational efficiency
- tailoring GPT models for industry-specific needs and customized requirements
- automating email categorization
- deriving insights from data analytics and customer feedback
These applications aim to streamline processes, personalize customer engagement, and optimize decision-making through data-driven insights.
The unique value proposition lies in a comprehensive suite of AI-driven solutions designed to automate and enhance both customer engagement and internal operations. The technology owner's offerings range from WhatsApp messaging for improved customer interaction and omnichannel AI chatbots to specialized AI for HR and staff support and industry-specific GPT models. The technology owner focuses on personalizing customer experiences, streamlining recruitment processes, and delivering actionable insights through data analytics, positioning itself as a versatile AI partner for businesses looking to leverage advanced technologies for operational efficiency and customer satisfaction.
Categories: Infocomm, Artificial Intelligence
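The way the chatbot components listed for this offer hand work to one another (intent understanding, a backend lookup through an API, and sentiment-aware tone) can be sketched as a toy pipeline. This is only an illustration: the keyword sets, the `FAKE_BACKEND` dictionary and all function names are hypothetical stand-ins, and a real deployment would use LLM or ML models and live APIs rather than keyword rules.

```python
# Hypothetical keyword rules standing in for a trained NLU model.
INTENT_KEYWORDS = {
    "check_balance": {"balance", "funds"},
    "order_status": {"order", "delivery", "shipment"},
}
NEGATIVE_WORDS = {"angry", "terrible", "late", "frustrated", "bad"}
POSITIVE_WORDS = {"thanks", "great", "love", "good"}

# Hypothetical stand-in for backend systems reached through integration APIs.
FAKE_BACKEND = {
    "check_balance": "Your balance is 120.50.",
    "order_status": "Your order ships tomorrow.",
}


def tokenize(text):
    """Lowercase the message and strip basic punctuation from each word."""
    return [word.strip(".,!?").lower() for word in text.split()]


def detect_intent(tokens):
    """Return the first intent whose keywords appear in the message."""
    for intent, keywords in INTENT_KEYWORDS.items():
        if keywords & set(tokens):
            return intent
    return "unknown"


def sentiment_score(tokens):
    """Positive-word count minus negative-word count."""
    positive = sum(token in POSITIVE_WORDS for token in tokens)
    negative = sum(token in NEGATIVE_WORDS for token in tokens)
    return positive - negative


def respond(message):
    """Run intent detection, backend lookup and tone adjustment in sequence."""
    tokens = tokenize(message)
    intent = detect_intent(tokens)
    score = sentiment_score(tokens)
    answer = FAKE_BACKEND.get(intent, "Could you tell me more about what you need?")
    prefix = "I'm sorry for the trouble. " if score < 0 else ""
    return {"intent": intent, "sentiment": score, "reply": prefix + answer}


result = respond("My order is late and I am frustrated!")
print(result)
```

Keeping each stage as a separate function mirrors the component breakdown in the listing and makes it easy to swap a rule-based stage for a model-backed one.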
Effective and Versatile Deodorant Solution for Odor Removal
Odor generation presents significant challenges in various aspects of daily life, with unpleasant smells arising from sources such as toilets, kitchens, pets, tobacco, hospitals, and transportation. These unwanted odors have a detrimental impact on individual well-being, social interactions, and overall environmental quality. Deodorants play a crucial role in addressing these challenges, fostering a more comfortable and hygienic environment. However, conventional deodorants primarily rely on masking unwanted odors with a strong fragrance, resulting in a slow and ineffective deodorization process, particularly against strong smells.
The technology owner has developed a proprietary formulation that offers an effective deodorization approach. Unlike common deodorants, a deodorant using the proprietary formulation removes the sources of unpleasant smells through chemical reactions. It demonstrates remarkable efficiency against a broad spectrum of odors, including those from rotting fish and meat, rotting eggs and milk, rotting vegetable waste, ammonia in toilets, sweat, and body odor. This innovative solution has the potential to revolutionise odor control across diverse scenarios. The technology owner is seeking R&D collaboration with industrial partners who are interested in incorporating this deodorant into their products and applications.
Compared to conventional deodorants, this deodorant quickly interacts with unpleasant odor molecules, immediately enveloping, degrading, and neutralizing them to eliminate the surrounding odor.
Key features of this technology are:
- Universally effective against the four major malodors (i.e., ammonia, trimethylamine, methyl mercaptan, and hydrogen sulfide)
- Distinctive technique utilising zinc ions to decompose hydrogen sulfide, the source of putrefaction and fecal odor
- Effectively decomposes human body odor and pet odor by using inorganic salts
- Reliable and efficient deodorization with a high deodorizing rate
This innovative deodorant can be used in many situations since it is universally effective against the major odors in daily life. Potential scenarios include (but are not limited to):
- Transportation: public transportation or private cars. It effectively neutralises unpleasant odors during long trips, ensuring a comfortable space for passengers.
- Medical institutions: hospitals and clinics. It eliminates various odors that occur in health care facilities, maintaining a comfortable environment for patients and staff.
- Hotels and accommodation: hotel rooms, shared spaces, and the entire accommodation. It provides a clean and comfortable environment, accommodating different guest preferences.
- Educational institutions: school and university classrooms, libraries, and common areas. It delivers safe and effective deodorizing effects for a diverse population, including youth.
- Event venues: indoor and outdoor events, concerts, and sporting occasions. It is particularly useful for odor control in places where many people gather.
Key benefits:
- Effective deodorization against four major odors
- High safety for human health
- Low price despite its high effectiveness
- Customisable to meet different specifications
Keywords: Deodorization, Environment, Housing, Public, Odor
Categories: Materials, Composites, Chemicals, Additives, Sustainability, Sustainable Living
Tactile and Temperature Sensing Electronic Skin for Healthcare and Cosmetic Applications
The human skin is the largest organ of the body, capable of extremely sensitive sensing and functional characteristics including elasticity, mechanical resistance and self-healing, owing to its different mechano-receptors and sensory nerves. Electronic skin (e-skin), or synthetic skin, is a thin electronic material that simulates the characteristics of the skin, making it suitable for applications in prosthetics, robotics, wearable devices and percutaneous drug delivery systems. This patented technology is an e-skin with tactile, pain and temperature sensing, capable of concurrently differentiating various mechanical forces, heat or moisture. It is a promising technology for healthcare applications.
Currently, the majority of healthcare sensors on the market are rigid and cover only small areas, which makes them difficult to use in portable and wearable applications over large surface areas. This thin-film flexible electronic skin can detect the pressure and temperature applied to it. The skin's electrical resistance varies with applied pressure and temperature, so by measuring that resistance, the applied pressure and temperature can be derived. The skin can be made stretchable to cover irregular curved surfaces. These features address the drawbacks of rigid sensors for healthcare applications. The technology owner is looking for collaborators in the medical and robotics sectors and for potential opportunities outside healthcare, such as beauty and cosmetics.
Specifications:
- Skin size, shape, density: customizable
- Pressure and temperature detection ranges: customizable (up to 5000 kPa and 120°C)
- Single sensor repeatability: less than 10%
- Thickness: less than 1 mm
- Communication port: via digital IO, UART, USB, Bluetooth, and Wi-Fi
- Data storage: SD card or other storage media
- Working voltage: DC 3-5 V, or customizable
The electronic skin can be:
- Embedded in an insole for fall risk warning, fall detection, gait analysis, and foot and leg abnormality detection
- Embedded in a rehabilitation glove for finger gripping strength assessment
- Embedded in a surgical glove, robot end-effector and body for tactile sensing and force feedback control
- Embedded in a bed for bed sore prevention
- Covered on an artificial limb for pressure, temperature, and collision sensing
- Deployed in a shower room or at the bedside for fall detection
- Used for teeth alignment and tongue muscle strength measurement
- Used for training doctors to operate surgical robots in AR, MR and metaverse environments
Wearable electronic devices with skin-like properties will enable various applications for monitoring human physiological signals such as body pressure, temperature, motion, and disease-related signals.
Key benefits:
- Low cost
- Customizable and durable electronic skins based on requirements
- Compared with rigid sensors, these electronic skins have soft surfaces, can be made in large sizes, and can cover various flat and curved surfaces
- Possible to develop an interface to connect the e-skin to the human brain or spinal cord
- API under Windows, Linux, Android, and iOS to facilitate development of various applications
Keywords: Electronic skin, Tactile sensing, Pressure mapping, Temperature mapping
Categories: Electronics, Sensors & Instrumentation, Personal Care, Cosmetics & Hair, Healthcare, Medical Devices, Infocomm, Internet of Things
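The idea that applied pressure can be derived from the e-skin's measured electrical resistance can be sketched as follows. Everything specific here is an assumption for illustration: the voltage-divider readout, the fixed resistor value, and the `CALIBRATION` table are hypothetical and do not come from the technology owner. The sketch also treats temperature as constant, whereas the listing notes that resistance depends on both pressure and temperature.

```python
def sensor_resistance(v_in, v_out, r_fixed):
    """Resistance of the sensing element in a voltage divider.

    Assumes v_out is measured across the fixed resistor r_fixed, with the
    sensing element in series between v_in and that resistor.
    """
    if not 0.0 < v_out < v_in:
        raise ValueError("v_out must lie strictly between 0 and v_in")
    return r_fixed * (v_in / v_out - 1.0)


# Hypothetical calibration points: (resistance in ohms, pressure in kPa),
# ordered from highest to lowest resistance.
CALIBRATION = [(10000.0, 0.0), (5000.0, 100.0), (2000.0, 500.0), (1000.0, 1000.0)]


def pressure_from_resistance(resistance, table=CALIBRATION):
    """Piecewise-linear interpolation of pressure from resistance."""
    if resistance >= table[0][0]:
        return table[0][1]
    if resistance <= table[-1][0]:
        return table[-1][1]
    for (r_hi, p_lo), (r_lo, p_hi) in zip(table, table[1:]):
        if r_lo <= resistance <= r_hi:
            fraction = (r_hi - resistance) / (r_hi - r_lo)
            return p_lo + fraction * (p_hi - p_lo)
    raise ValueError("calibration table is not sorted by descending resistance")


resistance = sensor_resistance(v_in=3.3, v_out=1.1, r_fixed=3750.0)
pressure = pressure_from_resistance(resistance)
print(resistance, pressure)
```

Separating pressure from temperature in practice would need a second, independent measurement or a temperature-compensated calibration, which this sketch leaves out.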
Data Centre Electrical Asset Monitoring Platform
Driving sustainability, efficiency and carbon reduction in data centres is a complex and increasingly challenging requirement. The increased global use of high-definition video streaming, conversational AI modelling technologies and online meeting platforms puts increasing strain on data centres. Meeting these challenges requires an AI-based, data-driven solution.
The proprietary solution proposed here is a data acquisition and analytics system designed for deployment in data centres. It employs non-intrusive clip-on current transformers, easily installed at electrical distribution boards, which continuously gather current signature information at a high sampling rate. This enables AI algorithms to detect subtle changes and patterns in the electrical signature of each connected asset or device.
Monitoring electrical assets has traditionally been complex and costly, requiring multiple sensors and expensive systems, often deployed close to the asset or device being monitored. This has led to widespread under-monitoring, resulting in expensive maintenance and significant energy inefficiencies.
The solution extracts a proprietary set of deep energy data from electrical devices such as uninterruptible power supplies (UPSs), chillers, power distribution units (PDUs) and air conditioning, and can be easily installed on both new and existing infrastructure. It offers real-time monitoring and reporting on important metrics such as power usage effectiveness (PUE) and enables automation of sustainability reporting.
This technology offers an industry-changing solution: a non-intrusive, cost-efficient, AI-powered monitoring system that is easy to install. It generates a proprietary data set that fuels machine learning algorithms, enhancing efficiency and reducing total cost of ownership for data centre managers and owners.
The technology owner is seeking opportunities to demonstrate the capabilities in a data centre environment, preferably based in Singapore.
Key features:
- Only one current transformer is required for each device, greatly reducing cost and increasing reliability. The proprietary current transformers are easily clipped onto electrical circuitry. The system can be installed in new data centres or retrofitted into existing ones, and operates from its own independent network. Installation can be done by a locally qualified electrician. Additionally, fully assembled rack-mounted solutions with a simple plug-in feature are available for server rack infrastructure.
- High-frequency electrical signature collection. The current transformer sensors are tethered to electrical circuits. These sensors acquire high-frequency electrical data, which is then fed into the intelligent monitoring system. The system has specialised machine learning algorithms designed to provide valuable insight into the unique challenges of the built environment. Multiple data points are analysed based on power quality, asset condition, electrical safety, arcing, and carbon reduction.
- Proprietary hardware/software platform to make data acquisition and installation as unintrusive, easy and cost-effective as possible.
- Web console for easy data visualization and open API for integration with other systems.
- Growing knowledge base and algorithm library to add value to the unique building environment.
- Dedicated in-house data solutions team with exceptional data science expertise that can understand and solve the bespoke challenges of specific buildings and assets.
- All data is also made available for direct download and local processing via a comprehensive Application Programming Interface (API).
Opportunities provided by the system:
- Power quality monitoring: voltage, phase balance, power factor
- Electrical device condition monitoring for predictive maintenance
- Fault prediction and detection for maximising availability
- Energy optimisation, cost savings and carbon footprint reduction
- Arc detection capabilities for identification of fire hazards
- Real-time warnings and notifications
The system is ideally suited to the complex and ever-increasing demands of data centres. Driving efficiencies in these environments, monitoring asset longevity and procurement, ESG reporting and staff efficiency are critical in this expanding sector.
The solution gathers an unprecedented level of data, simultaneously monitoring thousands of different data points at any one time. This granularity provides a level of insight not hitherto deployed at scale in most sectors. Alternative technologies, such as conventional sensors, can be costly, typically require regular configuration, and are not always part of a scalable solution: tasks such as condition-based monitoring have to be done on a site-by-site basis as opposed to following a learn-and-deploy model.
Keywords: Net Zero, Condition-Based Monitoring, Carbon Reduction Technology, Digital Transformation, Data Acquisition Platform, Data Analysis Platform, Machine Learning Algorithms, AI technology, Digital Insights, Fault Prediction, Power Factor, ESG Reporting, Energy Optimisation, Power Quality
Categories: Green Building, Sensor, Network, Building Control & Optimisation
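Two of the metrics named for this monitoring platform, power factor and power usage effectiveness (PUE), follow standard definitions that can be sketched from sampled waveforms and energy totals. The sample waveforms, amplitudes and energy figures below are made up for illustration, and the availability of voltage samples alongside the clip-on current measurements is an assumption; the platform's own algorithms are proprietary and not shown here.

```python
import math


def rms(samples):
    """Root-mean-square value of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def power_factor(voltage, current):
    """Real power divided by apparent power for paired samples."""
    real_power = sum(v * i for v, i in zip(voltage, current)) / len(voltage)
    apparent_power = rms(voltage) * rms(current)
    return real_power / apparent_power


def pue(total_facility_kwh, it_equipment_kwh):
    """Power usage effectiveness: total facility energy over IT energy."""
    return total_facility_kwh / it_equipment_kwh


# One synthetic mains cycle with the current lagging the voltage by 60 degrees.
n = 1000
lag = math.pi / 3
voltage = [325.0 * math.sin(2 * math.pi * k / n) for k in range(n)]
current = [10.0 * math.sin(2 * math.pi * k / n - lag) for k in range(n)]

pf = power_factor(voltage, current)
site_pue = pue(total_facility_kwh=1500.0, it_equipment_kwh=1000.0)
print(round(pf, 3), site_pue)
```

For purely sinusoidal signals the power factor reduces to the cosine of the phase lag; high-rate sampling matters because real loads distort the waveform, which this simple case does not capture.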
SeaLLMs - Large Language Models for Southeast Asia
Despite the remarkable achievements of large language models (LLMs) in various tasks, there remains a linguistic bias that favors high-resource languages, such as English, often at the expense of low-resource and regional languages. To address this imbalance, we introduce SeaLLMs, an innovative series of language models that specifically focuses on Southeast Asian (SEA) languages. SeaLLMs are built upon the Llama-2 model and further advanced through continued pre-training with an extended vocabulary, specialized instruction and alignment tuning to better capture the intricacies of regional languages. This allows them to respect and reflect local cultural norms, customs, stylistic preferences, and legal considerations.
Highlights:
- The models' attunement to local norms and legal stipulations, validated by human evaluations, establishes SeaLLMs as not only a technical breakthrough but also a socially responsive innovation.
- SeaLLM-13b models exhibit superior performance across a wide spectrum of linguistic tasks and assistant-style instruction-following capabilities relative to comparable open-source models.
- SeaLLMs outperform mainstream commercialized models for some tasks in non-Latin languages spoken in the region, while being efficient, faster, and cost-effective compared to commercialized models.
The SeaLLMs underwent supervised finetuning (SFT) and specialized self-preferencing alignment using a mix of public instruction data and a small number of queries used by SEA language native speakers in natural settings, which adapt to the local cultural norms, customs, styles and laws in these areas. SeaLLM-13b models also outperform other mainstream commercialized models in tasks involving very low-resource non-Latin languages spoken in the region, such as Thai, Khmer, Lao, and Burmese.
Training Process
Our pre-training data consists of a more balanced mix of unlabeled free-text data across all SEA languages. We conduct pre-training in multiple stages. Each stage serves a different specific objective and involves dynamic control of the (unsupervised and supervised) data mixture, as well as data specification and categorization. We also employ novel sequence construction and masking techniques during these stages.
Our supervised finetuning (SFT) data consists of many categories. The largest and most dominant of them are public and open-source. As these are English-only, we employed several established automatic techniques to gather more instruction data for SEA languages through synthetic means. For a small portion of the SFT data, we engaged native speakers to vet, verify, and modify SFT responses so that they adapt to the local cultural customs, norms, and laws. We also adopted safety tuning with data for each of these SEA countries, which helps to address many culturally and legally sensitive topics more appropriately. Such tuning data tends to be ignored by, or may even appear to conflict with, the safety-tuning data of other mainstream models. Therefore, we believe that our models are more local-friendly and abide by local rules to a higher degree. We conduct SFT with a relatively balanced mix of SFT data from different categories. We make use of the system prompt during training, as we found it helps induce a prior which conditions the model to a behavioral distribution that focuses on safety and usefulness.
Through rigorous pre-training enhancements and culturally tailored fine-tuning processes, SeaLLMs have demonstrated exceptional proficiency in language understanding and generation tasks, challenging the performance of dominant commercial players in SEA languages, especially non-Latin ones.
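As a rough illustration of what stage-wise control of the unsupervised and supervised data mixture could look like, the following is a minimal Python sketch. It is not the SeaLLMs training code: the stage names, the mixture weights, and the helpers `sample_batch` and `run_schedule` are hypothetical, introduced here only to show the idea of drawing each batch from several data pools according to weights that change from stage to stage.

```python
import random

# Hypothetical stage schedule. The weights are illustrative only and are
# NOT the values used to train SeaLLMs.
STAGES = [
    {"name": "stage_1", "weights": {"unsupervised": 0.9, "supervised": 0.1}},
    {"name": "stage_2", "weights": {"unsupervised": 0.6, "supervised": 0.4}},
]


def sample_batch(pools, weights, batch_size, rng):
    """Draw a batch whose data sources follow the given mixture weights."""
    sources = list(weights)
    probs = [weights[source] for source in sources]
    # Pick a source for every slot in the batch, proportionally to its weight.
    picks = rng.choices(sources, weights=probs, k=batch_size)
    # Then pick one example uniformly at random from the chosen source.
    return [(source, rng.choice(pools[source])) for source in picks]


def run_schedule(pools, stages, batch_size=8, seed=0):
    """Sample one batch per stage, each with that stage's own mixture."""
    rng = random.Random(seed)
    return {
        stage["name"]: sample_batch(pools, stage["weights"], batch_size, rng)
        for stage in stages
    }


if __name__ == "__main__":
    # Placeholder pools standing in for real text corpora.
    pools = {
        "unsupervised": ["free text A", "free text B", "free text C"],
        "supervised": ["instruction/response pair A", "instruction/response pair B"],
    }
    for name, batch in run_schedule(pools, STAGES).items():
        print(name, [source for source, _ in batch])
```

In a real pipeline the pools would be streamed corpora rather than in-memory lists, and the weights might be adjusted per language as well as per data type; the sketch only shows the sampling mechanism.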
The models' attunement to local norms and legal stipulations, validated by human evaluations, establishes SeaLLMs as not only a technical breakthrough but also a socially responsive innovation, poised to democratize access to high-quality AI language tools across linguistically diverse regions. This work lays a foundation for further research into language models that respect and uphold the rich tapestry of human languages and cultures, ultimately driving the AI community towards a more inclusive future.
One of the most reliable ways to compare chatbot models is peer comparison. With the help of native speakers, we built an instruction test set, called Sea-bench, that focuses on various aspects expected in a user-facing chatbot, namely: (1) task-solving (e.g., translation and comprehension), (2) math-reasoning (e.g., math and logical reasoning questions), (3) general-instruction (e.g., instructions in general domains), (4) natural-questions (e.g., questions about local context, often written informally), and (5) safety-related questions. The test set also covers all languages that we are concerned with. Candidate AI models' responses to the test set's instructions may be judged and compared by human evaluators, or by larger, more powerful commercialized AI models, to derive a reliable performance metric. Through this process, we demonstrate that our SeaLLM-13b model performs on par with or surpasses other open-source or private state-of-the-art models across many linguistic and writing tasks.
Infocomm, Artificial Intelligence