Neuchips Driving AI Innovations in Inferencing
By Stephen Las Marias, EE Times Asia
TAIPEI, April 17, 2024 /PRNewswire/ -- The global semiconductor market experienced a challenging year in 2023. According to the Semiconductor Industry Association (SIA), worldwide chip sales reached $526.8 billion in 2023, down by 8.2% year-on-year (YoY).
Apart from the cyclicality of the IC industry, the memory sector's significant decline contributed to this weak performance. According to market analyst Gartner Inc., revenue for memory products dropped by 37% last year—the biggest decline of all the segments in the semiconductor market.
Nevertheless, there were bright spots in the second half of the year, led by the AI sector. The growth of AI-based applications in many sectors, including data centers, edge infrastructure, and endpoint devices, has set off a new wave of AI in 2023.
According to market analyst Counterpoint Technology Market Research, AI provided positive news for the semiconductor industry, emerging as a key content and revenue driver, especially in the second half of 2023.
AI, in fact, is expected to lead the semiconductor recovery in 2024. According to Gartner, AI chips represented a $53.4 billion revenue opportunity for the semiconductor industry in 2023, up by about 21% YoY. It projects continued double-digit growth for the sector, reaching $67.1 billion in 2024 and more than doubling the size of 2023's market to $119.4 billion by 2027.
"There are a lot of opportunities in the AI space," says Ken Lau, CEO of AI chip startup Neuchips. "If you look at any public data, you will see that AI, in particular, generative AI [GenAI], could be a trillion dollar market by 2030 timeframe. A lot of money is actually being spent on training today, but the later part of the decade will see investments going to inferencing."
Lau notes that they are seeing different usage models on inferencing going forward. "After you train the data, you have inferencing to help you do work better. For example, different companies are going to use AI to augment their chat bots or customer service capabilities. Even the way people do speech for products. For instance, a spokesperson for a particular brand can use an AI to totally go for it. AI can train the way you dress and everything else. When consumers ask questions, the spokesperson will answer describing a brand, and when customers click the brand, they will be driven to a website where they can buy the product," he explains. "I think there are ways that we can't even imagine going forward. The opportunities are limitless for AI. That's how I see it. And a big part of that is going to be inferencing, not just training."
Focus on inferencing
Established in 2019, Neuchips set its sights on inferencing, specifically recommendation engines, because it believes inferencing will play a vital role in the future.
One rationale behind this is that a lot of datacenters use a recommendation engine. "When you buy parts, or whatever product online, they recommend something. For example, when you buy a tennis racket from this brand, it will also recommend another brand," says Lau.
So, Neuchips chose the recommendation engine as its target, used FPGAs to build a prototype and prove that the design works, and then designed the chip.
The inference chip, N3000, came out in 2022 and performed well, proving to be 1.7x better than competing products on the market in terms of performance per watt, based on MLPerf 3.0 benchmarking.
"When we built this chip, we have the recommendation engine in mind. We built it for the purpose of recommendation," explains Lau. "But when GenAI turned a corner, we tried it on our chip, and we were able to reproduce it. That's because the memory subsystems are optimized for recommendation engine. The same memory subsystem can be applied to GenAI as well. When we did the demo at the AI Hardware Summit in the US, and also SC23, we are one of the not so many AI companies to showcase the demo case by using our own chip on ChatBot to let users try on."
At the recent EE Awards Asia 2023, Neuchips' N3000 was a recipient of the "Best AI Chip" award. "It shows the level of execution that we can do here in Taiwan," says Lau. "If you look at large companies doing chip design today, they are not doing core logics. They are using smaller chips. We are one of the few companies that employ 7nm doing compute. That is why it is important. And we were able to achieve a performance for recommendation that is 1.7x better than others. There's something to be said about that."
Lau proudly says the device worked on the first tape-out. "Other companies can do multiple cuts to make the chips right. For our N3000 product, we only have one chance because we are just a startup—we have no money to waste. So, we did it in one chance and it worked. I think it is a significant achievement and reflects the level of execution that we have."
Industry challenges
Despite optimistic estimates, the AI semiconductor segment continues to face a multitude of challenges, depending on customers and their applications.
"There are companies out there that want to integrate AI into their portfolio of product offerings or include in their service," explains Lau. "One of the challenges here is the software integration part. And how will you train the internal data? For example, if I am a hospital, all the data sets should be private. I cannot go to cloud. How can I use those data and train them so that the doctors can have access to them in a more meaningful way?"
Training on that data at the enterprise level could be key, according to Lau, because a hospital, for example, would not employ a software engineer just to train on its data.
"They will need that kind of software service and hardware in their own enterprise going forward, because their data is private," notes Lau. In line with this, he sees the enterprise segment picking up.
Another challenge that continues to plague the chip industry is power. And AI chips—with their high compute power—cannot escape this issue.
"It depends on what kind of edge device you put it in," says Lau. "First of all, our chips can go down to around 25W to 30W. The standard is around 55W, but we were able to compress it into a dual M.2 form factor, so they can go down to 25-30W. With that in mind, we can put it into a PC without a problem. That only requires a passive heatsink and a fan, for example. But that may still be a little bit big. But for laptops, we are not going to put it in there, to be honest, because 20W is pretty high for a laptop to handle. But it doesn't preclude people from building docking stations that can be attached to a laptop as GenAI device. Those are the things that we can do on a PC."
Meanwhile, to help customers address their challenges, Neuchips approaches them from two angles: hardware and software.
"One, we provide the hardware. When you are a data center, you are not going to have high-power connections," says Lau. "Our chips are low power, and we are able to fit in the smallest of places. Our products can fit into 1U servers, a desktop, with our different form factor card. Second, we also provide all software stacks, SDKs [software development kits], as well as drivers and everything else."
Neuchips can also offer customers data integration and training services. "Training using their own data, and giving it back to them, and then providing hardware, will help them become more efficient. This will create a win-win situation for us and the customer," says Lau.
Future plans
Lau says training and edge applications will be the main drivers for AI in the future.
"But, if you look at all the news today, the AI PC, I believe some of the newer applications providers will come up with new ways to do GenAI inferencing," he says. "We are in an uncharted area, but we expect this to grow. But we also need the applications ecosystem to grow at the same time."
Moving forward, Neuchips will focus on different form factors. Apart from its dual M.2 form factor device, the company also has another module that fits standard PCI Express slots, for applications in PCs or low-end workstations.
SOURCE EETimes Taiwan