Composable disaggregated data center infrastructure promises to change the way data centers for modern workloads are built. However, to fully realize the potential of new technologies such as CXL, the industry needs brand-new hardware. Recently, Samsung introduced its CXL Memory Module Box (CMM-B), a device that can house up to eight CXL Memory Module – DRAM (CMM-D) devices and add terabytes of memory over a PCIe/CXL interface.

Samsung's CXL Memory Module Box (CMM-B) is the first device of this type to accommodate up to eight 2 TB E3.S CMM-D memory modules, adding up to 16 TB of memory to as many as three modern servers with the appropriate connectors. As far as performance is concerned, the box offers up to 60 GB/s of bandwidth (which aligns with what a PCIe 5.0 x16 interface provides) and a latency of 596 ns.

From a pure performance point of view, one CXL Memory Module Box is slower than a dual-channel DDR5-4800 memory subsystem. Yet the unit is still considerably faster than even advanced SSDs. At the same time, it provides very decent capacity, which is often just what the doctor ordered for many applications.
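As a quick sanity check on that comparison, the theoretical peak figures work out as follows (a back-of-the-envelope sketch using standard PCIe 5.0 and JEDEC DDR5 numbers; the 60 GB/s figure is Samsung's own specification):

```python
# Back-of-the-envelope comparison of the bandwidth figures cited above.

# PCIe 5.0: 32 GT/s per lane, 128b/130b encoding, per direction.
pcie5_x16_gbps = 32 * (128 / 130) / 8 * 16   # ~63.0 GB/s theoretical peak

# Dual-channel DDR5-4800: 4800 MT/s x 8 bytes per transfer x 2 channels.
ddr5_dual_gbps = 4800 * 8 * 2 / 1000         # 76.8 GB/s theoretical peak

cmm_b_gbps = 60  # Samsung's quoted peak for the CMM-B

print(f"PCIe 5.0 x16 peak:           {pcie5_x16_gbps:.1f} GB/s")
print(f"DDR5-4800 dual-channel peak: {ddr5_dual_gbps:.1f} GB/s")
print(f"CMM-B quoted peak:           {cmm_b_gbps} GB/s")
```

The 60 GB/s quoted for the CMM-B sits just under the ~63 GB/s a PCIe 5.0 x16 link can deliver per direction, and below the 76.8 GB/s of a dual-channel DDR5-4800 subsystem, consistent with the claim above.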

The Samsung CMM-B is compatible with the CXL 1.1 and CXL 2.0 protocols. It consists of a rack-scale memory bank (CMM-B), several application hosts, Samsung Cognos management console software, and a top-of-rack (ToR) switch. The device was developed in close collaboration with Supermicro, so expect this server maker to offer the product first.

Samsung's CXL Memory Module Box is designed for applications that need a lot of memory, such as AI, data analytics, and in-memory databases, though not at all times. CMM-B allows the necessary memory to be dynamically allocated to a system when it needs it and then released for use by other machines. As a result, data center operators can save money on procuring expensive memory (16 TB of DRAM costs a lot), reduce power consumption, and add flexibility to their setups.

Source: Samsung


  • back2future - Friday, April 5, 2024 - link

    [ If a memory access request is for blocks larger than ~32 kB, then data transfer time (maybe independent from CPU cycles, with additional latency from checking cache coherency?) is already higher than the initial data request latency?
    Another number seen mentioned is a typical addition of ~200 ns for the CXL memory controller ]
  • The Von Matrices - Friday, April 5, 2024 - link

    There is an insightful article on SemiAnalysis regarding this concern. The main part of the article is behind a paywall but the free section gets their general arguments across well.
  • Elstar - Sunday, April 7, 2024 - link

    I think when your dataset is that big, you’ll take any improvement in latency/throughput you can get, so CXL memory is a huge improvement over, say, a highly parallel NVMe flash array, or just not tackling a problem at all because current hardware is too slow.
  • Dolda2000 - Saturday, April 13, 2024 - link

    Perhaps I'm mistaken, but my impression is that the main (not only) purpose for CXL is configurable infrastructure rather than capacity.
  • thomasjkenney - Thursday, April 4, 2024 - link

    In a way, this is a kind of reversion to the dawn of electric computing. Each device or module had a separate rack, or even room, and all were cabled together with spaghetti. I remember my mentor teaching me how to wire-wrap the back of a UniBus backplane...Jimminy Christmas!
  • back2future - Friday, April 5, 2024 - link

    [ read, that CXL (version?, 3.x, theoretically saturates a PCIe7.0 x16 connection) is limited to ~4"(?), through merger with Gen-Z there's ethernet support (CXL version?, multi ~100Gbps, reduced latency cmprd to RDMA?, optimized ~400ns) up to few tens of meters(?) ]
  • mode_13h - Monday, April 15, 2024 - link

    It almost feels demeaning how they put "memory module box" on the front, as if anyone who has any business touching it might not understand a more conventional name!