
SystemC Cosimulation Walk-Through

Prerequisites

Before going through this lab, it’s highly recommended that you read the first tutorial in the Getting started with TLM 2.0 series. It is designed to help you get familiar with important TLM 2.0 concepts such as sockets, the generic payload, and blocking transport, all of which are used throughout this lab.

Also consult the SystemC language reference manual if you want more information on how TLM 2.0 features should be used.

The Cosimulation Environment

The cosimulation environment is split into two separate processes running on a host. The first is the QEMU process, which simulates the ARM processor of the Zynq-7020 SoC. The second is the SystemC simulation kernel, which simulates the Zynq-7020 SoC’s programmable fabric. Looking at hw/zynq_demo.cc, we can see all of the components present on the SystemC side instantiated in the top module.

Top(sc_module_name name, const char *sk_descr, sc_time quantum) :
 bus("bus"),
 zynq("zynq", sk_descr),
 debug("debug"),
 rst("rst")

Inside the top module’s initializer list we can see that the SystemC side is composed of 1) a zynq module, 2) a bus used for communication between models, 3) a debug device used as an example “IP” in the FPGA fabric, and 4) a reset signal used to coordinate system resets between QEMU and the zynq module in the SystemC process. For this lab we’re going to focus on three modules: bus, zynq, and debugdev.

The zynq module acts as a representative of the QEMU process inside the SystemC process. It receives communication requests from QEMU over regular Unix sockets and converts them into TLM transactions that can be consumed by other devices on the SystemC side. The real Zynq-7020 SoC has several AXI general purpose master and slave interfaces, as well as other special purpose interfaces like HP and ACP. For the purpose of this lab we’re going to focus on only one of the general purpose master interfaces, m_axi_gp[0]. This interface is modeled using tlm_utils::simple_initiator_socket and is connected to a target socket on the bus with a bind call. The bind call is what allows the zynq module to communicate with the different devices on the bus.

zynq.m_axi_gp[0]->bind(*(bus.t_sk[0]));

This raises the question: where does QEMU fit into all of this? At a high level, when the QEMU simulation encounters a load or store to an address outside the simulated DDR range, it constructs a packet describing the requested operation and sends it to the SystemC process. On the SystemC side, the zynq module receives that packet and converts it into a TLM transaction. It then forwards the transaction to the bus through the m_axi_gp initiator socket so that it can be routed by the bus and consumed by other IPs modeled on the SystemC side.

So how does the bus actually route TLM transactions to the devices connected to it? Using the bus.memmap call we can map address ranges to any device’s target socket(s). This allows transactions received by the bus to be routed to the right device depending on the transaction’s destination address. The example below maps the address range 0x48000000-0x480000FF to the debug device’s target socket.

bus.memmap(0x48000000ULL, 0x100 - 1,
		ADDRMODE_RELATIVE, -1, debug.socket);

So if the QEMU simulation does a load or store to an address within the 0x48000000-0x480000FF range, that operation is converted into a TLM transaction that gets sent to the debug device. How does the debug device actually process the transaction? In hw/debugdev.cc the debugdev constructor registers a callback to handle transactions received by its socket using the register_b_transport API call. The registered b_transport function is responsible for handling transactions sent to the debugdev by the bus.

Below is an illustration of the entire system with all of its connections highlighted.

[Figure: cosimulation environment]

Anatomy of the debugdev

The debugdev is a simple SystemC TLM device designed to introduce TLM concepts within the SystemC+QEMU cosimulation environment. At the heart of the debugdev is the b_transport function, which processes TLM transactions. b_transport begins by unpacking the key parts of the transaction: the command type (read or write), the relative address accessed within the debugdev, and a data pointer used to copy back the data that needs to be returned to the initiator of the transaction (in the case of a read).

void debugdev::b_transport(tlm::tlm_generic_payload& trans, sc_time& delay)
{
	tlm::tlm_command cmd = trans.get_command();
	sc_dt::uint64 addr = trans.get_address();
	unsigned char *data = trans.get_data_ptr();
	unsigned int len = trans.get_data_length();
	unsigned char *byt = trans.get_byte_enable_ptr();
	unsigned int wid = trans.get_streaming_width();
	...

After unpacking the different pieces of the transaction, b_transport branches based on whether the command is a read or a write. For a read, a switch statement checks the relative address and populates the data buffer with a different value depending on the requested address. For example, if the relative address is 0x0, the current simulation time, converted to nanoseconds and stored in a 32-bit integer, is copied into the transaction.

	if (trans.get_command() == tlm::TLM_READ_COMMAND) {
		sc_time now = sc_time_stamp() + delay;
		uint32_t v = 0;
		switch (addr) {
			case 0:
				v = now.to_seconds() * 1000 * 1000 * 1000;
				break;
			...
		}
		memcpy(data, &v, len);
		...
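
The read path above multiplies to_seconds() by 10^9, so the value stored in the 32-bit variable is a nanosecond count, and a 32-bit counter of nanoseconds covers only about 4.29 seconds before it no longer fits. The following standalone sketch illustrates that arithmetic outside of SystemC. The helper names seconds_to_ns32 and fill_read_data are hypothetical, and going through a 64-bit integer and clamping the copy length are choices made for this sketch rather than something the original code does.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Convert a time in seconds to a 32-bit nanosecond count.
// Converting to uint64_t first keeps the narrowing well defined:
// the result is the nanosecond count modulo 2^32.
uint32_t seconds_to_ns32(double seconds) {
    uint64_t ns = static_cast<uint64_t>(seconds * 1000 * 1000 * 1000);
    return static_cast<uint32_t>(ns);
}

// Copy the register value into the payload buffer, never reading past
// the 4 bytes that the value actually occupies.
void fill_read_data(unsigned char* data, unsigned int len, uint32_t value) {
    std::size_t n = len < sizeof(value) ? len : sizeof(value);
    std::memcpy(data, &value, n);
}
```

For instance, 1.5 seconds yields 1500000000, while 5 seconds yields 5000000000 modulo 2^32, so software reading this register should expect the value to wrap during longer simulations.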

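If the transaction is a write to relative address 0x0, for example, the SystemC simulation prints the contents of the data buffer to standard output, together with the current simulation time and the time elapsed since the previous timestamp.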

	else if (cmd == tlm::TLM_WRITE_COMMAND) {
		static sc_time old_ts = SC_ZERO_TIME, now, diff;
		now = sc_time_stamp() + delay;
		diff = now - old_ts;
		switch (addr) {
			case 0:
				cout << "TRACE: " << " "
				     << hex << * (uint32_t *) data
				     << " " << now << " diff=" << diff << "\n";
				break;
			...

The switch statement can be used to create memory-mapped registers within the debugdev. Reads and writes to these registers can then have side effects, just as with memory-mapped registers in real hardware.
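
As an illustration of registers with side effects, here is a small sketch in plain C++. The register layout (REG_SCRATCH and REG_WRITES) and the ToyRegs class are invented for this example and do not correspond to registers in the real debugdev. The point is only to show the pattern of switching on the relative address in both the read and the write path.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical register layout, invented for this sketch.
enum : uint64_t {
    REG_SCRATCH = 0x0,  // plain storage: reads return the last value written
    REG_WRITES  = 0x4,  // reads return the scratch write count; any write clears it
};

class ToyRegs {
public:
    uint32_t read(uint64_t addr) const {
        switch (addr) {
        case REG_SCRATCH: return scratch_;
        case REG_WRITES:  return writes_;
        default:          return 0;  // unmapped offsets read as zero
        }
    }

    void write(uint64_t addr, uint32_t value) {
        switch (addr) {
        case REG_SCRATCH:
            scratch_ = value;
            ++writes_;    // side effect: count the access
            break;
        case REG_WRITES:
            writes_ = 0;  // side effect: the written value is ignored
            break;
        default:
            break;        // writes to unmapped offsets are dropped
        }
    }

private:
    uint32_t scratch_ = 0;
    uint32_t writes_ = 0;
};
```

In a TLM device, read and write would be called from the two branches of b_transport, with the value copied to or from the transaction’s data pointer.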

Talking to the debugdev from userspace inside QEMU

We can talk to any memory-mapped device or register from userspace within QEMU using the mmap system call. In fact, we can create a simple userspace driver using mmap to test the debugdev’s functionality. Note that mapping physical memory through /dev/mem requires root privileges. Thankfully, the only user available in the QEMU guest is root, so we don’t have to deal with sudo.

Taking a look at sw/debugread.c, we can see that the program first defines the base address of the debugdev with #define SYSTEMC_DEVICE_ADDR (0x48000000). It then opens a file descriptor to /dev/mem with fd=open("/dev/mem",O_RDWR), which allows the mmap system call to map any physical address we want into the userspace process’s virtual memory. Finally, the mmap system call is performed on the desired base address with pDev=mmap(..., fd,(SYSTEMC_DEVICE_ADDR ...));. (A full explanation of the mmap system call’s arguments is available through man mmap on most Linux-based systems.) The mmap system call returns a pointer that maps to the debugdev’s base physical address. Using that pointer we can initiate read and write transactions at any relative address within the debugdev. debugread initiates read transactions to relative address 0x0 inside the printf call printf("TIMER = %u\n", *((unsigned int*)pDev)). Since the printf is in a loop, each iteration creates a TLM transaction that gets sent to the debugdev’s b_transport function.
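
The open/mmap sequence described above can be sketched as a small reusable helper. This is not the code in sw/debugread.c. The Mapping struct and the map_device and unmap_device functions are hypothetical names introduced here, and the page-alignment handling is an addition for illustration (mmap requires a page-aligned file offset, which 0x48000000 already satisfies).

```cpp
#include <cassert>
#include <cstdint>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

// A mapped window plus a pointer to the requested address inside it.
struct Mapping {
    void* map_base = MAP_FAILED;        // page-aligned start returned by mmap
    size_t map_len = 0;                 // length passed to mmap, needed for munmap
    volatile uint32_t* regs = nullptr;  // points at the requested address
};

// Map `len` bytes starting at byte offset `addr` of `path`.
// The offset is rounded down to a page boundary for mmap and the
// remainder is added back to the returned pointer.
Mapping map_device(const char* path, uint64_t addr, size_t len) {
    Mapping m;
    int fd = open(path, O_RDWR);
    if (fd < 0) return m;

    const uint64_t page = static_cast<uint64_t>(sysconf(_SC_PAGESIZE));
    const uint64_t page_base = addr & ~(page - 1);
    const uint64_t page_off = addr - page_base;

    m.map_len = static_cast<size_t>(page_off + len);
    m.map_base = mmap(nullptr, m.map_len, PROT_READ | PROT_WRITE, MAP_SHARED,
                      fd, static_cast<off_t>(page_base));
    close(fd);  // the mapping stays valid after the descriptor is closed

    if (m.map_base != MAP_FAILED) {
        m.regs = reinterpret_cast<volatile uint32_t*>(
            static_cast<unsigned char*>(m.map_base) + page_off);
    }
    return m;
}

// Release the mapping and reset the struct to its empty state.
void unmap_device(Mapping& m) {
    if (m.map_base != MAP_FAILED) munmap(m.map_base, m.map_len);
    m = Mapping();
}
```

Inside the QEMU guest, a call such as map_device("/dev/mem", 0x48000000, 0x100) would be expected to give a regs pointer whose first word corresponds to relative address 0x0 of the debugdev, so reading *regs triggers a read transaction. The volatile qualifier keeps the compiler from caching or eliding those accesses.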