# daFPGASwitch or Simple Switch?

# daFPGASwitch "Original Design"

# Central Memory Unit (CMU)

(Demo with code)

# Scheduling algorithm: Doubly Round Robin

- Targeting fairness.
  - On average, each ingress gets an equal chance to pick first. (First RR)
  - On average, each virtual queue of an ingress gets an equal chance to be picked first. (Second RR)
- Takes 5 cycles (falls within the 8-cycle heartbeat)
  - One cycle for continuing an unfinished packet
  - One cycle for each ingress, deciding which one to pick (decided by the all\_comb module pick\_voq)

### MetaData

| Dest Port | Src Port | Length   | Start Time | End Time  |
|-----------|----------|----------|------------|-----------|
| (2 bits)  | (2 bits) | (6 bits) | (11 bits)  | (11 bits) |

### Packet Data

| Start Time | End Time  | Src MAC             | Data Payload    |
|------------|-----------|---------------------|-----------------|
| (4 bytes)  | (4 bytes) | (6 + 2 empty bytes) | Variable length |

# Simple Switch "Updated Design"

Simple Switch is the version loaded on the FPGA, but daFPGASwitch has more interesting features that we would like to present.

# Scheduling algorithm: Doubly Round Robin

- How to be biased -> give some ingress/egress more cycles
- We have a list that sets the priority of each egress. When each ingress makes its scheduling decision, it prioritizes sending the packets bound for higher-priority egresses.
- (Demo the code)

#### Register Allocation

- One control register
  - Lower 2 bits: 0 means reset, 1 means take input but do not output, 2 means take input while outputting
  - Third bit: scheduling policy, 1 bit (0 for RR, 1 for Priority)
  - 4~11: scheduling priority, 8 bits
- Four port registers: each port is allocated one register for packet data

#### How SW-HW talks

Software "polls" with ioread32, which generates a high "read" signal for the Avalon slave. For HW, the read signal is like an ack signal ("read" means that SW has already consumed the packet segment).
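The control register layout under Register Allocation can be made concrete with a small software-side model. This is only a sketch: the names `pack_ctrl`, `unpack_ctrl`, and the `MODE_*`/`POLICY_*` constants are hypothetical, and it assumes that "4~11" counts bits starting from 1, so the 8-bit priority field sits directly above the policy bit.

```python
# Hypothetical model of the control register described above.
# Assumption: bit positions are counted from 1, so
#   bits 1-2  -> mode            (mask 0x3,  shift 0)
#   bit  3    -> sched policy    (mask 0x1,  shift 2)
#   bits 4-11 -> sched priority  (mask 0xFF, shift 3)

MODE_RESET = 0         # reset
MODE_INPUT_ONLY = 1    # take input but do not output
MODE_INPUT_OUTPUT = 2  # take input while outputting

POLICY_RR = 0          # doubly round robin
POLICY_PRIORITY = 1    # priority based


def pack_ctrl(mode, policy, priority):
    """Build the control word from its three fields."""
    if mode not in (MODE_RESET, MODE_INPUT_ONLY, MODE_INPUT_OUTPUT):
        raise ValueError("mode must be 0, 1, or 2")
    if policy not in (POLICY_RR, POLICY_PRIORITY):
        raise ValueError("policy must be 0 or 1")
    if not 0 <= priority <= 0xFF:
        raise ValueError("priority must fit in 8 bits")
    return (priority << 3) | (policy << 2) | mode


def unpack_ctrl(word):
    """Split a control word back into (mode, policy, priority)."""
    return word & 0x3, (word >> 2) & 0x1, (word >> 3) & 0xFF
```

In a driver, a word built this way would be the value handed to iowrite32 for the control register; how the 8 priority bits are divided among the egress ports is not modeled here.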
Software "interrupts" with iowrite32, which generates a high "write" signal for the Avalon slave. For HW, the write signal is like an enable signal ("write" means that SW has already put the packet\_data on the writedata wire).

(A picture of the Avalon slave goes here.)

Figure 11. Read and Write Transfer with Fixed Wait-States at the Agent Interface

# Results (32 packets in total)

| Total Latency     | Priority based (egress 0 has highest priority) | Doubly RR |
|-------------------|------------------------------------------------|-----------|
| Even load         | 6.17                                           | 6.21      |
| Overload egress 0 | 4.5                                            | 4.9       |

Both policies obtain similar average performance. Keep in mind that the priority-based implementation guarantees low latency for egress 0 by compromising the performance of the other ports.

#### Takeaway:

- Implementing an algorithm in HW is different
- State machines are hard
- The manual is very helpful (Avalon bus, read signal length)
- Start by drawing a timing diagram: one cycle at a time.

### Demo
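As a software-only companion to the demo, the scheduling decision compared in the Results can be modeled in a few lines of Python. This is a behavioral sketch under stated assumptions, not the pick\_voq logic itself: the function name `schedule` and its arguments are hypothetical, each egress is granted to at most one ingress per round, and the cycle spent continuing an unfinished packet is left out.

```python
def schedule(voq_len, first, voq_ptr, priority=None):
    """Run one scheduling round (hypothetical behavioral model).

    voq_len[i][e]: packets queued at ingress i for egress e.
    first:         ingress that picks first this round (first RR).
    voq_ptr[i]:    VOQ where ingress i starts searching (second RR).
    priority:      optional list of egress ports, highest priority first;
                   when given, it replaces the per-ingress round robin.

    Returns (grants, next_first, next_voq_ptr), where grants maps
    ingress -> egress.
    """
    n = len(voq_len)
    taken = set()          # egress ports already granted this round
    grants = {}
    next_ptr = list(voq_ptr)

    for k in range(n):
        i = (first + k) % n  # first RR: rotate which ingress picks first
        if priority is not None:
            order = priority
        else:
            # second RR: start at this ingress's own VOQ pointer
            order = [(voq_ptr[i] + j) % n for j in range(n)]
        for e in order:
            if voq_len[i][e] > 0 and e not in taken:
                grants[i] = e
                taken.add(e)
                if priority is None:
                    next_ptr[i] = (e + 1) % n  # advance past the served VOQ
                break

    return grants, (first + 1) % n, next_ptr
```

Calling `schedule` repeatedly, feeding `next_first` and `next_voq_ptr` back in, rotates both pointers so that no ingress and no virtual queue is always served first. Passing a `priority` list instead makes every ingress try the highest-priority egress first, which is the bias described for the priority-based policy.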