3.4 Application Example: A Multi-Channel High-Speed Synchronous Transmission System Based on SERDES
3.4.1 System Scheme
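Array signal processing (for example, the common digital beamforming network) involves many array elements and a wide bandwidth, so the volume of data to be transmitted and processed is large. A high-speed data transmission system based on ChipSync technology is a commonly used transmission scheme. The complete project code for the example in this section is available from the website.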
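The high-speed digital signals are transmitted over LVDS, and the I and Q components of each signal channel each occupy one LVDS pair. Figure 3-19 shows the structure of the data transmission implementation. The transmitter mainly serializes the data for output. The receiver uses the training sequence and dynamic phase alignment to move the sampling instant of the serial data to the optimal sampling point, performs serial-to-parallel conversion, adjusts the word order according to the synchronization training word, and stores the deserialized data in a dual-port RAM, from which the receiving side then reads it. To synchronize multiple channels, a frame synchronization word is added to the data stored in the RAM. The principle is as follows: once the receiver detects the frame synchronization word, it resets the RAM read and write addresses; the data that follows is stored and read in order, which achieves multi-channel high-speed synchronous transmission.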
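Figure 3-19 Multi-channel high-speed transmission and synchronous reception
A 320 Mbps signal was transmitted using the scheme in Figure 3-19. Figure 3-20 shows the stability test of the data transmission, in which the transmitter sends a sine wave.
As Figure 3-20 shows, although the receive digital beamforming network carries a large data volume at a high transmission rate, a design for reliability allows the high-rate data to be transmitted correctly and stably. The implementation is analyzed in detail below.
Figure 3-20 Received sine signal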
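3.4.2 Transmit Module
The transmitter first sends the training word and switches to the actual signal once synchronization is complete. The transmit module mainly serializes the data for output and is relatively simple. The implementation code is as follows.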
--------------------------------------------------------------------
-- Transmit module: main code
--------------------------------------------------------------------
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
library UNISIM;
use UNISIM.VComponents.all;

entity TestOserdes is
  port(
    datain    : in  std_logic_vector(9 downto 0);
    clk       : in  std_logic;
    clkdiv    : in  std_logic;
    rst       : in  std_logic;  -- '0' enables operation, '1' disables it
    rst_test  : in  std_logic;  -- '1' for test and '0' for work
    dataout_p : out std_logic;
    dataout_n : out std_logic
  );
end TestOserdes;

architecture Behavioral of TestOserdes is
  signal shiftdata1  : std_logic;
  signal shiftdata2  : std_logic;
  signal dataout     : std_logic;
  signal dataselect  : std_logic_vector(9 downto 0);
  signal datain_test : std_logic_vector(9 downto 0);
begin

  -- 10-bit test counter; rst_test is asynchronous, so it is listed
  -- in the sensitivity list
  tb_test: process(clkdiv, rst_test)
  begin
    if (rst_test = '1') then
      datain_test <= "0000000000";
    elsif (clkdiv'event and clkdiv = '1') then
      datain_test <= datain_test + 1;
    end if;
  end process;

  u_iostandarddata: OBUFDS
    generic map (
      IOSTANDARD => "DEFAULT")
    port map (
      O  => dataout_p,
      OB => dataout_n,
      I  => dataout
    );

  u_oserdes1: OSERDES
    generic map (
      DATA_RATE_OQ   => "DDR",    -- Specify data rate to "DDR" or "SDR"
      DATA_RATE_TQ   => "DDR",    -- Specify data rate to "DDR", "SDR", or "BUF"
      DATA_WIDTH     => 10,       -- Specify data width - For DDR: 4,6,8, or 10
                                  -- For SDR or BUF: 2,3,4,5,6,7, or 8
      INIT_OQ        => '0',      -- INIT for Q1 register - '1' or '0'
      INIT_TQ        => '0',      -- INIT for Q2 register - '1' or '0'
      SERDES_MODE    => "MASTER", -- Set SERDES mode to "MASTER" or "SLAVE"
      SRVAL_OQ       => '0',      -- Define Q1 output value upon SR assertion - '1' or '0'
      SRVAL_TQ       => '0',      -- Define Q1 output value upon SR assertion - '1' or '0'
      TRISTATE_WIDTH => 2)        -- Specify parallel to serial converter width
                                  -- When DATA_RATE_TQ = DDR: 2 or 4
                                  -- When DATA_RATE_TQ = SDR or BUF: 1
    port map (
      OQ        => dataout,       -- 1-bit output
      SHIFTOUT1 => open,          -- 1-bit data expansion output
      SHIFTOUT2 => open,          -- 1-bit data expansion output
      TQ        => open,          -- 1-bit 3-state control output
      CLK       => clk,           -- 1-bit clock input
      CLKDIV    => clkdiv,        -- 1-bit divided clock input
      D1        => dataselect(9), -- 1-bit parallel data input
      D2        => dataselect(8), -- 1-bit parallel data input
      D3        => dataselect(7), -- 1-bit parallel data input
      D4        => dataselect(6), -- 1-bit parallel data input
      D5        => dataselect(5), -- 1-bit parallel data input
      D6        => dataselect(4), -- 1-bit parallel data input
      OCE       => '1',           -- 1-bit clock enable input
      REV       => '0',           -- Must be tied to logic zero
      SHIFTIN1  => shiftdata1,    -- 1-bit data expansion input
      SHIFTIN2  => shiftdata2,    -- 1-bit data expansion input
      SR        => rst,           -- 1-bit set/reset input
      T1        => '0',           -- 1-bit parallel 3-state input
      T2        => '0',           -- 1-bit parallel 3-state input
      T3        => '0',           -- 1-bit parallel 3-state input
      T4        => '0',           -- 1-bit parallel 3-state input
      TCE       => '0'            -- 1-bit 3-state signal clock enable input
    );

  u_oserdes2: OSERDES
    generic map (
      DATA_RATE_OQ   => "DDR",    -- Specify data rate to "DDR" or "SDR"
      DATA_RATE_TQ   => "DDR",    -- Specify data rate to "DDR", "SDR", or "BUF"
      DATA_WIDTH     => 10,       -- Specify data width - For DDR: 4,6,8, or 10
                                  -- For SDR or BUF: 2,3,4,5,6,7, or 8
      INIT_OQ        => '0',      -- INIT for Q1 register - '1' or '0'
      INIT_TQ        => '0',      -- INIT for Q2 register - '1' or '0'
      SERDES_MODE    => "SLAVE",  -- Set SERDES mode to "MASTER" or "SLAVE"
      SRVAL_OQ       => '0',      -- Define Q1 output value upon SR assertion - '1' or '0'
      SRVAL_TQ       => '0',      -- Define Q1 output value upon SR assertion - '1' or '0'
      TRISTATE_WIDTH => 2)        -- Specify parallel to serial converter width
                                  -- When DATA_RATE_TQ = DDR: 2 or 4
                                  -- When DATA_RATE_TQ = SDR or BUF: 1
    port map (
      OQ        => open,          -- 1-bit output
      SHIFTOUT1 => shiftdata1,    -- 1-bit data expansion output
      SHIFTOUT2 => shiftdata2,    -- 1-bit data expansion output
      TQ        => open,          -- 1-bit 3-state control output
      CLK       => clk,           -- 1-bit clock input
      CLKDIV    => clkdiv,        -- 1-bit divided clock input
      D1        => '0',           -- 1-bit parallel data input
      D2        => '0',           -- 1-bit parallel data input
      D3        => dataselect(3), -- 1-bit parallel data input
      D4        => dataselect(2), -- 1-bit parallel data input
      D5        => dataselect(1), -- 1-bit parallel data input
      D6        => dataselect(0), -- 1-bit parallel data input
      OCE       => '1',           -- 1-bit clock enable input
      REV       => '0',           -- Must be tied to logic zero
      SHIFTIN1  => '0',           -- 1-bit data expansion input
      SHIFTIN2  => '0',           -- 1-bit data expansion input
      SR        => rst,           -- 1-bit set/reset input
      T1        => '0',           -- 1-bit parallel 3-state input
      T2        => '0',           -- 1-bit parallel 3-state input
      T3        => '0',           -- 1-bit parallel 3-state input
      T4        => '0',           -- 1-bit parallel 3-state input
      TCE       => '0'            -- 1-bit 3-state signal clock enable input
    );
  -- end of oserdes

  tbselect_process: process(clkdiv)
  begin
    if (clkdiv'event and clkdiv = '1') then
      if (rst_test = '1') then
        dataselect <= "0000011111";  -- synchronization training word
      else
        dataselect <= datain_test;   -- actual signal to transmit (here the test counter)
      end if;
    end if;
  end process;

end Behavioral;
3.4.3 Receive Module
1. Bit Synchronization Module
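As an example, if the FPGA data bus at the pins is 16 bits wide and the FPGA internal bus is 64 bits wide, the serial data arriving on each pin is converted into a 4-bit-wide internal parallel format. During DPA training, the transmitter sends the training sequence "0000 0000 0011 1111 1111" over 5 cycles and repeats it with a period of 5 cycles.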
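In this example the data width is 10 bits, and parallel-to-serial conversion is implemented in cascaded (master/slave) mode. "0000011111" is defined as the sync word, the fixed pattern that the deserialized output of every data line must match. ChipSync at the receiver comprises two modules, Bit Alignment and Word Alignment, whose relationship is shown in Figure 3-17. The Bit Alignment module drives the dlyce and dlyinc inputs of the IDELAY hard block to apply a precise line delay to each data line, so that the sampling clock edge falls exactly at the center of that signal's data window. Once Bit Alignment has finished on a data channel, it sends a ready signal to that channel's Word Alignment module, which then uses the BITSLIP function to shift the word until the parallel output of the signal line matches the sync word "0000011111". The parallel outputs of all signal lines at the receiver are then synchronized. At this point the receiver has completed DPA, and valid data can be sent between the two FPGAs.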
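The main function of the Bit Alignment module is to apply a precise line delay to the incoming serial stream so that the sampling clock edge ends up at the center of the sampled data window. The line delay that each data stream accumulates along its transmission path is unknown. All that is known is that sampling the periodic training sequence yields one of the parallel words "0000011111", "1000001111", "1100000111", "1110000011", "1111000001", "1111100000", "0111110000", "0011111000", "0001111100" and "0000111110". The parallel output can never be all "0" or all "1", so every parallel word contains both "0" and "1", and each boundary between "0" and "1" corresponds to a level transition in the serial stream. The idea behind bit alignment is that increasing the delay necessarily moves the position of the level transitions in the received serial stream. The interval between two successive transitions is the receive time window, and the midpoint of that window is the optimal sampling point. The Bit Alignment module proceeds in the following steps.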
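(1) First find a level transition in the serial data stream. To do so, drive dlyce and dlyinc to increase the delay of the signal line by one tap, then compare the 10-bit deserialized parallel word with the previously received word. If the word has changed, the sampling point lies on a "0"/"1" transition of the serial stream, that is, at the left edge of the receive data window. Record the current delay tap count as LeftEdge.
(2) Keep increasing the delay of the signal line through dlyce and dlyinc until the right edge of the receive data window is found, and record the current delay tap count as RightEdge.
(3) The optimal sampling position is midway between the delay tap counts at the left and right edges of the receive data window. Drive dlyce and dlyinc to decrement the delay step by step until the IDELAY tap setting equals LeftEdge + (RightEdge - LeftEdge)/2. Figure 3-21 shows the bit alignment flow.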
Figure 3-21 Bit Alignment flowchart
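The three steps above can be sketched as a small state machine. This is a minimal sketch that follows the two-edge search described in the text, not the book's listing. The entity name `bit_align_sketch`, the generics `MAX_TAP` and `SETTLE`, and the `FAILED` state are hypothetical names introduced for illustration. The sketch assumes that the delay line is reset to tap 0 together with `rst`, and that the deserialized word is stable `SETTLE` divided-clock cycles after a tap step.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity bit_align_sketch is
  generic (
    MAX_TAP : natural := 63;  -- assumed number of usable delay taps
    SETTLE  : natural := 7    -- assumed clkdiv cycles to wait after a tap step
  );
  port (
    clkdiv  : in  std_logic;
    rst     : in  std_logic;                     -- synchronous, active high
    word_in : in  std_logic_vector(9 downto 0);  -- deserialized training word
    dlyce   : out std_logic;                     -- delay step enable
    dlyinc  : out std_logic;                     -- '1' = increment, '0' = decrement
    ready   : out std_logic                      -- '1' once the centre tap is set
  );
end entity;

architecture rtl of bit_align_sketch is
  type state_t is (STEP_UP, SETTLE_WAIT, CHECK, STEP_DOWN, DONE, FAILED);
  signal state      : state_t := STEP_UP;
  signal tap        : natural range 0 to MAX_TAP := 0;
  signal left_edge  : natural range 0 to MAX_TAP := 0;
  signal target     : natural range 0 to MAX_TAP := 0;
  signal left_found : boolean := false;
  signal wait_cnt   : natural range 0 to SETTLE := 0;
  signal prev_word  : std_logic_vector(9 downto 0) := (others => '0');
begin
  process (clkdiv)
  begin
    if rising_edge(clkdiv) then
      -- defaults: no delay step, not ready
      dlyce  <= '0';
      dlyinc <= '0';
      ready  <= '0';
      if rst = '1' then
        state      <= STEP_UP;
        tap        <= 0;
        left_found <= false;
      else
        case state is
          when STEP_UP =>            -- remember the word, then add one tap
            if tap = MAX_TAP then
              state <= FAILED;       -- ran out of taps before both edges were seen
            else
              prev_word <= word_in;
              dlyce     <= '1';
              dlyinc    <= '1';
              tap       <= tap + 1;
              wait_cnt  <= 0;
              state     <= SETTLE_WAIT;
            end if;
          when SETTLE_WAIT =>        -- let the deserializer output settle
            if wait_cnt = SETTLE then
              state <= CHECK;
            else
              wait_cnt <= wait_cnt + 1;
            end if;
          when CHECK =>
            if word_in = prev_word then
              state <= STEP_UP;      -- no transition crossed yet
            elsif not left_found then
              left_found <= true;    -- step (1): LeftEdge
              left_edge  <= tap;
              state      <= STEP_UP;
            else                     -- step (2): RightEdge is the current tap
              target <= left_edge + (tap - left_edge) / 2;
              state  <= STEP_DOWN;
            end if;
          when STEP_DOWN =>          -- step (3): decrement to the midpoint
            if tap > target then
              dlyce <= '1';          -- dlyinc stays '0', so the delay decreases
              tap   <= tap - 1;
            else
              state <= DONE;
            end if;
          when DONE =>
            ready <= '1';
          when FAILED =>
            null;                    -- ready stays low; recovery is left open
        end case;
      end if;
    end if;
  end process;
end architecture;
```

The sketch treats any change in the parallel word as an edge crossing and ignores sampling jitter near an edge; a practical design would likely require several consistent samples before accepting an edge.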
2. Word Synchronization Module
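When bit alignment is complete, the Bit Alignment module uses its ready signal to tell the Word Alignment module that the sampling clock is now at the center of the data window of each data line and that the parallel output data can be word-aligned. During DPA the transmitter sends the training word "0000011111". After bit alignment, the 10-bit parallel word produced from each serial stream is not necessarily "0000011111"; the receiver output is one of "0000011111", "1000001111", "1100000111", "1110000011", "1111000001", "1111100000", "0111110000", "0011111000", "0001111100" and "0000111110". The output data therefore has to be shifted, which is word synchronization (Word Alignment).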
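The ten candidate words listed above are the cyclic right rotations of the training word. A minimal simulation-only sketch that enumerates them is given here; it assumes `ieee.numeric_std` is available, and the entity name `training_word_rotations` and the helper `to_bitstring` are hypothetical names that are not part of the book's project.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Simulation-only sketch (not synthesizable): prints the ten words a
-- 1:10 deserializer can produce for the repeating training word,
-- one for each possible word boundary.
entity training_word_rotations is
end entity;

architecture sim of training_word_rotations is
  constant TRAINING_WORD : unsigned(9 downto 0) := "0000011111";

  -- Render a vector as a string of '0'/'1' characters, MSB first.
  function to_bitstring(v : unsigned) return string is
    variable s : string(1 to v'length);
    variable i : positive := 1;
  begin
    for k in v'range loop
      if v(k) = '1' then
        s(i) := '1';
      else
        s(i) := '0';
      end if;
      i := i + 1;
    end loop;
    return s;
  end function;
begin
  process
    variable w : unsigned(9 downto 0);
  begin
    for shift in 0 to 9 loop
      w := rotate_right(TRAINING_WORD, shift);
      -- every rotation still contains both '0' and '1'
      assert w /= "0000000000" and w /= "1111111111"
        report "rotation lost a level transition" severity failure;
      report "shift " & integer'image(shift) & " -> " & to_bitstring(w)
        severity note;
    end loop;
    wait;  -- stop the process after one pass
  end process;
end architecture;
```

Because only the word with zero rotation equals the sync word, comparing against "0000011111" is enough to tell whether the word boundary is correct.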
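The purpose of Word Alignment is to use BITSLIP operations to make the deserialized parallel output equal the sync word "0000011111", so that the receiver stays synchronized with the transmitter and the received data is valid. Word Alignment compares the parallel output with "0000011111". If they are equal, the signal line needs no word adjustment. Otherwise one BITSLIP operation is triggered, that is, the BITSLIP input is driven high for one cycle. This repeats until the output equals the sync word "0000011111", at which point word alignment of that signal line is considered complete. Figure 3-22 shows the flow. When word alignment has finished on every signal line, DPA training of the whole receiver ends and the interface can carry application data. Figure 3-23 shows the simulated timing waveforms of bit and word synchronization.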
Figure 3-22 Word Alignment flowchart
Figure 3-23 Timing simulation waveforms of bit and word synchronization
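The compare-and-slip loop described above can be sketched as a four-state machine. This is a minimal sketch under stated assumptions rather than the book's implementation: the entity name `word_align_sketch`, the `start` input (standing in for the ready signal from bit alignment) and the `SETTLE` generic (the number of divided-clock cycles to wait before the slipped word is compared again) are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity word_align_sketch is
  generic (
    SYNC_WORD : std_logic_vector(9 downto 0) := "0000011111";
    SETTLE    : natural := 5  -- assumed wait after each BITSLIP pulse
  );
  port (
    clkdiv  : in  std_logic;
    rst     : in  std_logic;                     -- synchronous, active high
    start   : in  std_logic;                     -- '1' when bit alignment is done
    word_in : in  std_logic_vector(9 downto 0);  -- deserialized word
    bitslip : out std_logic;                     -- one-cycle pulse per slip
    aligned : out std_logic                      -- '1' once SYNC_WORD is seen
  );
end entity;

architecture rtl of word_align_sketch is
  type state_t is (IDLE, COMPARE, SETTLE_WAIT, DONE);
  signal state    : state_t := IDLE;
  signal wait_cnt : natural range 0 to SETTLE := 0;
begin
  process (clkdiv)
  begin
    if rising_edge(clkdiv) then
      -- defaults: no slip, not aligned
      bitslip <= '0';
      aligned <= '0';
      if rst = '1' then
        state <= IDLE;
      else
        case state is
          when IDLE =>
            if start = '1' then
              state <= COMPARE;
            end if;
          when COMPARE =>
            if word_in = SYNC_WORD then
              state <= DONE;          -- no further adjustment needed
            else
              bitslip  <= '1';        -- high for exactly one clkdiv cycle
              wait_cnt <= 0;
              state    <= SETTLE_WAIT;
            end if;
          when SETTLE_WAIT =>         -- wait for the slipped word to appear
            if wait_cnt = SETTLE then
              state <= COMPARE;
            else
              wait_cnt <= wait_cnt + 1;
            end if;
          when DONE =>
            aligned <= '1';
        end case;
      end if;
    end if;
  end process;
end architecture;
```

Each channel would get its own instance, with `start` driven by that channel's bit alignment ready signal.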
3. Multi-Channel Signal Synchronization Module
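After Bit Alignment and Word Alignment are complete, the receiver receives the data and detects the frame synchronization word. This example uses "0x2EB" as the marker: when the receiver detects "0x2EB", it resets the read and write addresses of the dual-port RAM, which achieves frame synchronization of the data. To prevent read/write conflicts in the dual-port RAM, the design offsets the initial read and write addresses by 7 clock ticks. Figure 3-24 shows the flow of multi-channel signal synchronization.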
Figure 3-24 Multi-channel signal synchronization flowchart
The main control code of the receive module is as follows.
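The address handling can be sketched for one channel as follows. This is a simplified single-clock sketch, not the book's implementation. It assumes, as one reading of the scheme, that the frame word re-zeroes only the write address while the read address is a free-running counter shared by all channels and started 7 addresses ahead. The names `frame_sync_addr_sketch`, `FRAME_WORD`, `READ_OFFSET` and `armed` are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity frame_sync_addr_sketch is
  generic (
    FRAME_WORD  : std_logic_vector(9 downto 0) := "1011101011";  -- 0x2EB
    READ_OFFSET : natural := 7  -- initial read/write address separation
  );
  port (
    clk     : in  std_logic;
    rst     : in  std_logic;                     -- synchronous, active high
    word_in : in  std_logic_vector(9 downto 0);  -- word-aligned receive data
    wr_addr : out std_logic_vector(7 downto 0);
    rd_addr : out std_logic_vector(7 downto 0)
  );
end entity;

architecture rtl of frame_sync_addr_sketch is
  signal wr_cnt : unsigned(7 downto 0) := (others => '0');
  signal rd_cnt : unsigned(7 downto 0) := to_unsigned(READ_OFFSET, 8);
  signal armed  : std_logic := '1';  -- '1' until the first frame word is seen
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if rst = '1' then
        wr_cnt <= (others => '0');
        rd_cnt <= to_unsigned(READ_OFFSET, 8);
        armed  <= '1';
      else
        -- The read address free-runs; in a multi-channel design this
        -- counter would be common to all channels (assumption).
        rd_cnt <= rd_cnt + 1;
        if armed = '1' and word_in = FRAME_WORD then
          wr_cnt <= (others => '0');  -- realign this channel's write address
          armed  <= '0';
        else
          wr_cnt <= wr_cnt + 1;       -- 8-bit counter wraps at 256
        end if;
      end if;
    end if;
  end process;

  wr_addr <= std_logic_vector(wr_cnt);
  rd_addr <= std_logic_vector(rd_cnt);
end architecture;
```

With every channel's write address restarted by its own frame word and one common read address, words read at the same time from different RAMs sit at the same position within the frame, as long as the skew between channels stays well below the RAM depth.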
--------------------------------------------------------------------
-- Receive module: main control code
--------------------------------------------------------------------
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
library UNISIM;
use UNISIM.VComponents.all;

entity TestIserdes is
  port(
    datain_p     : in  std_logic;
    datain_n     : in  std_logic;
    clk153p6_MHz : in  std_logic;  -- 153.6 MHz
    clk          : in  std_logic;  -- 76.8 MHz
    clk_div      : in  std_logic;  -- 15.36 MHz
    rst          : in  std_logic;  -- '0' enables operation, '1' disables it
    clkdivout    : in  std_logic;
    dataout      : out std_logic_vector(9 downto 0)
  );
end TestIserdes;

architecture Behavioral of TestIserdes is
  signal shiftdata1         : std_logic;
  signal shiftdata2         : std_logic;
  signal datain             : std_logic;
  signal dataout_reg        : std_logic_vector(9 downto 0);
  signal dataout_tmp        : std_logic_vector(9 downto 0);
  signal sysstate           : std_logic;
  signal ct_start           : std_logic_vector(7 downto 0);
  signal ct_end             : std_logic_vector(7 downto 0);
  signal ct_reg             : std_logic_vector(7 downto 0);
  signal state_tmp          : std_logic_vector(1 downto 0);
  signal dlyce              : std_logic;
  signal dlyinc             : std_logic;
  signal dlyrst             : std_logic;
  signal rst_bitalignment   : std_logic;
  signal bitslip            : std_logic;
  signal cn_tmp             : std_logic_vector(7 downto 0);
  signal cn_tmp_sra         : std_logic_vector(7 downto 0);
  signal state_bitalignment : std_logic_vector(2 downto 0);
  -- if "10" show the end of bitalignment
  signal cn                 : std_logic_vector(7 downto 0);
  signal cn_1               : std_logic_vector(7 downto 0);
  signal cn_dly             : std_logic_vector(7 downto 0);
  signal cn_bitslip         : std_logic_vector(2 downto 0);
  signal addrin             : std_logic_vector(7 downto 0);
  signal addrout            : std_logic_vector(7 downto 0);
  signal state_delay        : std_logic;
  signal state_downclk      : std_logic;
  signal cn_upclk           : std_logic_vector(1 downto 0);
  signal cn_downclk         : std_logic_vector(1 downto 0);
  signal state_addrout      : std_logic_vector(3 downto 0);
  signal dataout_reg0       : std_logic_vector(9 downto 0);
  signal dataout_reg1       : std_logic_vector(9 downto 0);

  component sramout
    port (
      clka  : IN  std_logic;
      dina  : IN  std_logic_VECTOR(9 downto 0);
      addra : IN  std_logic_VECTOR(7 downto 0);
      wea   : IN  std_logic_VECTOR(0 downto 0);
      clkb  : IN  std_logic;
      addrb : IN  std_logic_VECTOR(7 downto 0);
      doutb : OUT std_logic_VECTOR(9 downto 0));
  end component;

begin

  u_iostandard: IBUFGDS
    generic map (
      IOSTANDARD => "DEFAULT")
    port map (
      O  => datain,    -- Clock buffer output
      I  => datain_p,  -- Diff_p clock buffer input
      IB => datain_n   -- Diff_n clock buffer input
    );

  u_iserdes1: ISERDES
    generic map (
      BITSLIP_ENABLE => TRUE,         -- TRUE/FALSE to enable bitslip controller
                                      -- Must be "FALSE" if interface type is "MEMORY"
      DATA_RATE      => "DDR",        -- Specify data rate of "DDR" or "SDR"
      DATA_WIDTH     => 10,           -- Specify data width - For DDR 4,6,8, or 10
                                      -- For SDR 2,3,4,5,6,7, or 8
      INTERFACE_TYPE => "NETWORKING", -- Use model - "MEMORY" or "NETWORKING"
      IOBDELAY       => "BOTH",       -- Specify outputs where delay chain will be applied
                                      -- "NONE", "IBUF", "IFD", or "BOTH"
      IOBDELAY_TYPE  => "VARIABLE",   -- Set tap delay "DEFAULT", "FIXED", or "VARIABLE"
      IOBDELAY_VALUE => 0,            -- Set initial tap delay to an integer from 0 to 63
      NUM_CE         => 1,            -- Define number of clock enables to an integer of 1 or 2
      SERDES_MODE    => "MASTER")     -- Set SERDES mode to "MASTER" or "SLAVE"
    port map (
      O         => open,            -- 1-bit output
      Q1        => dataout_tmp(0),  -- 1-bit output
      Q2        => dataout_tmp(1),  -- 1-bit output
      Q3        => dataout_tmp(2),  -- 1-bit output
      Q4        => dataout_tmp(3),  -- 1-bit output
      Q5        => dataout_tmp(4),  -- 1-bit output
      Q6        => dataout_tmp(5),  -- 1-bit output
      SHIFTOUT1 => shiftdata1,      -- 1-bit output
      SHIFTOUT2 => shiftdata2,      -- 1-bit output
      BITSLIP   => bitslip,         -- 1-bit input
      CE1       => '1',             -- 1-bit input
      CE2       => '0',             -- 1-bit input
      CLK       => clk,             -- 1-bit input
      CLKDIV    => clk_div,         -- 1-bit input
      D         => datain,          -- 1-bit input
      DLYCE     => dlyce,           -- 1-bit input
      DLYINC    => dlyinc,          -- 1-bit input
      DLYRST    => dlyrst,          -- 1-bit input
      OCLK      => '0',             -- 1-bit input
      REV       => '0',             -- Must be tied to logic zero
      SHIFTIN1  => '0',             -- 1-bit input
      SHIFTIN2  => '0',             -- 1-bit input
      SR        => rst              -- 1-bit input
    );

  u_iserdes2: ISERDES
    generic map (
      BITSLIP_ENABLE => TRUE,         -- TRUE/FALSE to enable bitslip controller
                                      -- Must be "FALSE" if interface type is "MEMORY"
      DATA_RATE      => "DDR",        -- Specify data rate of "DDR" or "SDR"
      DATA_WIDTH     => 10,           -- Specify data width - For DDR 4,6,8, or 10
                                      -- For SDR 2,3,4,5,6,7, or 8
      INTERFACE_TYPE => "NETWORKING", -- Use model - "MEMORY" or "NETWORKING"
      IOBDELAY       => "BOTH",       -- Specify outputs where delay chain will be applied
                                      -- "NONE", "IBUF", "IFD", or "BOTH"
      IOBDELAY_TYPE  => "VARIABLE",   -- Set tap delay "DEFAULT", "FIXED", or "VARIABLE"
      IOBDELAY_VALUE => 0,            -- Set initial tap delay to an integer from 0 to 63
      NUM_CE         => 1,            -- Define number of clock enables to an integer of 1 or 2
      SERDES_MODE    => "SLAVE")      -- Set SERDES mode to "MASTER" or "SLAVE"
    port map (
      O         => open,            -- 1-bit output
      Q1        => open,            -- 1-bit output
      Q2        => open,            -- 1-bit output
      Q3        => dataout_tmp(6),  -- 1-bit output
      Q4        => dataout_tmp(7),  -- 1-bit output
      Q5        => dataout_tmp(8),  -- 1-bit output
      Q6        => dataout_tmp(9),  -- 1-bit output
      SHIFTOUT1 => open,            -- 1-bit output
      SHIFTOUT2 => open,            -- 1-bit output
      BITSLIP   => bitslip,         -- 1-bit input
      CE1       => '1',             -- 1-bit input
      CE2       => '0',             -- 1-bit input
      CLK       => clk,             -- 1-bit input
      CLKDIV    => clk_div,         -- 1-bit input
      D         => '0',             -- 1-bit input
      DLYCE     => '0',             -- 1-bit input
      DLYINC    => '0',             -- 1-bit input
      DLYRST    => '0',             -- 1-bit input
      OCLK      => '0',             -- 1-bit input
      REV       => '0',             -- Must be tied to logic zero
      SHIFTIN1  => shiftdata1,      -- 1-bit input
      SHIFTIN2  => shiftdata2,      -- 1-bit input
      SR        => rst              -- 1-bit input
    );
  -- End of ISERDES_inst instantiation

  -- Alignment control module; rst is asynchronous, so it is listed
  -- in the sensitivity list
  u_dataoutreg: process(clk_div, rst)
  begin
    if (rst = '1') then
      dataout_reg        <= "1111111111";
      state_bitalignment <= "000";
      cn_1               <= "00000000";
      cn                 <= "00111111";
      rst_bitalignment   <= '1';
      ct_start           <= "00000000";
      ct_end             <= "00000000";
      cn_dly             <= "00000000";
      dlyrst             <= '0';
      bitslip            <= '0';
      cn_bitslip         <= "000";
      cn_tmp             <= "00000000";
      cn_tmp_sra         <= "00000000";
    elsif (clk_div'event and clk_div = '1') then
      if (cn_1 < 120) then
        cn_1   <= cn_1 + 1;
        dlyrst <= '0';
      else
        cn_1 <= "01111111";
        case (state_bitalignment) is
          when "000" =>
            rst_bitalignment <= '0';
            cn <= "00111111";
            if (ct_reg >= 12) then
              ct_start <= "00000000";
              state_bitalignment <= "001";
            elsif (dataout_reg /= dataout_tmp) then
              ct_start <= ct_reg;
              state_bitalignment <= "001";
            else
              state_bitalignment <= "000";
            end if;
          when "001" =>
            dlyrst <= '1';
            rst_bitalignment <= '1';
            state_bitalignment <= "010";
          when "010" =>  -- reset delay register: dlyrst <= '1'
            dlyrst <= '1';
            rst_bitalignment <= '1';
            state_bitalignment <= "011";
          when "011" =>
            dlyrst <= '0';
            if (ct_start > 6) then
              cn <= ct_start - 6;
            else
              cn <= ct_start + 7;
            end if;
            rst_bitalignment <= '0';
            state_bitalignment <= "100";  -- bit alignment end
          when "100" =>
            if (ct_reg > cn) then
              rst_bitalignment <= '1';
              state_bitalignment <= "101";
            end if;
          when "101" =>  -- word alignment
            if (dataout_reg /= "0000011111") then
              bitslip <= '1';
            else
              bitslip <= '0';
            end if;
            state_bitalignment <= "110";
          when "110" =>
            bitslip <= '0';
            if (cn_bitslip < 5) then
              cn_bitslip <= cn_bitslip + 1;
            elsif (dataout_reg /= "0000011111") then
              state_bitalignment <= "101";
              cn_bitslip <= "000";
            else
              state_bitalignment <= "111";
            end if;
          when "111" =>  -- word alignment end
            rst_bitalignment <= '1';
            bitslip <= '0';
          when others =>
            state_bitalignment <= "111";
        end case;
      end if;
      dataout_reg <= dataout_tmp;
    end if;
  end process;
  -- end of output

  -- for adjust the bit-alignment
  u_delay: process(clk_div, rst_bitalignment)
  begin
    if (rst_bitalignment = '1') then
      dlyce     <= '0';
      dlyinc    <= '0';
      state_tmp <= "00";
      ct_reg    <= "00000000";
    elsif (clk_div'event and clk_div = '1') then
      case (state_tmp) is
        when "00" =>
          if (ct_reg < cn) then
            ct_reg    <= ct_reg + 1;
            dlyce     <= '1';
            dlyinc    <= '1';
            state_tmp <= "01";
          else
            ct_reg <= "00001100";
            dlyce  <= '0';
            dlyinc <= '0';
          end if;
        when "01" =>
          dlyce     <= '0';
          dlyinc    <= '0';
          state_tmp <= "00";
        when others =>
          ct_reg    <= "00001100";
          dlyce     <= '0';
          dlyinc    <= '0';
          state_tmp <= "00";
      end case;
    end if;
  end process;

  -- RAM write address: reset once when the frame word 0x2EB is detected
  tb_ramout_addrin: process(clk_div)
  begin
    if (clk_div'event and clk_div = '1') then
      if (rst = '1') then
        addrin   <= "00000000";
        sysstate <= '1';
      elsif (dataout_reg = "1011101011" and sysstate = '1') then
        -- synchronization
        sysstate <= '0';
        addrin   <= "00000000";
      else
        addrin <= addrin + 1;
      end if;
    end if;
  end process;

  -- RAM read address: starts at 7 and advances once every 10 cycles
  -- of the 153.6 MHz clock
  tb_ramout_addrout: process(clk153p6_MHz)
  begin
    if (clk153p6_MHz'event and clk153p6_MHz = '1') then
      if (rst = '1') then
        addrout       <= "00000111";
        state_addrout <= "0000";
      else
        case (state_addrout) is
          when "0000" => state_addrout <= "0001";  -- 1
          when "0001" => state_addrout <= "0010";  -- 2
          when "0010" => state_addrout <= "0011";  -- 3
          when "0011" => state_addrout <= "0100";  -- 4
          when "0100" => state_addrout <= "0101";  -- 5
          when "0101" => state_addrout <= "0110";  -- 6
          when "0110" => state_addrout <= "0111";  -- 7
          when "0111" => state_addrout <= "1000";  -- 8
          when "1000" => state_addrout <= "1001";  -- 9
          when "1001" =>                            -- 10
            state_addrout <= "0000";
            addrout       <= addrout + 1;
          when others =>  -- unused states "1010" to "1111" return to "0000"
            state_addrout <= "0000";
        end case;
      end if;
    end if;
  end process;

  tb_ramout_dataout: process(clk153p6_MHz)
  begin
    if (clk153p6_MHz'event and clk153p6_MHz = '1') then
      -- both branches currently drive the same data
      if (state_delay = '1') then  -- no delay
        dataout <= dataout_reg0;
      else                         -- up at first, delay one tap
        dataout <= dataout_reg0;   -- dataout_reg1;
      end if;
    end if;
  end process;

  u_ramout1: sramout
    port map (
      clka  => clk_div,
      dina  => dataout_reg,
      addra => addrin,
      wea   => "1",
      clkb  => clk153p6_MHz,
      addrb => addrout,
      doutb => dataout_reg0);

end Behavioral;
--------------------------------------------------------------------