The information here is derived from debug dumps taken during commissioning shifts. The dump command is

./tfc_test.py debug >& dump_file_name
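The `>&` redirection in the command above sends both standard output and standard error to the dump file. For shifters who prefer to script this, the following is a minimal Python sketch of the same capture. The helper names `dump_name` and `capture_dump` and the timestamped file-name convention are assumptions made for illustration; they are not part of the TFC python code.

```python
import subprocess
from datetime import datetime


def dump_name(tfc_label, when=None):
    """Build a dump file name from a TFC label and a timestamp.

    The naming convention is hypothetical, chosen only for this sketch.
    """
    when = when or datetime.now()
    return f"{tfc_label}_{when:%Y%m%d_%H%M}_debug.txt"


def capture_dump(out_path, command=("./tfc_test.py", "debug")):
    """Run the dump command and write stdout and stderr to out_path.

    This mirrors the shell form './tfc_test.py debug >& dump_file_name'.
    Returns the exit code of the command.
    """
    with open(out_path, "w") as out:
        result = subprocess.run(
            list(command), stdout=out, stderr=subprocess.STDOUT
        )
    return result.returncode
```

Calling `capture_dump(dump_name("tfc41"))` from the directory that holds `tfc_test.py` would then produce one time-stamped dump per hang.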

For details about using the TFC python code, see the commissioning page. It is much simpler to diagnose problems if the TFC is in FORCED_WRITE mode. (N.B. Currently, all TFC's are initialized into the FORCED_WRITE mode.)


There are some conditions which have occurred a number of times. The symptoms and the resulting TFC status pages are described below.

Shifters, please do send me the dumps for these circumstances. The descriptions are here to help diagnose whether a hang is caused by a "known" problem or by something unusual.
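As a rough aid for that first check, the recurring state combinations in the history table on this page can be collected into a small lookup. The sketch below is illustrative only: the state names and diagnoses are copied from the table, while the names `KNOWN_HANGS`, `normalize` and `previously_seen` are hypothetical. A match means only that the state combination has been seen before, not that the cause is the same, so the dump should still be sent in.

```python
# State combinations and diagnoses copied from the history table on this page.
# The names below are hypothetical and exist only for this sketch.
KNOWN_HANGS = {
    ("SENDWAIT", "L1CTT"): ["PCI-A bus request/grant but no address cycle started"],
    ("FITWAIT",): ["DSP apparently still fitting; EventWriter waiting for DSP handshake"],
    ("STCREAD", "STCHD", "PROCWAIT"): [
        "TFC PCI firmware didn't get start transfer, no STC trailer word, "
        "or stc_reader missed PCI disconnect"
    ],
    ("STCREAD", "STCHD", "XFERDATA"): [
        "Ambiguous: missing STC trailer and/or missed PCI disconnect"
    ],
    ("FRCREAD", "CTTDAT", "CHANDONE"): ["Lost CTT header problem"],
    ("STCREAD", "STCDAT", "XFERDATA"): ["PCI bus hang"],
    ("STCREAD", "STCPROC", "CHANDONE"): ["c0c0 problem"],
    ("STCREAD", "STCDAT", "CHANDONE"): ["Lost CTT header problem", "c0c0 problem"],
}


def normalize(state):
    """Turn text such as 'STCREAD/ STCHD/ PROCWAIT' into a tuple of state names."""
    return tuple(state.replace("/", " ").upper().split())


def previously_seen(state):
    """Return the diagnoses recorded for this state combination, or an empty list."""
    return KNOWN_HANGS.get(normalize(state), [])
```

An empty list from `previously_seen` suggests the hang is one of the "something unusual" cases.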

TFC hangs: History

Most recent entries

Date State Diagnosis/Symptoms Debug File
3/25/03
SENDWAIT/
L1CTT
PCI-A states indicate a bus request/grant but no address cycle started. Assume this is a combination of setup time and use of lm_adrackn to indicate address cycle. Increase setup to 2 cycles and use lm_tsr[?] to indicate address cycle, thereby increasing hold time. debug-tfc0.txt
3/25/03
SENDWAIT/
L1CTT
same as previous debug-tfc1.txt
3/27/03 SENDWAIT/
L1CTT
same as previous tfc75_0debug.txt
3/27/03 SENDWAIT/
L1CTT
same as previous tfc75_0debug2.txt
3/27/03 SENDWAIT/
L1CTT
same as previous tfc75_0debug4.txt
3/27/03 SENDWAIT/
L1CTT
same as previous tfc75_1debug.txt
3/27/03 SENDWAIT/
L1CTT
same as previous tfc75_1debug2.txt
3/27/03 SENDWAIT/
L1CTT
same as previous tfc75_1debug4.txt
4/12/03
16:51
FITWAIT DSPA2 is apparently still fitting, and the EventWriter is waiting for the handshaking from the DSP. All fits scheduled. 0412_fitwait.txt
4/12/03
16:54
FITWAIT same as previous. (DSP A1) 0412_fitwait_2.txt
4/12/03
18:10
FITWAIT same as previous. (DSPA0) fitwait_3.txt
4/12/03
18:13
FITWAIT same as previous. (DSPA0) fitwait_4.txt
4/12/03
08:32
STCREAD/
STCHD/
PROCWAIT
State combination indicates one of the following: the TFC PCI firmware didn't get start transfer, there was no STC trailer word, or the stc_reader state machine missed the PCI disconnect. stc_read.txt
4/12/03
08:54
STCREAD PCI-1/2 are apparently hung. This can happen if the TFC misses a PCI disconnect, because the TFC will then never relinquish the IRDYn line. True cause unknown. stc_read2.txt
4/15/03
FRCREAD/
CTTDAT
Inspecting the L2 fifo data in the debug dump indicates invalid-format PCI data, probably missing the 1st CTT header word. Hang caused by mismatch in supposed CTT object count and number of words in PCI transfer. frc0x70debug.txt
4/16/03
15:39
STCREAD/
STCHD/
PROCWAIT
State combination indicates one of the following: the TFC PCI firmware didn't get start transfer, there was no STC trailer word, or the stc_reader state machine missed the PCI disconnect. tfc_debug_stcread0.txt
4/16/03
16:04
STCREAD/
STCHD/
PROCWAIT
same as previous tfc_debug_stcread1.txt
4/16/03
17:42
STCREAD/
STCHD/
PROCWAIT
same as previous tfc_debug_stcread2.txt
4/16/03
17:52
STCREAD/
STCHD/
PROCWAIT
same as previous tfc_debug_stcread3.txt
4/16/03
18:26
STCREAD PCI-1/2 are apparently hung. This can happen if the TFC misses a PCI disconnect, because the TFC will then never relinquish the IRDYn line. True cause unknown. (See below for cases with more debug information available.) tfc_debug_stcread_final.txt
4/16/03
18:30
FRCREAD PCI is hung. No additional information available. (Probably same problem as seen below with more debug information available) tfc_debug_final_pcihang.txt
4/17/03 FRCREAD Lost CTT header problem Inspecting the L2 fifo data in the debug dump indicates invalid-format PCI data. Hang probably caused by mismatch in CTT object count and number of words in PCI transfer. tfc_hung_bus_frcread.0
4/17/03
12:43
FRCREAD Lost CTT header problem Same as previous (Same L2 FIFO data) tfc_hung_bus_frcread.1
4/17/03
13:15
FRCREAD Lost CTT header problem Same as previous (Same L2 FIFO data) tfc_hung_bus_frcread.2
4/17/03
13:28
STCREAD/
STCHD/
XFERDATA
??? Cannot distinguish between a number of possibilities, including: (a) missing STC trailer and TFC missed PCI disconnect, (b) TFC missed trailer and PCI disconnect, or (c) missed PCI disconnect. There are probably others as well. tfc_hung_bus_frcread.3
4/17/03
15:04
STCREAD/
STCHD/
XFERDATA
Same as preceding entry tfc_hung_bus_frcread.4
4/17/03
15:11
FRCREAD Lost CTT header problem: Missing 1st of CTT header. Interpreting 2nd word byte 0 as word count=0xb5. tfc_hung_bus_frcread.5
4/17/03
15:25
STCREAD/
STCHD/
XFERDATA
Same as preceding STCREAD entry tfc_hung_bus_frcread.6
4/17/03
15:39
FRCREAD Counter in CTT header word 0 doesn't match number of words transferred. PCI data looks unlike real CTT data. tfc_hung_bus_frcread.7
4/25/03
10:59
STCREAD/
STCDAT/
CHANDONE
Lost CTT header problem: States indicate PCI transfer completed properly, but stc_reader did not see either STC trailer simultaneously with end-of-PCI. tfc_stcread.txt
5/06/03
12:34
Illegal operating mode (3) Unknown cause. Hoping it was just a hiccup. crate70hung.txt
5/08/03
17:32
STCREAD
STCHD/
PROCWAIT
Insufficient information to diagnose (need operating mode=11) crate70tfc0-debug1-ctt_tv-stc_real.txt
5/08/03
19:48
STCREAD/
STCEND/
XFERDATA
c0c0 problem. The dump indicates that the L2 fifo was filled, and the last STC channel has bad data. The L2 fifo has many apparent STC starts without STC ends. The lm_tsr indicates that the last PCI transfer did stop. crate70tfc0-debug2-ctt_tv-stc_real.txt
5/08/03
20:35
STCREAD/
STCDAT/
XFERDATA
PCI bus hang: PCI-1 and PCI-2 seem to be hung. The dump indicates that the L2 fifo was filled, and the last STC channel has no end-of-data. The lm_tsr indicates that the last PCI transfer never tried to stop crate70tfc0-debug4-ctt_tv-stc_real.txt
5/12/03 FRCREAD/
CTTDAT/
CHANDONE
Lost CTT header problem Crate 75 TFC0_stt5_dump.txt
5/12/03 FRCREAD/
CTTDAT/
CHANDONE
Lost CTT header problem Crate 75 TFC1_stt5_dump.txt
5/12/03 FRCREAD/
CTTDAT/
CHANDONE
Lost CTT header problem Crate 70. Crate70_tfc0_frcread_hang.txt
5/12/03 STCREAD/
STCDAT
XFERDATA
PCI bus hang: Crate 70, bus hang. Debug dump shows no STC trailer but L2 fifo full and transfer still active (including valid data flag from datataker). The channel read out order was changed, and problem stayed with physical channel. Crate70_inputs_swapped.txt
5/16/03 STCREAD/
STCDAT/
XFERDATA
PCI bus hang: Error generated repeatedly at Stony Brook. Logic analyzer shows no end of PCI transfer for the 1st STC channel. (And a good transfer for comparison.) The run conditions are described in detail below. tfchang_sbtest_non_stop.txt
5/19/03 FRCREAD/
CTTH1/
CHANDONE
PCI transfer terminated via disconnect with data after only 2 words. The FRC reader properly hung because of the logically incomplete event. TFC0stt5.txt
5/19/03 STCREAD Dump file incomplete. No diagnosis possible TFC1stt5.txt
5/20/03
23:30
STCREAD/
STCDAT/
CHANDONE
c0c0 Problem: Crate71_3cards_1.txt
5/20/03
23:46
STCREAD/
STCBLK/
CHANDONE
Transfer apparently either not requested or finished with no words transferred Crate71_3cards_2.txt
6/17/03 STCREAD/
STCPROC
CHANDONE
c0c0 Problem: stcread40.txt
6/17/03 STCREAD/
STCPROC
CHANDONE
c0c0 Problem: stcread40_b.txt
6/17/03 STCREAD/
STCPROC
CHANDONE
c0c0 Problem: stcread41.txt
6/17/03 STCREAD/
STCPROC
CHANDONE
c0c0 Problem: stcread41_b.txt
6/18/03
3:19
FRCREAD TFC in FRCREAD with no data present on inputs. tfc00_18June13.19am.txt / tfc01_18June13.19am.txt
6/18/03
11:15
??? Is there actually a problem here? tfc01_18June11.15am.txt
6/18/03
14:43
STCREAD STC channel missing input data tfc20_18June14.43pm.txt
6/18/03
11:17
??? Is there actually a problem? tfc40_18June11.17am.txt / tfc41_18June11.17am.txt
6/18/03
11:59
STCREAD/
STCHD
CHANDONE
c0c0 Problem: tfc40_18June11.59am.txt/ tfc41_18June11.59am.txt
6/19/03
14:35
FRCREAD TFC in FRCREAD with no data present on inputs. tfc00_19June14.35pm.txt / tfc01_19June14.35pm.txt
6/19/03
14:39
FRCREAD Lost CTT header problem Crate 70 tfc00_19June14.39pm.txt / tfc01_19June14.39pm.txt
6/19/03
11:23
STCREAD and PCI-1/2 hang TFC20/21 apparently in STCREAD and PCI-1/2 are hung tfc20_bushang_19June11.23am.txt / tfc21_bushang_19June11.23am.txt / tfc20_bushang_19June11.30am.txt / tfc21_bushang_19June11.30am.txt
6/19/03
14:30
FRCREAD TFC in FRCREAD with no data present on inputs. tfc40_19June14.30pm.txt / tfc41_19June14.30pm.txt
6/19/03
11:12
STCREAD STC channel missing input data tfc50_19June11.12am.txt / tfc51_19June11.12am.txt
6/19/03
11:27
STCREAD STC channel missing input data tfc50_19June11.27am.txt
6/19/03
11:37
STCREAD STC channel missing input data tfc50_19June11.37am.txt
6/19/03
14:34
FRCREAD TFC in FRCREAD with no data present on inputs. tfc51_19June14.34pm.txt
11/03/03 FRCREAD/
CTTDAT/
CHANDONE
Lost CTT header problem Crate 70 tfc50.debug.txt
11/03/03 FRCREAD/
CTTDAT/
CHANDONE
Lost CTT header problem Crate 70 tfc51.debug.txt
11/05/03 FRCREAD/
CTTDAT/
CHANDONE
Lost CTT header problem Crate 70 tfc20debug_183860.txt
tfc21debug_183860.txt
11/15/03 STCREAD (2) Both TFC's in crate 70 in STCREAD. No debug dump, TFC mode=15. All LTB channels have different numbers of events, and none are in overflow. Did the STC's decay away? No further diagnosis/speculation possible. tfc00.txt tfc01.txt
11/16/03 DSP Hang, TFC7x DSP B0 in ODPM state, and DSL1 in LOAD state. Probable cause is DSP B0 hung. Never seen before. tfc1_stat_73_5.log
11/17/03
19:28
PCI-3 hang, TFC51 (2) PCI-C bus hang. Status only, no debug dump. No further diagnosis possible tfc51_status.txt
11/17/03
21:08
PCI-1/2 hang, TFC41 (2) TFC41 Probable PCI-B hang. Debug included shows no obvious problem in TFC tfc41_debug.txt
11/17/03
21:41
PCI-1/2 hang, TFC41 (2) TFC41 Probable PCI-B hang. Debug included shows no obvious problem in TFC tfc41_debug.txt
11/17/03 PCI-1/2 hang(?), TFC30 (2) TFC30 Debug only, no status. Apparently an STCREAD problem. No further diagnosis possible. **POSSIBLY CONSISTENT WITH PCI-1/2 HANG** tfc30_debug.txt
11/18/03
17:26
DSP Hang? DSP B0 in ODPM state, and DSL1 in LOAD state. Probable cause is DSP B0 hung. Never seen before. tfc31_hang_11_18.txt
11/18/03
21:26
PCI-1/2 hang, TFC40(2) TFC 40 apparently in STCREAD and PCI-1/2 are hung tfc40.debug.txt
tfc40.stt4.txt / tfc41.stt4.txt
11/18/03
22:22
PCI-3 hang, TFC41 (2) PCI-C bus hang in TFC41 slot. The debug dumps clearly show a bus hang, but no apparent problem with the TFC L3 related state machines. tfc41.debug.txt / tfc41.stt4.txt
11/18/03
23:00
PCI-3 hang 5x,
L3 read out phase problem (2)
Both TFC's in crate 75 have same L3 read out hang. For both, last word transferred was FIT(last). Further diagnosis impossible. tfc51_kb.txt tfc51_kb_2.txt
11/18/03 STCREAD
PCI-1/2 hang?
TFC in STCREAD state. Buses maybe OK, but could also be that status came after an SCL_INIT. Mode=15, no further diagnosis possible. tfc41_debug_kevin.txt
11/19/03 PCI-3 hang TFC51,
L3 read out phase problem (2)
TFC #1 in crate 75 has 2 L3 read out hangs. For both, last word transferred was FIT(last). tfc51_pci_kb.txt / tfc51_pci_kb_2.txt
12/02/03 FRCREAD
(TFC's 20 & 21 )
Both TFC's in crate enter FRCREAD for 1st event in. Internal states identical, including data in. The FRC LRB shows 16 events pending, and the STC LRB's show 8 events pending. This is not understood, but as it happened to both TFC's, it is likely an input problem. Note also that the following entry shows different problems occurring in crate 0x73 at the same time as these. tfc20_frcread_6_07pm_dec2.txt / tfc21_frcread_6_07pm_dec2.txt
12/02/03 STCREAD
(TFC's 30 & 31)
Both TFC's in crate 0x73 enter STCREAD soon after init(?). In all cases, at least one of the STC input channels reports no events. Presumably this is indicating an upstream problem causing some STC channels to send no data. tfc30_stcread_5_56pm_dec2.txt
tfc30_stcread_6_07pm_dec2.txt
tfc30_stcread_6_07pm_dec2_b.txt
tfc31_stcread_5_56pm_dec2.txt
tfc31_stcread_5_56pm_dec2_b.txt
tfc31_stcread_6_07pm_dec2.txt
tfc31_stcread_6_07pm_dec2_b.txt

The "State" cell background colors indicate whether or not the problem has been addressed and/or whether enabling error recovery would prevent the hang.

Light Green: TFC firmware bug found and fixed
Dark Green: Hardware fault found, or non-TFC bug found and fixed
Pink: Missing input data
Yellow: Error recovery will allow processing to continue, though perhaps with nonsense data (check the error flags). Problem believed to be input data errors.
Orange: Error recovery will address symptom. True cause unknown.
Red: Unknown source
default: No diagnosis yet
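The "Lost CTT header problem" entries in the history table attribute the hang to a mismatch between the count taken from the CTT header and the number of words in the PCI transfer. One entry notes that, with the 1st header word missing, byte 0 of the 2nd word was interpreted as a word count of 0xb5. The sketch below illustrates that mechanism under stated assumptions: that the count is the lowest byte of the first word received, and that it gives the number of words following that word. Neither assumption is confirmed here, the function names are hypothetical, and the sample words are invented apart from the 0xb5 byte.

```python
def expected_word_count(first_word):
    """ASSUMPTION: the count is byte 0 (lowest 8 bits) of the first word received."""
    return first_word & 0xFF


def reader_would_wait(words):
    """Return True if the reader expects more words than the transfer delivered.

    ASSUMPTION: the count gives the number of words that follow the first word.
    """
    if not words:
        return True
    return expected_word_count(words[0]) > len(words) - 1


# Invented sample transfer: a header announcing 4 words, then 4 words.
# Only the 0xB5 in byte 0 of the second word comes from the log entry.
intact = [0x00000004, 0x123400B5, 0x00000001, 0x00000002, 0x00000003]
lost_first_word = intact[1:]

assert not reader_would_wait(intact)
assert expected_word_count(lost_first_word[0]) == 0xB5
assert reader_would_wait(lost_first_word)
```

Under these assumptions, a reader that has lost the first header word waits for 0xb5 words that never arrive, which is consistent with the FRCREAD hangs recorded in the table.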

Details for some of the crash dumps