Commit d3d52753 authored by Thomas Juerges

First stab at an LTS cold start script

Important:  This needs new MPs to be added to the PCC device.
Review:  JD & Paulus for correctness since I had to wing it a bit.
parent d38353a7
1 merge request: !22 L2SS-170: First stab at an LTS cold start script
#! /usr/bin/env python3
import logging
from time import sleep

import numpy
import tango

from .startup import startup
from .lofar2_config import configure_logging

def start_device(device: str):
    '''
    Start a Tango device with the help of the startup function.
    The device will not be forced to go through
    OFF/INIT/STANDBY/ON but it is assumed that the device is in OFF
    state. If the device is not in ON state after the start-up, then
    an exception will be raised.
    '''
    dev = startup(device = device, force_restart = False)
    state = dev.state()
    if state != tango.DevState.ON:
        raise Exception("Device \"{}\" is unexpectedly in \"{}\" state but it is expected to be in \"{}\" state. Please check the reason for the unexpected device state. Aborting the start-up procedure.".format(device, state, tango.DevState.ON))
    return dev

def lts_cold_start():
    '''
    What is this?
    This is the LTS (LOFAR Test - and I forgot what S stands for) cold start
    procedure cast into source code. The procedure can be found there:
    https://support.astron.nl/confluence/display/L2M/LTS+startup+procedure

    Paulus already wrote a script that - illegally ;) - makes direct use of
    the OPC-UA servers to accomplish the same thing that we are doing here.
    Paulus' script can be found there:
    https://git.astron.nl/lofar2.0/pypcc/-/blob/master/scripts/Startup.py
    Thanks, Paulus! You made it very easy for me to cobble together this
    script.

    For obvious reasons our script is much better, though. :)
    First, it is bigger. And bigger is always better.
    Then it is better documented but that does not count in the HW world.
    But it also raises exceptions with error messages that make an attempt to
    help the user reading them and shuts down the respective Tango device(s)
    if something goes south.
    And that is where we try to do it really right: there is no reason to be
    excessively verbose when things work like they are expected to work. But
    tell the user when something goes wrong, give an indication of what could
    have gone wrong and where to look for the problem.

    Again, Paulus' script already contains very good indications where
    problems might lie and made my job very easy.

    No parameters, parameters are for wimps. :)
    '''
    # Define the LOFAR2.0 specific log format
    configure_logging()

    # Get a reference to the PCC device, do not
    # force a restart of the already running Tango
    # device.
    pcc = startup(device = "LTS/PCC/1", force_restart = False)

    # Getting CLK, RCU & RCU ADCs into proper shape for use by real people.
    #
    # The start-up needs to happen in this sequence due to HW dependencies
    # that can introduce issues which then become very complicated to
    # handle in SW. Therefore to keep it as simple as possible, let's stick
    # to the rule recommended by Paulus:
    # 1 CLK
    # 2 RCU
    # 3 RCU ADCs
    #
    #
    # First take the CLK board through the motions.
    # 1.1 Switch off CLK
    # 1.2 Wait for CLK_translator_busy_R == False, throw an exception on timeout
    # 1.3 Switch on CLK
    # 1.4 Wait for CLK_translator_busy_R == False, throw an exception on timeout
    # 1.5 Check if CLK_PLL_locked_R == True
    # 1.6 Done
    #
    #
    # Steps 1.1 & 1.2
    pcc.CLK_off()
    # 2021-04-30, Thomas
    # This should be refactored into a function.
    timeout = 10.0
    while pcc.CLK_translator_busy_R:
        logging.debug("Waiting on \"CLK_translator_busy_R\" to become \"False\"...")
        timeout = timeout - 1.0
        if timeout < 1.0:
            # Switching the PCC clock off should never take longer than
            # 10 seconds. Here we ran into a timeout.
            # Clean up and raise an exception.
            pcc.off()
            raise Exception("After calling \"CLK_off\" a timeout occurred while waiting for \"CLK_translator_busy_R\" to become \"False\". Please investigate the reason why the PCC translator never set \"CLK_translator_busy_R\" to \"False\". Aborting start-up procedure.")
        sleep(1.0)

    # Steps 1.3 & 1.4
    pcc.CLK_on()
    # Per Paulus this should never take longer than 2 seconds.
    # 2021-04-30, Thomas
    # This should be refactored into a function.
    timeout = 2.0
    while pcc.CLK_translator_busy_R:
        logging.debug("After calling \"CLK_on()\" waiting on \"CLK_translator_busy_R\" to become \"False\"...")
        timeout = timeout - 1.0
        if timeout < 1.0:
            # Switching the PCC clock on should never take longer than
            # a couple of seconds. Here we ran into a timeout.
            # Clean up and raise an exception.
            pcc.off()
            raise Exception("After calling \"CLK_on\" a timeout occurred while waiting for \"CLK_translator_busy_R\" to become \"False\". Please investigate the reason why the PCC translator never set \"CLK_translator_busy_R\" to \"False\". Aborting start-up procedure.")
        sleep(1.0)

    # 1.5 Check if CLK_PLL_locked_R == True
    # 2021-04-30, Thomas
    # This should be refactored into a function.
    clk_locked = pcc.CLK_PLL_locked_R
    if clk_locked:
        logging.info("CLK signal is locked.")
    else:
        # CLK signal is not locked
        clk_i2c_status = pcc.CLK_I2C_STATUS_R
        exception_text = "CLK I2C is not working. Please investigate! Maybe power cycle subrack to restart CLK board and translator. Aborting start-up procedure."
        if clk_i2c_status <= 0:
            exception_text = "CLK signal is not locked. Please investigate! The subrack probably does not receive clock input or the CLK PCB is broken. Aborting start-up procedure."
        pcc.off()
        raise Exception(exception_text)
    # Step 1.6
    # Done.

    # 2 RCUs
    # If we reach this point in the start-up procedure, then the CLK board setup
    # is done. We can proceed with the RCUs.
    #
    # Now take the RCUs through the motions.
    # 2.1 Set RCU mask to all available RCUs
    # 2.2 Switch off all RCUs
    # 2.3 Wait for RCU_translator_busy_R == False, throw an exception on timeout
    # 2.4 Switch on RCUs
    # 2.5 Wait for RCU_translator_busy_R == False, throw an exception on timeout
    # 2.6 Done
    #
    #
    # Step 2.1
    # We have only 8 RCUs in LTS.
    pcc.RCU_mask_RW = [True] * 8
    # Steps 2.2 & 2.3
    pcc.RCU_off()
    # 2021-04-30, Thomas
    # This should be refactored into a function.
    timeout = 10.0
    while pcc.RCU_translator_busy_R:
        logging.debug("Waiting on \"RCU_translator_busy_R\" to become \"False\"...")
        timeout = timeout - 1.0
        if timeout < 1.0:
            # Switching the RCUs off should never take longer than
            # 10 seconds. Here we ran into a timeout.
            # Clean up and raise an exception.
            pcc.off()
            raise Exception("After calling \"RCU_off\" a timeout occurred while waiting for \"RCU_translator_busy_R\" to become \"False\". Please investigate the reason why the PCC translator never set \"RCU_translator_busy_R\" to \"False\". Aborting start-up procedure.")
        sleep(1.0)

    # Steps 2.4 & 2.5
    # We leave the RCU mask as it is because it was already set for the
    # RCU_off() call.
    pcc.RCU_on()
    # Per Paulus this should never take longer than 5 seconds.
    # 2021-04-30, Thomas
    # This should be refactored into a function.
    timeout = 5.0
    while pcc.RCU_translator_busy_R:
        logging.debug("After calling \"RCU_on()\" waiting on \"RCU_translator_busy_R\" to become \"False\"...")
        timeout = timeout - 1.0
        if timeout < 1.0:
            # Switching the RCUs on should never take longer than
            # a couple of seconds. Here we ran into a timeout.
            # Clean up and raise an exception.
            pcc.off()
            raise Exception("After calling \"RCU_on\" a timeout occurred while waiting for \"RCU_translator_busy_R\" to become \"False\". Please investigate the reason why the PCC translator never set \"RCU_translator_busy_R\" to \"False\". Aborting start-up procedure.")
        sleep(1.0)
    # Step 2.6
    # Done.

    # 3 ADCs
    # If we get here, we only have to check if the ADCs are locked, too.
    # 3.1 Check RCUs' I2C status
    # 3.2 Check RCU_ADC_lock_R == [True, ] for RCUs that have a good I2C status
    # 3.3 Done
    #
    #
    # Steps 3.1 & 3.2
    # 2021-04-30, Thomas
    # This should be refactored into a function.
    rcu_i2c_status = numpy.array(pcc.RCU_I2C_STATUS_R)
    i2c_ok_rcus = rcu_i2c_status == 0
    adc_locked = pcc.RCU_ADC_lock_R
    # Reading RCU_mask_RW returns a copy of the attribute value. Modify
    # the copy and write it back to the device when an RCU gets disabled.
    rcu_mask = [bool(enabled) for enabled in pcc.RCU_mask_RW]
    for rcu, i2c_ok in enumerate(i2c_ok_rcus):
        if i2c_ok:
            logging.debug("RCU #{} is available.".format(rcu))
            for adc, lock in enumerate(adc_locked[rcu]):
                if not lock:
                    logging.warning("RCU #{} ADC #{} is unlocked. Please investigate! Will continue with normal operation.".format(rcu, adc))
        else:
            # The RCU's I2C bus is not working.
            rcu_mask[rcu] = False
            pcc.RCU_mask_RW = rcu_mask
            logging.warning("RCU #{}'s I2C is not working. Please investigate! Disabling RCU #{} to avoid damage.".format(rcu, rcu))
    # Step 3.3
    # Done

    # Start up APSCTL, i.e. Uniboard2s.
    aps = startup("APSCTL/SDP/1")
    logging.warning("Cannot start-up APSCTL because it requires manual actions.")

    # Start up SDP, i.e. configure the firmware in the Uniboards
    sdp = startup("LTS/SDP/1")
    logging.warning("Cannot start-up SDP because it requires manual actions.")

    logging.info("LTS has been successfully started and configured.")
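
The I2C and ADC lock check of steps 3.1 & 3.2 is marked as a candidate for refactoring into a function. A minimal sketch of how that could look, working on plain Python sequences so that it can be tried without a Tango device. The name check_rcus and its return values are hypothetical and not part of the commit; an I2C status of 0 is assumed to mean "OK", as in the script above.

```python
from typing import List, Sequence, Tuple


def check_rcus(i2c_status: Sequence[int],
               adc_locked: Sequence[Sequence[bool]]
               ) -> Tuple[List[bool], List[Tuple[int, int]]]:
    """Hypothetical helper: evaluate I2C status and ADC locks per RCU.

    Returns (rcu_mask, unlocked_adcs):
    - rcu_mask[rcu] is True when the I2C status of that RCU is 0
      (assumed to mean "OK").
    - unlocked_adcs lists (rcu, adc) pairs whose ADC is not locked.
      Only RCUs with a working I2C bus are inspected.
    """
    rcu_mask = [int(status) == 0 for status in i2c_status]
    unlocked_adcs = [
        (rcu, adc)
        for rcu, i2c_ok in enumerate(rcu_mask) if i2c_ok
        for adc, locked in enumerate(adc_locked[rcu]) if not locked
    ]
    return rcu_mask, unlocked_adcs
```

The caller would then log a warning for each entry in unlocked_adcs and write rcu_mask back to RCU_mask_RW in one go, which keeps the device access separate from the decision logic.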

if __name__ == '__main__':
    lts_cold_start()
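
The script notes four times that the wait loop on a translator busy flag should be refactored into a function. A minimal sketch of such a helper follows. The name wait_while_busy, its parameters and the injectable sleep function are assumptions for illustration and not part of the commit. The helper only polls and reports; cleaning up with pcc.off() and raising the exception is left to the caller.

```python
import time
from typing import Callable


def wait_while_busy(is_busy: Callable[[], bool],
                    timeout: float,
                    poll_interval: float = 1.0,
                    sleep: Callable[[float], None] = time.sleep) -> bool:
    """Hypothetical helper: poll is_busy() until it returns False.

    Returns True if the busy flag cleared before roughly `timeout`
    seconds of waiting had accumulated, False otherwise.
    The `sleep` parameter exists only to make the helper testable.
    """
    waited = 0.0
    while is_busy():
        if waited >= timeout:
            return False
        sleep(poll_interval)
        waited += poll_interval
    return True


# Possible use inside lts_cold_start(), assuming `pcc` is the device proxy:
#
#     pcc.CLK_off()
#     if not wait_while_busy(lambda: pcc.CLK_translator_busy_R, timeout=10.0):
#         pcc.off()
#         raise Exception("Timeout while waiting for CLK_translator_busy_R.")
```

Passing the flag as a callable means the attribute is read again on every poll, which is what the original while loops do.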