Maintenance
This page describes troubleshooting and best practices for maintaining hardshare configurations and shared devices.
New instances will not launch, even though the deployment is being advertised. Why?
Unrecoverable errors during INIT or TERMINATING cause the deployment to be locked. This is shown as lock-out: true in the listing from hardshare list, e.g.,
registration details for workspace deployments in local config:
7ec9c3d2-6a74-47d9-bf9f-b3ff41c26ec0
    created: 2021-02-26 01:29 UTC
    desc: CubeCell with OLED
    origin (address) of registration: (unknown)
    lock-out: true
When locked, new instance requests are rejected. To allow instances again,
hardshare unlock
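After unlocking, the listing can be checked again to confirm the change (the exact wording of the listing may vary between client versions):
hardshare list | grep -i lock-out    # if lock-out: true still appears, the deployment is still locked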
How to run the client without keeping a terminal open?
Run it with GNU Screen or tmux. For example,
tmux new-session hardshare ad \; detach
will start ad in a detached tmux session.
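To re-attach to that session later, e.g., to read recent output,
tmux attach
and detach again with the usual tmux key binding (Ctrl-b d by default).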
If there is no need to re-attach to the process, then a simpler solution is nohup (see the nohup manual page on your system).
nohup hardshare ad &
Then, exit the shell normally.
How to log client output to a file?
For example,
hardshare -v ad 2>&1 | tee -a hs.log
will run ad with verbose logging and pipe the output through tee, which prints the logs to the screen and appends them to a file named hs.log.
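If you prefer one file per day, the same pipeline can write to a dated file name; this is only a naming convention, not a hardshare feature:
hardshare -v ad 2>&1 | tee -a "hs-$(date +%F).log"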
How to find all hardshare processes?
To find all relevant processes
ps -AHF | grep -i hardshare
Beware that this can return several processes that include "hardshare" in their arguments but are not hardshare processes per se. The left-most number in each returned row is the PID. These processes can be killed via kill or kill -SIGINT. The flags to ps may be different on your host. On macOS, try
ps -ef | grep -i hardshare
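If pgrep and pkill are available, the lookup and signal can be combined; the pattern match carries the same false-positive caveat as the grep above, so inspect the pgrep output before sending the signal:
pgrep -f 'hardshare ad'       # list PIDs whose full command line matches
pkill -INT -f 'hardshare ad'  # send SIGINT to those processes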
After first installation, instance status INIT_FAIL
There are many reasons why an instance can fail to initialize, depending on your configuration. For a newly configured hardshare installation that uses Docker, first check that the Docker image is compatible with your host architecture. To do this, run
hardshare list
and find the Docker image line; for example,
cprovider: docker
cargs: []
img: rerobots/hs-generic
indicates the image rerobots/hs-generic:latest ("latest" is implied if no tag is given). Now, get your host architecture as known to Linux
# uname -m
x86_64
The output might be different, such as armv7l on some Raspberry Pi boards. Continuing the example above, we can pull the base generic Docker image for x86_64 hosts
docker image pull rerobots/hs-generic:x86_64-latest
and update the hardshare configuration with the tag name
hardshare config --assign-image rerobots/hs-generic:x86_64-latest
Now unlock the deployment, and restart the hardshare daemon
hardshare stop-ad
hardshare unlock
hardshare ad
Finally, request an instance as usual.
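The steps above can be combined into a small shell sketch; the tag naming (architecture-latest) follows the example above, so verify that such a tag exists for your architecture before relying on it:
#!/bin/sh
# sketch: align the Docker image tag with the host architecture, then restart advertising
ARCH=$(uname -m)                           # e.g., x86_64 or armv7l
IMG="rerobots/hs-generic:${ARCH}-latest"   # assumes such a tag is published for your architecture
docker image pull "$IMG"
hardshare config --assign-image "$IMG"
hardshare stop-ad
hardshare unlock
hardshare ad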
Alternatively, build a new image on your host using files under the devices/ directory of the source tree.
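For example, a custom image might be built and assigned as follows; the path below is hypothetical, so check the devices/ directory of your copy of the source tree for the Dockerfile you want:
cd path/to/hardshare/devices/generic      # hypothetical location of a Dockerfile
docker image build -t hs-custom:latest .
hardshare config --assign-image hs-custom:latest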
Daemon fails to start or is not responsive
List local configurations
hardshare --format=yaml list
local:
  err_api_tokens: {}
  api_tokens:
  - /home/scott/.rerobots/tokens/jwt.txt
  ssh_key: /home/scott/.ssh/unodist
  version: 0
  wdeployments:
  - cargs: []
    container_name: rrc
    cprovider: podman
    id: b47cd57c-833b-47c1-964d-79e5e6f00dba
    image: hs-generic
    init_inside: []
    owner: scott
    terminate: []
remote:
  deployments:
  - date_created: 2020-05-25 06:27 UTC
    id: b47cd57c-833b-47c1-964d-79e5e6f00dba
    origin: null
    owner: scott
Start, check, and stop daemons
hardshare ad
hardshare status
hardshare stop-ad
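If the daemon occasionally stops responding, a minimal watchdog sketch can restart it from cron or a similar scheduler; this assumes that hardshare status exits with a nonzero code when the daemon is down, which you should verify on your installation:
#!/bin/sh
# minimal watchdog sketch; assumes hardshare status returns nonzero when the daemon is down
if ! hardshare status > /dev/null 2>&1; then
    hardshare stop-ad > /dev/null 2>&1    # clear any stale state; ignore errors
    nohup hardshare ad >> hs.log 2>&1 &
fi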
Update API tokens
Remove any expired API tokens
hardshare config -p
Then, get a new API token, and add it
hardshare config --add-token path/to/your/jwt.txt
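A typical refresh, assuming the new token was downloaded to ~/Downloads/jwt.txt (adjust the path), followed by a listing to confirm it was picked up:
hardshare config -p
hardshare config --add-token ~/Downloads/jwt.txt
hardshare --format=yaml list    # the new token should appear under api_tokens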
Manage wdeployment IDs
With the hardshare client, you can freely create and destroy workspace deployments. This process corresponds to creating or destroying a unique ID. Here, "destroying a unique ID" means that the corresponding workspace deployment is marked as permanently unavailable.
When some part of the robot or the surrounding environment changes significantly, the unique ID should be changed. What counts as "significant" depends on the context. For example, removing a LiDAR sensor is likely significant, but small changes to overhead lighting might not be.
Ensuring that unique IDs correspond to a known setting is a best practice because it facilitates automation. For example, automated tests can assume that, if the same ID is referenced, then the testing fixture with real hardware is the same (up to some tolerance).
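For example, a test script might pin the expected ID and fail fast if the local configuration no longer includes it; the ID and test entry point below are placeholders:
#!/bin/sh
# sketch: abort hardware tests if the expected workspace deployment is not configured
EXPECTED_WDEPLOYMENT=b47cd57c-833b-47c1-964d-79e5e6f00dba   # placeholder ID
if hardshare list | grep -q "$EXPECTED_WDEPLOYMENT"; then
    ./run_hardware_tests.sh    # hypothetical test entry point
else
    echo "expected workspace deployment not found; aborting" >&2
    exit 1
fi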