Hi folks!
We moved the Bors pod to another account, and it has been crashing ever since with the following error (the full details are at the end of this message):
"unknown registry: BorsNG.PubSub. Either the registry name is invalid or the registry is not running, possibly because its application isn't started"
The error seems to be related to registry information that is held in memory. When we investigated the EC2 instance that used to host the service, we noticed that the error occurred very close to the moment the spot instance was reclaimed. We therefore switched to an on-demand instance, and the crash now happens once a day instead of twice.
We compared the settings in the new account against those in the old one, and the only difference is the instance type:
- c5.4xlarge in the old account
- r4.8xlarge in the new one
Additional information:
- CPU usage is at 6% according to CloudWatch
- The RDS instance also looks healthy, with no CPU or memory spikes
Has anyone run into this kind of error before?
Any suggestions would be much appreciated!
Here is the detailed error:
{%ArgumentError{
message: "unknown registry: BorsNG.PubSub. Either the registry name is invalid or the registry is not running, possibly because its application isn't started"
},
[
{Registry, :meta, 2,
[file: 'lib/registry.ex', line: 1068]},
{Phoenix.PubSub, :broadcast, 4,
[file: 'lib/phoenix/pubsub.ex', line: 148]},
{Phoenix.PubSub, :broadcast!, 4,
[file: 'lib/phoenix/pubsub.ex', line: 241]},
{BorsNG.Worker.Batcher, :maybe_complete_batch, 1,
[file: 'lib/worker/batcher.ex', line: 692]},
{BorsNG.Worker.Batcher, :poll_, 1,
[file: 'lib/worker/batcher.ex', line: 297]},
{BorsNG.Worker.Batcher, :handle_info, 2,
[file: 'lib/worker/batcher.ex', line: 237]},
{:gen_server, :try_handle_info, 3,
[file: 'gen_server.erl', line: 1077]},
{:gen_server, :handle_msg, 6,
[file: 'gen_server.erl', line: 1165]}
]}