Anyone who has ever been involved in Live OTT production from an engineering standpoint knows…there is a lot to know! You could potentially have twenty cameras feeding streams to a remote production truck that in turn is sending feeds to a broadcast facility. From that broadcast facility those feeds are being encoded into perhaps 10 or more different variants for display on various screen sizes with differing download speeds. Those variants are then packaged up and sent on to the content delivery network (CDN), and from there on through the ISP and into the Fire sticks, Roku boxes, Apple TVs, Smart TVs and you-name-it consumer devices.
There could easily be 10 to 20 or more different vendors involved across the different technologies. A capable engineering team will need to have at least some knowledge across a variety of products including cameras, switches, encoders, origin servers, IP video probes, Windows servers, Linux servers, virtual servers, etc. Oh, and all of these (except maybe the cameras) could be virtualized and running completely in the cloud!
An evolution that has taken place in the industry as we’ve made the move to all-IP or mostly-IP workflows is the realization that monitoring is no longer an afterthought that you stick on after the project is designed and built, with whatever money you have left over. Many of us have learned that these kinds of complex projects and workflows simply don’t work over time without accurate and actionable monitoring. As a result, monitoring has come to be part of the design architecture from the very beginning.
From a real-world standpoint that means verifying that each vendor along the ‘chain’ will give the operations, engineering and maintenance teams the information they need to determine the health of the flows themselves, the tools involved along the way, and the underlying infrastructure that is making it all possible. And all of that information must be accessible remotely. Whether by SNMP, web API, message bus, syslog, or any other protocol, there must be a way that an outside entity (a Network Management System, or NMS) can get to that information and alert people if and when problems are discovered.
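Whatever the transport, the NMS ends up normalizing each device's health report into a common shape and deciding when to raise an alert. Here is a minimal sketch of that classification step in Python; the field names (`cpu_pct`, `disk_used_pct`, `power_supplies`) and the thresholds are hypothetical, since every vendor exposes its own schema:

```python
import json

# Illustrative thresholds only; real deployments tune these per device class.
CPU_ALERT_PCT = 90
DISK_ALERT_PCT = 85

def classify_health(payload: str) -> list[str]:
    """Turn one device's JSON health report into a list of alert strings.

    An empty list means the device looks healthy.
    """
    health = json.loads(payload)
    device = health.get("device", "unknown")
    alerts = []
    if health.get("cpu_pct", 0) >= CPU_ALERT_PCT:
        alerts.append(f"{device}: CPU at {health['cpu_pct']}%")
    if health.get("disk_used_pct", 0) >= DISK_ALERT_PCT:
        alerts.append(f"{device}: disk at {health['disk_used_pct']}% used")
    for psu in health.get("power_supplies", []):
        if psu.get("status") != "ok":
            alerts.append(f"{device}: PSU {psu.get('id')} {psu.get('status')}")
    return alerts
```

In a real poller, each alert would be forwarded to the alarm bus or paging system rather than returned; the point is simply that every protocol on the list ultimately feeds a rule set like this one.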
The luxury of having a dedicated screen in the truck or broadcast center for each vendor and its alarms is gone. There are simply too many for that approach to be of any use. There needs to be a single ‘pane of glass’ that presents all alarms and performance data across all technologies throughout the entire operation.
Because of the large number of different technologies and vendors involved, it may be helpful to segment the information to be monitored into major buckets.
The first bucket covers all physical servers and switches: monitoring basics like CPU load, free memory, available disk space, power supply health, and fan health. If a disk fills up, it can wreak havoc on the software applications that rely on it. This bucket might also include those same kinds of metrics on some of the more specialized appliance hardware involved, such as firewalls, VPN concentrators, fiber encapsulators, satellite receivers, compliance monitors, graphics engines, and production switchers, to name only a few.
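As a concrete example of one of these basics, a local disk-space check of the kind an NMS agent might run can be sketched in a few lines of Python using only the standard library; the 90% threshold and the paths checked are illustrative, not a recommendation:

```python
import shutil

# Illustrative threshold; a full disk can cripple encoders, packagers,
# and logging alike, so real deployments often alert well before 100%.
DISK_FULL_PCT = 90

def disk_usage_pct(path: str) -> float:
    """Percentage of the filesystem containing `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total

def full_filesystems(paths: list[str], threshold: float = DISK_FULL_PCT) -> list[str]:
    """Return the paths whose filesystems are at or above the threshold."""
    return [p for p in paths if disk_usage_pct(p) >= threshold]
```

A poller would call something like `full_filesystems(["/", "/var"])` on each host and raise an alarm for anything returned; the same pattern extends to CPU load, free memory, fans, and power supplies via whatever interface each box exposes.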