How Video Surveillance Quietly Eats Up Your Network
When There Are More Cameras Than It Seems: How to Properly Design a Video Surveillance System from 5 to 150 Cameras
There is one mistake that is made surprisingly often in video surveillance projects. The system is calculated by the number of cameras rather than by the number of streams. On paper, this looks harmless enough. Five cameras, ten cameras, fifty cameras. It seems as though that alone defines the scale of the system. But in reality, modern video surveillance runs on very different math. Today, a camera is rarely just one camera in the old-fashioned sense. It almost always delivers multiple streams and works simultaneously for recording, live viewing, analytics, remote access, sometimes mobile clients, sometimes cloud services, and sometimes several different server roles at once. As a result, a project that looked like “ten cameras” in the customer’s mind may, in terms of load and architecture, actually behave like a system with twenty, thirty, or forty video streams running at the same time.
That is why proper design starts not with the question “how many cameras,” but with “how many streams are there, who consumes them, where do they cross the network, and where is the decision made about what happens to the video next.” This is not some abstract engineering subtlety. It is the difference between a system that runs calmly for years and one that starts getting nervous at the exact moment reliability matters most.
Why the Number of Cameras Almost Never Reflects the Real Load
In most modern systems, at least two streams are used from each camera. The main stream is needed for archive recording, detailed viewing, and investigations. The secondary stream is usually used for camera grids, remote viewing, mobile clients, and sometimes certain types of detection. At this stage alone, ten cameras stop being ten units of load. They become at least twenty video streams living inside the system at the same time.
But even that is only the beginning. If the system includes multiple workstations, analytics, server-side recording, archive viewing, event delivery, cloud backup, or remote access, then the streams begin to multiply. The same video signal may be recorded, shown to an operator, used by an analytics module, delivered to a remote client, and saved as an alarm clip. Formally, there are still ten cameras. In practice, the system is already behaving like a small media complex.
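The stream arithmetic above can be sketched in a few lines. All numbers here are illustrative assumptions (a 4 Mbit/s main stream, a 1 Mbit/s secondary stream, and one example set of consumers), not measurements from any particular system:

```python
# Rough stream and bandwidth estimate for a "ten camera" system.
# Bitrates are illustrative: main stream ~4 Mbit/s (recording, detail),
# secondary stream ~1 Mbit/s (grids, mobile, remote clients).

CAMERAS = 10
MAIN_MBIT = 4.0
SUB_MBIT = 1.0

# Example consumers of each camera's video:
# recording server and analytics pull the main stream;
# a local grid and one remote client pull the secondary stream.
consumers_main = 2
consumers_sub = 2

streams_total = CAMERAS * (consumers_main + consumers_sub)
bandwidth_mbit = CAMERAS * (consumers_main * MAIN_MBIT + consumers_sub * SUB_MBIT)

print(f"Logical streams in flight: {streams_total}")        # 40
print(f"Aggregate bandwidth: {bandwidth_mbit:.0f} Mbit/s")  # 100
```

Even this modest scenario turns "ten cameras" into forty concurrent streams, which is exactly the gap between the spec sheet and the real load.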
That is why a modern VMS must be flexible not only in terms of the number of cameras, but also in terms of deployment scale. The same software platform today must work just as confidently in a simple home, office, or small shop setup as it does in a large multi-user, multi-server system with multiple sites, distributed recording, remote workstations, and advanced analytics. That is one of the key differences between a mature platform and “software for viewing cameras.”
For example, SmartVision can be used both for compact local installations and for more complex multi-server systems. And this is not limited to classic video analytics. In addition to recognizing objects, faces, license plates, smoke, fire, and other standard scenarios, the system also supports audio analytics, can recognize more than 500 types of sounds, and can transcribe speech in video. This means that the architecture must take into account not only video streams, but also additional layers of event processing, notifications, recording, and system response. In other words, modern video surveillance no longer works only with images. More and more often, it works as a system for interpreting what is happening based on image, sound, and a set of response rules.
Because of this, the most dangerous design mistake is not choosing the wrong camera or the wrong server. The most dangerous mistake is oversimplification. When a designer counts only cameras rather than the real video paths, they are not designing a system. They are designing a hope that everything will not be used too actively at the same time.
Where Design Really Has to Begin
The correct logic is fairly simple. First, you need to understand what types of streams exist in the system. Then determine exactly who consumes them. After that, you break the load down by segments: from the camera to the recording server, from the server to the workstations, between servers, to analytics, to remote users, and to external services. Only then should you choose hardware, network speed, server types, and the degree of distribution.
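The sequence above can be made concrete as a per-segment load budget. The segment names, stream counts, and bitrates below are hypothetical, chosen only to show the shape of the calculation:

```python
# Hypothetical per-segment bandwidth budget for a 20-camera system.
# Each entry: (segment, number of streams crossing it, Mbit/s per stream).
segments = [
    ("cameras -> recording server",      20, 4.0),  # all main streams
    ("recording server -> workstations",  8, 1.0),  # sub streams for grids
    ("recording server -> analytics",     6, 4.0),  # main streams analyzed
    ("recording server -> remote users",  4, 1.0),  # sub streams over WAN
]

for name, streams, mbit in segments:
    print(f"{name:36s} {streams:3d} streams  {streams * mbit:6.1f} Mbit/s")

total_mbit = sum(streams * mbit for _, streams, mbit in segments)
print(f"{'total':36s}              {total_mbit:6.1f} Mbit/s")
```

A table like this, built before any hardware is chosen, shows immediately which link carries the real load and which server role needs headroom.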
This matters because problems in video surveillance do not arise only from a lack of computing power. Very often, the weak point is the structure of the system itself. Streams follow unnecessary paths, multiple clients pull video directly from the camera, analytics reads the stream separately, the recording server and the viewing server compete for the same resources, and then suddenly everything together runs into network limits, disk performance, or decoding capacity. And at that point it becomes clear that the most expensive item in the project is usually the architectural mistake that was noticed too late.
Small System: Home, Office, Small Facility, 5 to 15 Cameras
For small systems, a simple and clean architecture is almost always the best solution. One good server or one powerful PC running the VMS, a solid local network, one decent switch with headroom, recording the main stream, and using the secondary stream for camera grids and remote viewing. At this scale, the main virtue of the system is not sophistication, but predictability.
If there are only a few users, if remote access is occasional, and if analytics is either absent or limited to basic scenarios, then splitting the system across many servers is not cost-effective. It increases complexity, maintenance, and the number of failure points, but brings no real savings. Put simply, for five to ten cameras, a dedicated restream server is about as necessary as an excavator for repotting a flower. Formally possible, practically odd.
In a small system, money is usually better spent elsewhere. On decent cameras rather than the cheapest ones. On a quality archive drive. On extra memory. On a quiet and reliable server. On proper bitrate tuning and recording with sensible profiles. On convenient remote access that goes through the VMS rather than directly into the camera. That is what gives the best price-to-performance result.
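Sensible archive sizing is one of the cheapest wins at this scale. A minimal sketch, assuming 10 cameras recording a 4 Mbit/s main stream continuously for 30 days (your bitrates and retention will differ):

```python
# Rough archive sizing for a small system. All inputs are assumptions.
cameras = 10
bitrate_mbit = 4.0   # main stream, continuous recording
days = 30            # required retention

seconds = days * 24 * 3600
# 1 Mbit = 1e6 bits = 125_000 bytes
bytes_total = cameras * bitrate_mbit * 125_000 * seconds
terabytes = bytes_total / 1e12

print(f"Archive needed: {terabytes:.1f} TB")  # ~13 TB before filesystem overhead
```

Running the same numbers with motion-triggered recording or a lower bitrate profile shows why tuning streams is often worth more than buying a bigger disk.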
A Local System Without a Complicated External Life
There are sites where video surveillance must work autonomously. One office, a small warehouse, a private house, a local network, archive stored internally, remote connections are rare. Here, the architecture should be as short as possible. The camera sends the main stream for recording, the secondary stream for viewing, and the rest of the logic should not create extra copies of video unless absolutely necessary.
These systems make one thing especially clear: not every problem is solved by adding another server. If traffic inside the site is organized logically, if streams are not duplicated without reason, if clients do not connect directly to cameras, and if recording and viewing go through one clear point, then even a modestly sized system runs calmly. But if the architecture is sloppy, then even with ten cameras you can get the feeling that the infrastructure is already nearing retirement.
Mid-Sized System: 15 to 40 Cameras and Several Users
When the number of cameras grows and there are several users, the system moves into another league. Now you have to think not only about recording, but also about how to deliver video to clients without putting unnecessary pressure on the cameras. If several workstations simultaneously open camera grids, archives, alarms, and full-screen views, then the camera should not turn into the source of many independent connections. That is a bad habit best stopped at the architecture stage.
For this range, the most effective design is usually one main recording and management server, plus one clear rule: clients should receive both live video and archive through the system, not through direct connections to each individual camera. This reduces chaos, simplifies control, makes system behavior more predictable, and gives a clearer picture of where the load actually appears.
If such a system has a lot of remote connections, it is already worth thinking about separating roles. There is no need to multiply servers just for appearance, but logically separating recording, user access, and analytics can be very useful. Especially if the site does not operate on the model of “one guard staring at one monitor,” but as a fully active system with several workstations, alarm events, and constant switching between operating modes.
Large System: 40 to 80 Cameras and Several Usage Zones
In the range from forty to eighty cameras, architecture starts to matter more than the raw power of one server. A common mistake in such projects is trying to compensate for poor stream organization with expensive hardware. It works, but it is not very smart. If all video is pulled into one point and then distributed from that same point to everyone, you can buy a very powerful server and get an impressive specification sheet. Or you can first reduce unnecessary stream movement and get the same practical benefit for less money.
If the site consists of several zones, buildings, or locations, it is often more efficient to process and record video closer to the source, and then transfer only what is actually needed: live viewing, events, metadata, alarm clips, and specific archive requests. This becomes especially important where remote workstations and external communication channels are a permanent part of the system’s life rather than a rare exception.
At this stage of design, the answer to the question “what is more cost-effective, a faster network or additional servers” usually sounds like this: inside one site, a good network is almost always the better investment, while between sites, proper role distribution and local recording are more effective. Otherwise, the project starts to resemble a warehouse management problem being solved not by better planning, but by buying a larger truck.
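The inter-site part of that tradeoff is easy to quantify. A hypothetical comparison for a two-building site (camera count, bitrates, and viewer count are all assumptions for illustration):

```python
# Option A: all 60 cameras send 4 Mbit/s main streams to one central server.
# Option B: each building records locally; only 6 live views (1 Mbit/s sub
# streams) plus events and metadata (~0.1 Mbit/s total) cross the link.

cameras, main_mbit, sub_mbit = 60, 4.0, 1.0

option_a = cameras * main_mbit        # constant load on the inter-building link
option_b = 6 * sub_mbit + 0.1         # steady-state load with local recording

print(f"Centralized recording: {option_a:.0f} Mbit/s on the backbone")
print(f"Local recording:       {option_b:.1f} Mbit/s on the backbone")
```

The roughly fortyfold difference is the whole argument for recording close to the source when sites are connected by anything less than a generous dedicated link.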
System with 80 to 150 Cameras: Multi-User, Multi-Server, with Analytics
When there are many cameras, many users, advanced analytics, and several remote workstations, the idea of one central server starts looking elegant only until the first serious expansion. Yes, one very powerful server can support a substantial system. But the question is not only whether it can. The question is whether it is cost-effective, convenient, and resilient in real operation.
One big server is attractive because of its simplicity at the start. But it has obvious weaknesses. It becomes a single point of overload. It becomes a node whose upgrade is expensive and unpleasant. It makes the system less flexible. And the more roles are piled onto it, the more it starts to resemble an employee who should have been given assistants long ago, but instead just got a bigger desk.
In large systems, separation by function is usually more effective. Separate nodes for recording, separate ones for heavy analytics, separate ones for client access, separate ones for distributed sites. But that does not mean you should scatter a lot of very cheap servers around the site handling five to ten cameras each and call that an economic victory. That design only makes sense where there is a clear reason: separate buildings, separate responsibility zones, local analytics near the cameras, autonomy requirements, poor connectivity between areas, or independent security zones.
If you simply replace one large server with a pile of cheap machines, it is very easy to end up with a system that looks cheaper at the start but costs more in maintenance, updates, redundancy, diagnostics, and support. That is why in large projects, the best solution is usually somewhere in the middle. Not one monster server, and not a zoo of small boxes, but several mid-range or high-end nodes, each performing a clear role.
What Is More Cost-Effective: A Faster Network or Additional Restream Servers?
This is one of the most common questions, and the honest answer is conditional: if the problem is inside one site, a better network is usually the better investment. If the problem is a distributed structure with many remote consumers, then properly separating roles and streams is usually the better approach.
A faster and properly organized network is useful almost all the time. If the system has weak backbone links, bottlenecks between cameras, the server, and workstations, and all critical streams converge in one narrow point, then additional servers do not solve the root problem. They just add extra stops to an already overloaded road.
On the other hand, if the system is distributed, if there are many remote users, and if the same streams must serve different sites and different roles, then separate server nodes or dedicated delivery roles may be justified. But their purpose is not to replace a good network. Their purpose is to avoid sending extra video where it is not constantly needed.
In other words, the right answer almost never sounds like “either network or servers.” First, remove meaningless stream movement. Then provide a solid network. Only after that should you add server roles where they truly reduce load and make the system more stable.
One Powerful Central Server or Many Inexpensive Servers
For small and mid-sized systems, one powerful server is usually more cost-effective. It is simpler, clearer, cheaper to maintain, easier to back up at the image and configuration level, and does not require supporting a whole collection of separate nodes.
But as the system grows, the picture changes. When advanced analytics, several independent zones, remote workstations, and high user activity appear, several mid-range servers often become a more sensible choice. They allow gradual scaling, make it easier to distribute load, simplify problem isolation, and avoid turning every upgrade into a separate engineering event complete with nerves and night shifts.
Still, there is a limit to common sense. Many inexpensive servers work effectively only when there is a clear logic behind them. Without that logic, the system starts losing against itself. Savings on hardware are quickly eaten up by support overhead. It is like buying not one good truck, but six old minivans and telling yourself every day that this is somehow more exciting.
The Role of Video Analytics in Choosing the Architecture
Video analytics changes everything faster than many projects account for. As long as the system only records archive and displays camera grids, it lives by one model. As soon as facial recognition, license plate recognition, object recognition, behavioral scenarios, metadata search, alarm rules, and automated actions appear, the architecture immediately becomes far more demanding.
The main question here is not “is there analytics or not,” but “which stream does it use, and where exactly is it performed.” If analytics takes the main high-resolution stream, requirements rise sharply. If it runs on a separate server, the load shifts there. If part of the analytics runs close to the source and only events and results are sent to the system, the load may be lower. But then the requirements for the cameras or edge devices themselves become higher.
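The difference between analyzing the main stream and the secondary stream is easy to underestimate. Pixel throughput is a rough proxy for decode and inference cost; the resolutions and frame rates below are illustrative assumptions:

```python
# Decode load comparison: analytics on the main stream vs the sub stream.
# Pixels per second is a crude but useful proxy for decode + inference cost.

def pixel_rate(width, height, fps):
    """Pixels per second delivered by one stream."""
    return width * height * fps

main = pixel_rate(2560, 1440, 25)  # e.g. 1440p main stream at 25 fps
sub = pixel_rate(640, 360, 12)     # e.g. low-res sub stream at 12 fps

print(f"Main stream: {main / 1e6:.0f} Mpx/s")
print(f"Sub stream:  {sub / 1e6:.1f} Mpx/s")
print(f"Ratio: ~{main / sub:.0f}x more work per camera")
```

A thirtyfold difference per camera is exactly why "which stream does the analytics read" must be decided before the server is sized, not after.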
That is why for systems with advanced analytics, it is almost always better to design role separation from the start. There is no need to build unnecessary complexity, but it is definitely unwise to throw recording, analytics, multi-user access, and remote connections into one point just because that is easier to draw on a diagram.
Where the Best Price-to-Performance Balance Is Usually Found
For a home, a small office, or a small site, the sweet spot almost always lies in a simple local design: one quality server, a good network, reasonable disk headroom, and sensible stream configuration.
For the 15 to 40 camera range, the sweet spot is usually a clean centralized system with server-side stream delivery rather than direct client connections to cameras, plus a clear understanding of where separate load from remote users and analytics begins.
For the 40 to 80 camera range, the sweet spot is already tied to architecture: local processing where possible, sending outward only what is needed, separating roles, and avoiding unnecessary video movement inside the system.
For 80 to 150 cameras and beyond, the sweet spot is almost never reached with one extremely powerful server, nor with a crowd of cheap nodes, but with several thoughtfully distributed servers, each handling its own clear function and its own part of the stream load.
What Really Matters in the End
A good video surveillance system is designed from streams, not from cameras. Not from a nice equipment list, but from video paths. Not from a desire to simplify the diagram, but from an understanding of where real load actually appears. If you first organize stream movement properly, then provide a solid network, then separate server roles, and only after that choose the required level of computing power, the system usually turns out both stable and economically sensible.
And the opposite is also true. If you first buy very powerful hardware and only then try to make it digest video that should never have passed through that point in the first place, the project becomes an expensive way of hiding an architectural mistake.
Modern video surveillance has long stopped being just an archive with cameras. It is a system where video is recorded, displayed, analyzed, transmitted, compared, involved in automation, and used by many different scenarios at the same time. That is why the main design question is no longer “how many cameras.” The real question sounds different: “how many streams actually live in the system at the same time, who uses them, and how intelligently are they organized?”