EEVblog Electronics Community Forum
General => General Technical Chat => Topic started by: mikeselectricstuff on December 11, 2024, 11:16:40 am
-
I'm looking at a new project which will have a large number (100-200) of nodes (probably based on a RasPi compute module) that need to be linked with ethernet.
For various reasons, it's not practical to have each one going to a switch, we ideally want them in a number of series chains.
I'm looking at each node having a switch chip integrated, e.g.
https://www.lcsc.com/product-detail/span-style-background-color-ff0-Ethernet-span-Switches_IC-Plus-IP175G_C80220.html
(as it's the cheapest documented switch chip I could immediately find)
This onboard switch would have 3 ports - one to the node, and an in & out connecting to adjacent nodes. Possibly more to allow branching.
We'd probably be looking at a multi-level layout, starting with a gbit switch, then maybe ten 100mbit feeds from there, each feeding a chain of 10-20 nodes
We don't care much about bandwidth; minimum cost and cabling is the priority. It's a closed system so the host can deal intelligently with the bandwidth available, e.g. sequencing across the gbit switch ports to maximise use of the 100M links from the 1G feed.
Protocol would probably just be UDP. If we need broadcast, this would only be from the host to the nodes.
If necessary we can arrange for nodes to be powered up sequentially to reduce any flood of traffic at startup.
Nodes would most likely have fixed IP addresses so we shouldn't need DHCP
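To make the sequencing idea above concrete, here's a rough host-side sketch in Python; the chain layout, addresses, port, and chunk size are all illustrative assumptions, not part of the actual design:

```python
import socket

# Hypothetical layout: ten chains of 15 nodes each, fed from a gigabit
# root switch. All addresses here are made-up placeholders.
CHAINS = [[f"192.168.{c}.{n}" for n in range(1, 16)] for c in range(10)]
PORT = 5000        # arbitrary UDP port for host -> node traffic
CHUNK = 1400       # keep each datagram under the Ethernet MTU

def schedule(chains):
    """Yield node addresses depth-first across chains, rotating through
    the chains on every step so consecutive datagrams leave on different
    100M links while the 1G uplink stays busy."""
    for depth in range(max(len(c) for c in chains)):
        for chain in chains:
            if depth < len(chain):
                yield chain[depth]

def send_all(payload):
    """Push one payload chunk to every node using the interleaved schedule."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    for addr in schedule(CHAINS):
        sock.sendto(payload[:CHUNK], (addr, PORT))
```

The point is just that interleaving sends across chains lets the 1G uplink run near capacity while each 100M chain only ever sees its own share.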
Is there anything intrinsic to ethernet preventing a series string of, say, 15 switches working reliably ?
-
I vaguely remember a number of 4 maximum chained switches, from a very thick book about networks and Windows NT servers, back in the 90's. It had something to do with propagation and delays/lag. Those were certainly not Gbit switches, I guess 100Mbit, if not 10, can't say. I might still have the book somewhere.
For 100+ nodes I would rather opt for serial over RS-485 if the application allows that, or maybe use ready made DMX equipment.
My bet would be 100:1 that 15 chained switches won't work at all, but I'm no network engineer, so take that 4-max-chained-switches figure as uncertain info. Will post a snapshot of that book page, but I don't promise I can still find the book.
-
I don't know any reason there would be a limit at all for switches, rather than hubs.
If someone knows, please explain.
-
I vaguely remember a number of 4 maximum chained switches, from a very thick book about networks and Windows NT servers, back in the 90's. It had something to do with propagation and delays/lag. Those were certainly not Gbit switches, I guess 100Mbit, if not 10, can't say. I might still have the book somewhere.
For 100+ nodes I would rather opt for serial over RS-485 if the application allows that, or maybe use ready made DMX equipment.
My bet would be 100:1 that 15 chained switches won't work at all, but I'm no network engineer, so take that 4-max-chained-switches figure as uncertain info. Will post a snapshot of that book page, but I don't promise I can still find the book.
I know there are limits on hubs due to collision detection, but those aren't really a thing any more. AIUI switches are little different from normal nodes. I suspect the main issue is how they behave when congestion starts to occur
-
Indeed, I've tried to reconstruct the reason for why it wouldn't work, and can not come up with any explanation (particularly for UDP). The limit of 4 might have been about hubs, sorry. Please ignore my previous post.
-
The only limit that springs to mind would be the TTL (time-to-live) field in the Ethernet header.
-
Is there anything intrinsic to ethernet preventing a series string of, say, 15 switches working reliably ?
Technically, I see no reason why it should not work with switched Ethernet. Essentially, this is what is done in PROFINET or EtherCAT line topologies in industrial automation (they are often built as a ring, but this ring is opened at one hop by using some kind of spanning tree protocol). Spanning tree protocols themselves might impose a hop limit, but perhaps you would not want to use them in your scenario anyway. Keep in mind that the latency increases with the line length, since each switch must receive a packet on an ingress port completely (store-and-forward switching) or partially (header, for cut-through switching) before it can forward it to an egress port.
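To put a rough number on that latency growth: a back-of-envelope sketch in Python, assuming full-size frames on 100 Mbit links (the figures are illustrative only):

```python
# Each store-and-forward switch must receive a frame completely before
# forwarding it, so the serialisation delay accumulates once per hop.
FRAME_BITS = 1518 * 8   # maximum standard Ethernet frame, incl. FCS
LINK_BPS = 100e6        # 100BASE-TX line rate

def chain_latency_us(hops, frame_bits=FRAME_BITS, bps=LINK_BPS):
    """Cumulative store-and-forward serialisation delay in microseconds."""
    return hops * (frame_bits / bps) * 1e6
```

For a 15-switch chain this works out to roughly 1.8 ms of pure serialisation delay, before any queuing; cut-through switching would shrink the per-hop term to the header portion only.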
-
So considering our company network core, there can be, in the worst case, 8 hops between two clients in the same layer 2 network:
Unmanaged small switch -> Edge Switch -> Aggregation Switch -> "Outer" Core Switch -> "Inner" Core Switch and the reverse if the other client is in another building on our campus.
The number of clients in the same subnet is more important. I would strongly suggest keeping your subnets at most at /24, so at most 253 clients in the same network segment.
Also, small switches often have limited MAC caches, getting into trouble if too many clients are connected.
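As a side note, the subnet arithmetic is easy to sanity-check with Python's stdlib (the prefix below is just an example): a /24 has 254 usable host addresses, which matches the 253-client figure once one address is reserved for the gateway.

```python
import ipaddress

# hosts() excludes the network and broadcast addresses automatically.
net = ipaddress.ip_network("192.168.1.0/24")   # example prefix, an assumption
usable = len(list(net.hosts()))                # 254 usable addresses
clients = usable - 1                           # minus one for the gateway
```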
-
There was a limit for the spanning tree protocol (STP), too, which AFAIK is used by the switches to learn and then find the path to each MAC. Again, I'm talking about things I don't master and I'm not sure if that includes loops or not. There is a limit caused by STP at 7 chained switches. https://community.cisco.com/t5/switching/802-1d-spanning-tree-7-hops-limitation/td-p/763269
Not sure if STP is the same as RSTP (Rapid STP), which might be more widespread nowadays.
-
Also, small switches often have limited MAC caches, getting into trouble if too many clients are connected.
That limited MAC table in small switches might be a problem, too, now I recall bumping into that once, during 2010 or so.
On second thought, if you use consumer grade equipment, then why not just use routers instead of switches (if the application allows)? Consumer grade routers are not that expensive, and they should have no problem when chained.
-
If they are pure MAC switches then I can't see a limitation - you've excluded performance issues so it should be okay.
If the switches have any intelligence you'll want to make sure they handle deep subnets - I know some consumer switches balk at anything more than /24 routing.
-
The only limit that springs to mind would be the TTL (time-to-live) field in the Ethernet header.
Isn't that an IP thing, as opposed to Ethernet ?
-
Is there anything intrinsic to ethernet preventing a series string of, say, 15 switches working reliably ?
Technically, I see no reason why it should not work with switched Ethernet. Essentially, this is what is done in PROFINET or EtherCAT line topologies in industrial automation (they are often built as a ring, but this ring is opened at one hop by using some kind of spanning tree protocol). Spanning tree protocols themselves might impose a hop limit, but perhaps you would not want to use them in your scenario anyway. Keep in mind that the latency increases with the line length, since each switch must receive a packet on an ingress port completely (store-and-forward switching) or partially (header, for cut-through switching) before it can forward it to an egress port.
We don't care at all about latency, though appreciate it may be a factor in optimising overall throughput.
-
Nah, switches all "store and forward". They don't care where it has come from or where it is going to unless they watch the MAC addresses. My experience with small switches is if you overflow the MAC table they just broadcast. Messy but still works.
-
Also, small switches often have limited MAC caches, getting into trouble if too many clients are connected.
That limited MAC table in small switches might be a problem, too, now I recall bumping into that once, during 2010 or so.
On second thought, if you use consumer grade equipment, then why not just use routers instead of switches (if the application allows)? Consumer grade routers are not that expensive, and they should have no problem when chained.
The switch chip I'm looking at has 1K MAC storage, so plenty.
We're integrating the switch into our own hardware, so can potentially tweak parameters, most of which I don't yet understand...
BTW nodes will never need to talk to other nodes ( e.g. through the top gbit switch), it's only ever host <->node, so MAC table size shouldn't be an issue
-
I have seen something similar used in voip phones, but there it is only intended to have a single ethernet cable to a desk and connect the computer to the phone output.
When I still had my business, I had a phone system in my house and used Grandstream VoIP phones for it. After a thunderstorm some died on me and I opened one up, to discover a RTL8305SC 5 port switch IC with only three of the ports used. Two RJ45 ports on the back and one internally hard wired to a RTL8019AS ethernet controller IC. Still have it in a box for future use, if ever. :-DD
Never tried hooking them up in a series chain, but do not see a reason why it would not work. As far as I know the switch will pass through all data that does not match the MAC address of the internally connected port.
For a long chain, the last device will have to wait for its data to pass through the most, but as you wrote that is not of much importance for your application.
If you like and I can find the power adapters for the phones, I can test it with maybe up to 8 of them to see if data comes through to the last device. (Not sure how many still working ones I have.)
-
Spanning tree / rapid spanning tree protocols are used for loop detection. You only need them if you think it's likely someone might connect the ends of two chains together and you need to continue operating in that situation. If so, you will need to enable (R)STP and check its configuration parameters for the maximum diameter. You would also want to configure to make sure your gigabit switch is the root of the tree. But chances are you don't need STP.
In your application MAC tables filling up is only really a problem for your root switch. If the table fills up then it will start broadcasting into each of the 100 megabit chains. That could lower your system bandwidth to only 100 megabit.
In the chains it doesn't really matter since the total chain can only use 100 megabit anyway.
The main weakness of a system like this is that a fault on any node disables all the downstream nodes. If the switches are integrated into the nodes you can't power cycle or replace a node without breaking the chain. If that's something you can work around it should work fine.
-
Spanning tree / rapid spanning tree protocols are used for loop detection. You only need them if you think it's likely someone might connect the ends of two chains together and you need to continue operating in that situation. If so, you will need to enable (R)STP and check its configuration parameters for the maximum diameter. You would also want to configure to make sure your gigabit switch is the root of the tree. But chances are you don't need STP.
In your application MAC tables filling up is only really a problem for your root switch. If the table fills up then it will start broadcasting into each of the 100 megabit chains. That could lower your system bandwidth to only 100 megabit.
In the chains it doesn't really matter since the total chain can only use 100 megabit anyway.
The main weakness of a system like this is that a fault on any node disables all the downstream nodes. If the switches are integrated into the nodes you can't power cycle or replace a node without breaking the chain. If that's something you can work around it should work fine.
I doubt 200-odd MACs would be an issue for most gbit switches, and we don't care if we need a fancier one as it's a tiny part of the system cost.
We're now probably looking at having the switches on a backplane that 3-4 nodes plug into, which both avoids the issue of powering nodes down, and reduces the number of switches by a factor of 3-4.
Next step - see how far I can abuse the PHYs to minimise magnetics - pretty sure I can have one transformer between the node and the switch, maybe none at all - they are all on the same supply, and only about 100mm away from each other...
-
There's no simple answer. It mainly depends on protocols used and timing constraints of the ethernet protocol itself and switch implementations. Some already mentioned STP which has a limit of 7 switches stacked. And it's not recommended to go much further, even with STP disabled. One solution would be to create groups with max 7 nodes stacked (including the central switch). Or you could use a lightweight routing protocol and run all nodes as routers.
-
Also since you are using UDP it's important that your communication is bidirectional. It's the replies that let the root switch learn the MAC table mappings. Without this it will have to broadcast everything and limit system throughput.
The initial ARP lookup is sufficient to fill the table, but if you don't get regular replies it eventually times out.
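A minimal node-side keepalive along those lines might look like this (Python sketch; the host address, port, and interval are assumptions - the only requirement is that each node sends *something* upstream well inside the switches' MAC ageing time, typically around 300 s):

```python
import socket
import time

HOST_ADDR = ("192.168.0.1", 5001)  # hypothetical host address and port
INTERVAL = 60                      # seconds, well under ~300 s MAC ageing

def keepalive_once(sock):
    """Send one tiny datagram upstream; the payload is irrelevant - what
    matters is the source MAC refreshing the table in every switch it
    passes through, so the root keeps unicasting instead of flooding."""
    sock.sendto(b"ka", HOST_ADDR)

def run(iterations):
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    for _ in range(iterations):
        keepalive_once(sock)
        time.sleep(INTERVAL)
```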
-
I doubt 200-odd MACs would be an issue for most gbit switches, and we don't care if we need a fancier one as it's a tiny part of the system cost.
Yeah, with 200 MACs you should be totally fine.
We're now probably looking at having the switches on a backplane that 3-4 nodes plug into, which both avoids the issue of powering nodes down, and reduces the number of switches by a factor of 3-4.
Next step - see how far I can abuse the PHYs to minimise magnetics - pretty sure I can have one transformer between the node and the switch, maybe none at all - they are all on the same supply, and only about 100mm away from each other...
I've used transformerless ethernet with no problem on a single PCB at distances significantly greater than 100 mm. I think it should work fine with a backplane connector as long as you have a solid ground connection and controlled impedance traces.
-
Next step - see how far I can abuse the PHYs to minimise magnetics - pretty sure I can have one transformer between the node and the switch, maybe none at all - they are all on the same supply, and only about 100mm away from each other...
The Grandstream phones I wrote about use magnetics between the switch and the internal port. The one that had problems after a thunderstorm still works processor wise, and I tried to remove the switch from the chain to see if it was willing to communicate, but unfortunately it did not. Maybe due to the crappy wiring without twists or maybe the RTL8019 chip is also gone. Have not played with it after that first test and it has been in its box for more than 10 years.
The idea of a back plane where more modules are connected to a single switch makes a lot of sense. With the switch having 5 ports you need 50 of them when you use 4 ports for the nodes and 1 port for the connection to the main switch, but with a 48 port main switch you fall a bit short on nodes.
Not cheap, but maybe an eight port switch ic might solve some of the issues. https://eu.mouser.com/datasheet/2/268/ks8997-3443654.pdf
With 7 nodes per switch it reduces the count and with a 24 port main switch you can handle 168 nodes.
-
The only limit that springs to mind would be the TTL (time-to-live) field in the Ethernet header.
Isn't that an IP thing, as opposed to Ethernet ?
Quite possibly. It's been a while!
-
Not cheap, but maybe an eight port switch ic might solve some of the issues. https://eu.mouser.com/datasheet/2/268/ks8997-3443654.pdf
With 7 nodes per switch it reduces the count and with a 24 port main switch you can handle 168 nodes.
Cheap 8 port switch :
https://www.lcsc.com/product-detail/span-style-background-color-ff0-Ethernet-span-Switches_IC-Plus-IP178G_C703544.html
The number of nodes per switch will probably end up coming from mechanical/board size constraints - the nodes are 4" wide, and I'm looking at doing thin PCB strips as a backplane, so 3-4 nodes per switch to keep PCB length sensible, though a 2-row layout with 6 per backplane is a possibility
-
No issue with Ethernet switches in series. They are store-and-forward devices. Presuming they will be layer 2, so all they care about is the MAC addresses and ARPs.
A UDP broadcast will ripple down the chain, or up and down the chain from a device in the middle. Handily it will be handled by the switch hardware with no CPU involvement, though the switch will forward the packet onto your processor as it's a broadcast.
-
how about something nuts, a leaky coax and wifi?
-
I'm looking at a new project which will have a large number (100-200) of nodes (probably based on a RasPi compute module) that need to be linked with ethernet.
For various reasons, it's not practical to have each one going to a switch, we ideally want them in a number of series chains.
I'm looking at each node having a switch chip integrated, e.g.
https://www.lcsc.com/product-detail/span-style-background-color-ff0-Ethernet-span-Switches_IC-Plus-IP175G_C80220.html
(as it's the cheapest documented switch chip I could immediately find)
This onboard switch would have 3 ports - one to the node, and an in & out connecting to adjacent nodes. Possibly more to allow branching.
As others mentioned: a standard (store and forward) ethernet switch doesn't have a real limit beyond the MAC learning table.
About the IP175G: this chip exists in many incarnations where each new version seems to be a die-shrink and has some new features. I have used the C version a long time ago and looked into using F/L versions but didn't go through with it due to chip shortage. The only IP175s I managed to find were desoldered ones. I strongly recommend contacting an IC-plus distributor for additional information. There are some interesting non-public appnotes available (including EMC compliance). I also recall the power consumption is rather high so a heatsink might be needed. One of the interesting features is that the MAC on this chip works down to DC so you can bit-bang the MAC interface from a microcontroller. Or use an FPGA without being picky about the frequency used for the MAC interface. The IP175 takes care of all the clock domain crossing & buffering issues.
I ended up using the LAN9303 from Microchip. This chip has 2 integrated phys and 1 MAC so one of the ports will need to interface with a MAC and the MAC interface on the LAN9303 needs to see the right frequency otherwise it operates in a different mode.
-
[old guy mode=enabled]
I worked at DEC in the mid 80's up to the early 90's. networking was my thing. 10base5 was a thing and 'thickwire' with vampire taps was a thing ;)
anyway, iirc, there was a limit to the number of repeaters and then bridges. there were no 'switches' back then; it was called a bridge and you'd be lucky if it came close to wire-speed (even at 10mb, I dont think the dec gear could run at 100% link speed).
there were 3 repeaters in series and then no more. unless the next was a bridge and then that 'reset' the repeater count. then there was the question about how many bridges. I think the number at the time was 7. and you could have some repeaters between those bridges (max of 3, again).
for a long time, these were the layer1 and layer2 rules. had to do with round trip time (ttl) and end to end delays. the 'jam signal' had to be seen by all and that was really only for true broadcast domains (layer1 repeated hubs). today you cant even find a 'hub' if you try; everything, even $5 multiport boxes are switches. (I used to like having real hubs as you could do port sniffing easier; now we have port-mirroring but its still NOT the same thing).
for today, I do think the limits are nowhere near the same but there are still limits. things will break if you slowly extend latency and transit delay. they'll work until they dont. (whose law is that?)
time for the old man to nap. later...
-
switches are not always store-forward
there's also 'cut-thru' and you may not be able to know which is which.
smaller consumer boxes are often bridges, so that means s-f.
but high density data center switches can also be cut-thru.
these days, there might even be strange hybrids of the two.
networking companies have run out of ideas (about 30 years ago) so they all try to break^Hchange things in special ways.
but 'switch' can mean many things.
-
[old guy mode=enabled]
I worked at DEC in the mid 80's up to the early 90's. networking was my thing. 10base5 was a thing and 'thickwire' with vampire taps was a thing ;)
Old ... pffft.
DEC was my dream when I was at university, using the 11/34, 11/70, 11/780.
But then I got a job and they had a DG Eclipse MV/10000 and I used those for a decade, ending on MV/2000. Actually, the job started as a summer holiday job. The first task was to decide what compiler my employer should buy -- and what database -- as they'd never had an in-house programmer. I spent the first two weeks (in December 1984) in the local Data General office playing with the COBOL, FORTRAN, PL/I compilers and DG/SQL and some non-relational database. I advised my summer employer to get the PL/I and the SQL -- which was already quite advanced, with precompiled queries, proper use of PL/I variables in the compiled query (not just textual interpolation like so many things do today), and also proper referential integrity.
Data General actually offered me a job at the end of that two weeks, but I was already committed to the financial company I was evaluating the stuff for.
I'd previously spent the summer two years earlier (82/3) doing COBOL on a Pr1me as the only in-house programmer in a small city council ... somewhat under the wing of programmers at the bureau that owned the Pr1me. That was at the end of my 2nd year at university. I did 9-5 programming at the city council and then worked around 6 pm to midnight picking up hay bales (by hand, on to a truck) and carting them to the barn. Oh, the endless energy at 20 years old!
So, yeah ... you can't possibly be old, because I'm not.
-
switches are not always store-forward
there's also 'cut-thru' and you may not be able to know which is which.
Several people have mentioned "store-and-forward" but it's not relevant in this question at all.
Store and forward means that the switch waits for the entire packet before it begins transmitting. Cut-through switches sometimes (but not always) begin transmitting before the packet is completely received, shortly after the headers are received. Other than shaving a couple of microseconds of latency, it doesn't really affect how it works at a network level.
-
Is there anything intrinsic to ethernet preventing a series string of, say, 15 switches working reliably?
No problem at all in theory. Just the loss of reliability from added failure points.
I've seen various device to device networks with realtime audio or video (so dedicated bandwidth, and not over shared paths) happily work with sustained high bandwidth and low latency demands in 6-8 deep layers of switches. From what you've described it should be no problem going for dozens of hops.
Also recall you can get a handle on this from internet traceroute. I'm already 6 hops to leave the physical site, and ending at a total of 15-20 hops to some website endpoints. Loading a website across that still works fine.
-
Also recall you can get a handle on this from internet traceroute. I'm already 6 hops to leave the physical site, and ending at a total of 15-20 hops to some website endpoints. Loading a website across that still works fine.
That's IP, not Ethernet. But the links between the routers are often ethernet.
-
Store and forward means that the switch waits for the entire packet before it begins transmitting. Cut-through switches sometimes (but not always) begin transmitting before the packet is completely received, shortly after the headers are received.
They will fall back to store-and-forward if, for example, the link speed of the output port is lower than the link speed of the input port.
Other than shaving a couple microseconds of latency, it doesn't really affect how it works at a network level.
One drawback is the forwarding of bad frames. Only in store-and-forward mode can the switch check the FCS at the end of the frame and discard the frame if it's bad. In most cases it doesn't matter much.
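For illustration, the FCS a store-and-forward switch verifies is a CRC-32 over the frame from the destination MAC through the payload; a rough Python sketch (frame contents made up, and assuming the usual CRC-32 polynomial with the checksum appended least-significant byte first):

```python
import zlib

def append_fcs(body: bytes) -> bytes:
    """Append a 4-byte FCS as it would appear on the wire."""
    return body + zlib.crc32(body).to_bytes(4, "little")

def fcs_ok(frame: bytes) -> bool:
    """What a store-and-forward switch can do that cut-through can't:
    recompute the CRC over everything but the last 4 bytes and compare."""
    body, fcs = frame[:-4], frame[-4:]
    return zlib.crc32(body).to_bytes(4, "little") == fcs

frame = append_fcs(b"\xff" * 60)   # minimal-size dummy frame body
```

A cut-through switch has already forwarded most of the frame by the time the FCS arrives, so a corrupted frame propagates to the next hop.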
-
They will fall back to store-and-forward if, for example, the link speed of the output port is lower than the link speed of the input port.
With lower you could start transmitting right away and buffer the excess in a FIFO.
Higher is the tricky one ;)
In reality, I'm not familiar with such gear, but I wouldn't be surprised if they don't bother handling either.
-
I also don't think there is a hard limit, but YMMV eventually.
That said, when you look at other case studies where that kind of scale is reached readily, such as office buildings they often start out as a random collection of desktop/consumer grade switches across the floors and only gets to pro-level switching in the building core.
The company I started my career with fell into this state. They had 3 large floors, each floor did have a Cisco switch in the "core cabinet" where the vertical cabling went. However, the "glands" in the floor where cables passed down to power extensions and network switches typically had little "Net Gear" style switches.
It turned out when they finally got the worst case switching loop that created a broadcast storm disabling half the network, there were something like 60 switches in the building. With the majority of them being under the control of the dev teams and office workers. All it had taken was someone to connect a switch in a loop and without any fancy spanning tree and anti-looping it caused a cascade failure.
The solution was to rewire the floors so that ALL ethernets terminate back in the locked cabinet into 48 port cisco switches.
-
OK so looks like the answer is basically no.
The architecture has changed a bit now some more thought has been applied.
Now looking likely to be a few strings of maybe six to ten 8-port switches, each feeding 6 nodes. As we have control of the switch design there is scope for tweaking any of its parameters that might be useful to optimise things - still need to investigate this.
There will likely be a low-speed side-channel, probably RS485, linking our switch boards for various reasons, so we can remotely play with switch configs easily.
Thanks for the input everyone.
-
For a bit of random. In one cluster I worked on we used optical splitters and duplicated MAC addresses to distribute data to nodes. Layer 1 baby.
You might be surprised to find that 16/32/64 optical splitters are readily available. Mostly passive too.
-
I also don't think there is a hard limit, but YMMV eventually.
AFAIK, only the classic STP has a hard limit. Modern versions like MSTP don't, but it depends on the specific implementation. There are also proprietary modifications (-> vendor lock-in). An important point for larger networks (applies also to ethernet with any sort of STP) is convergence time. It will increase with complexity, depth, and number of network elements. Exceeding some network size the whole network can become unstable, i.e. a small change can cause havoc. The typical solution is to partition a large network into smaller ones.
It turned out when they finally got the worst case switching loop that created a broadcast storm disabling half the network, there were something like 60 switches in the building. With the majority of them being under the control of the dev teams and office workers. All it had taken was someone to connect a switch in a loop and without any fancy spanning tree and anti-looping it caused a cascade failure.
A classic! ;D
-
it does. cut thru can simply drop frames. see, layer 2 is not guaranteed delivery no matter what anyone tells you. there used to be llc2 and llc3 but again that's ancient. like x.25 ancient ;)
everything 'south' of the tcp layer can drop frames. hell, even udp can drop frames. its parent (whatever sits on udp at the time) has to do timeout and retries. if you sit on top of tcp, tcp does all that. but tcp does a LOT of work since it can assume layer2 and ip (3) can both drop pdu's and be allowed to, if it comes down to it.
if you can avoid dropping at lower layers, the upper layers have a 'smoother time'.
so yes, it kind of does matter if you store/forward or drop and let 'higher ups' resend.
-
also, some newcomers (like AVB bridging, aka 'time sensitive networking' or TSN) will need special hardware to get the guaranteed delivery *timing* (not just getting every frame there but getting them 'on schedule'), which will make things even worse or harder for network engineers.
in car networking, TSN is getting to be more and more of a thing. with the hope of having level3-5 'self drive' in the future, you need the next level of redundant and time-sensitive networking. in those cases, you dont just go adding more network nodes or switch/routers.
https://en.wikipedia.org/wiki/Time-Sensitive_Networking
for home networking, which is probably what the orig question was about, I still would not cross more than, say, 3 bridges or switches unless you really have to. and 3 layers should be more tree hier than anyone at home would need (3 layers of roll-up in the tree hier). at some point, you break things into subnets and route them, and btw, 'routers reset the count' (lol).
-
lol - a still unopened bag of 50ohm coax 'thinwire' (10base2) terminators.
(OT: it really *was* the coolest place to work. you put in a purchase req for anything in the DECdirect catalog and it just got expensed from one cost center to another. DEC was its own world, we even had our own telephone numbers called DTN (dec telephone network). I dont think there is anything like that, company wise, left anymore. we had our own set of satellites linking US, europe and what we called GIA for 'general international area')