As with everything show-related, timing is everything.
Let me use an analogy to the lighting world:
I know that when Ethernet DMX came out, there was a lot of fussing beforehand about how to ensure that the packets arrived on time and in sequence, since UDP over IP guarantees neither delivery nor ordering, and packets can and will show up late or out of order when collisions occur.
The end result was: yes, Ethernet DMX works fantastically. It is fast enough that even with packets dropped or coming in late, you didn't notice it much, so long as you (1) have a dedicated network and (2) have only one ‘host’. I.e., the DMX dimmers acted in a way like slaves, in that they would not generate their own traffic without requests. So with one PC (managing the network), one or two ‘desk consoles’ and a bunch of dimmers, everything is perfect. Change things around, add more desktops, and latency started to become apparent.
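To illustrate why a late packet does so little damage: a receiver only has to keep the newest frame and throw away anything older. Below is a rough Python sketch of that idea. The 8-bit wrapping sequence counter and the names `is_newer` and `LatestFrameFilter` are my own assumptions for illustration, not taken from any particular Ethernet DMX product.

```python
def is_newer(seq, last_seq, modulus=256):
    """Return True if seq is ahead of last_seq on a wrapping counter.

    Assumes an 8-bit sequence number (hypothetical); anything up to half
    the counter range ahead counts as newer, the rest as late.
    """
    diff = (seq - last_seq) % modulus
    return 0 < diff < modulus // 2


class LatestFrameFilter:
    """Keeps only the newest dimmer levels; late or duplicate packets are dropped."""

    def __init__(self):
        self.last_seq = None   # sequence number of the newest accepted packet
        self.levels = None     # channel levels carried by that packet

    def accept(self, seq, levels):
        """Store the packet if it is newer than what we have; report whether it was used."""
        if self.last_seq is None or is_newer(seq, self.last_seq):
            self.last_seq = seq
            self.levels = levels
            return True
        return False


rx = LatestFrameFilter()
rx.accept(1, b"\x10")
rx.accept(3, b"\x30")
print(rx.accept(2, b"\x20"))  # False: packet 2 arrived after packet 3, so it is ignored
print(rx.levels)              # b'0': the levels from packet 3 are still in effect
```

Since every packet carries the full current state of its channels, dropping a stale one costs nothing; the lights simply stay at the newest levels until the next packet lands.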
At least, that was the case on the DMX net I worked with. Obviously, replacing hubs with switches fixes just about all the problems. Again though, I can't stress enough how important it is to keep it a dedicated network. As soon as lil’ Billy goes browsing for porn, the lights start acting up.
How that translates to MIDI over Ethernet, I don't know. The scale is different. Three or four MIDI devices are not like 400+ lights (each a single Ethernet node). However, a light only needs one parameter (in the case of intelligent lights, usually no more than 16), while MIDI can transfer a lot more information than just one byte to a single device.
I would also assume you are not attempting to download the latest Jenna Jameson video while making music.
Honestly, Ethernet DMX is a LOT of overhead for very little information. Remember that an Ethernet frame has a minimum size (64 bytes), and the IP and UDP headers have to ride inside it. A single DMX channel only needs a single byte to describe its setting. A bank dimmer may have 8 or even 16 lamps on it, but even at 16 bytes, most of an Ethernet DMX packet is still headers and padding. Waste of overhead and processing power, in my opinion. The only thing it solves is the limit of the ‘world’ system, in which a DMX line is limited to 512 devices (channels). An intelligent light might take up several ‘devices’. It is easy to see that at 16 channels apiece, no more than 32 smart lights can exist on a DMX512 system, and fewer if the fixtures use more channels. In large spaces, you might have to run several DMX systems if you have a lot of intelligent lighting. Thus Ethernet comes into play in large arenas, since it is virtually unlimited as to how many devices can play. Not to mention, if one dimmer dies, the whole world doesn’t fail (as is the case with DMX).
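To put rough numbers on that overhead, here is a back-of-the-envelope calculation in Python. It assumes the levels travel as UDP over IPv4 with no IP options, and it ignores both the lighting protocol's own header and the Ethernet preamble, so the real efficiency is somewhat worse than shown. The function names are made up for this sketch.

```python
ETH_HEADER = 14      # destination MAC, source MAC, EtherType
ETH_FCS = 4          # frame check sequence
ETH_MIN_FRAME = 64   # shorter frames get padded up to this size
IPV4_HEADER = 20     # assuming no IP options
UDP_HEADER = 8


def frame_size(payload_bytes):
    """Bytes on the wire for one UDP datagram carrying payload_bytes of levels."""
    unpadded = ETH_HEADER + IPV4_HEADER + UDP_HEADER + payload_bytes + ETH_FCS
    return max(unpadded, ETH_MIN_FRAME)


def efficiency(payload_bytes):
    """Fraction of the frame that is actual channel data."""
    return payload_bytes / frame_size(payload_bytes)


def fixtures_per_line(channels_per_fixture, line_channels=512):
    """How many fixtures fit on one 512-channel DMX line."""
    return line_channels // channels_per_fixture


for payload in (1, 16, 512):
    print(payload, frame_size(payload), f"{efficiency(payload):.1%}")
# 1 64 1.6%
# 16 64 25.0%
# 512 558 91.8%

print(fixtures_per_line(16), fixtures_per_line(32))  # 32 16
```

So a packet carrying one channel, or one 16-lamp bank, is mostly packaging. The overhead only stops hurting when a packet carries a full 512-channel line at once, and the fixture count per line depends entirely on how many channels each light eats.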
Again, how that translates to MIDI… well, really it is just a lot of overhead and wasted power on relatively small, time-sensitive packets of data. The logical progression to me is more along the lines of USB. Much less overhead, still lots of device room. Simple interconnect. Simple protocol.
I say leave Ethernet to what it is designed for: connecting desktops to desktops, transferring large chunks of data quickly, but without regard to packet order.
Serial to USB adaptors are incredibly cheap, and offer the data from the serial port side to the PC as if it were a REAL serial port. Essentially NO extra programming needs to be done, nor is there much overhead. A simple driver install fools every piece of software you have into thinking it is talking to a serial port.
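As a rough illustration of how little programming that takes: software that already writes bytes to a serial port does not care what is behind it. The Python sketch below builds a standard three-byte MIDI Note On message and writes it to any object with a `write()` method. The helper names `note_on` and `send` are mine, and an in-memory buffer stands in for the port so the example runs without hardware.

```python
import io


def note_on(channel, note, velocity):
    """Build a 3-byte MIDI Note On message (channel 0-15, note and velocity 0-127)."""
    if not 0 <= channel <= 15:
        raise ValueError("channel must be 0-15")
    if not (0 <= note <= 127 and 0 <= velocity <= 127):
        raise ValueError("note and velocity must be 0-127")
    return bytes([0x90 | channel, note, velocity])


def send(port, message):
    """Write the message to anything that behaves like a serial port."""
    return port.write(message)


fake_port = io.BytesIO()                 # stand-in for the real port
send(fake_port, note_on(0, 60, 100))     # middle C on the first channel
print(fake_port.getvalue().hex())        # 903c64
```

With a real adaptor, the same `send()` call should work on a port object from a serial library such as pyserial, assuming the adaptor and its driver can be set to the baud rate the device expects. Nothing in the code needs to know there is USB in between.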