For the original SGM 2015 music server, our primary focus was on reducing the influence of DAC filter quality by providing a higher quality (upsampled) data stream. DSD DACs in particular could benefit greatly from this method, as it provides a means to convert every source format to DSD. Of course the end result depends on how the DAC processes higher data rates. Most DACs use different filters for different sampling rates, or let the user select them, so this way you force the DAC to use, or enable it to use, a different filter. Some of the filters and upsampling algorithms provided by HQPlayer, which we use for that purpose, are so processor intensive that you most definitely cannot run them on, for example, a Roon Nucleus, let alone in a DAC’s FPGA. A good example is the Chord Dave, which is all about high quality filters and boasts a very strong FPGA array, but cannot approach the filter algorithm quality HQPlayer can provide.

Fast forward to today: DAC quality and DAC filter technology have advanced significantly, especially in upper echelon R2R DACs, and we find ourselves in the situation that the benefit of pre-processing the data has either decreased or is gone altogether. Most of these DACs simply sound best when fed native “bit perfect” data rates.

“Low latency” is something computer audiophiles have been hunting since the early Logitech Transporter/Squeezebox days. We don’t think anybody ever really knew why it sounded better. It is sought after in the studio recording scene too, but there it is obviously to avoid time related sync issues and audio stream interruptions, not for sound quality as far as we know. There are several tools available to measure a system’s latency for this purpose (referred to as DPC latency, ISR routine execution time, interrupt to process latency etc). Amongst computer audiophiles it is generally accepted that lower latency sounds better, though.

Lowering latencies reduces active processing times.
You can view latency as a roadblock that you cannot pass until it’s removed, or as having to shift your transmission into gear before you can accelerate. During a latency “wait state”, a processor, memory module or system bus data path is getting ready to accept data packets. It will be active though, drawing current and transmitting the unavoidable EMI and/or RFI spectrum that any electrical component emits. With lower latencies we reduce overall system current draw, EMI, RFI and processing durations. Contrary to what you would expect, you can have lower current draw variations and net overall lower EMI/RFI emissions from higher processing power solutions that are minimally loaded than from low processing power solutions that are more heavily loaded. The general view that lower power servers generate less noise than higher power servers BECAUSE they consume less current is wrong in our experience. Of course there are a lot of variables in this equation: it’s easier and cheaper to design a low noise power supply for lower current draw requirements, so you may very well get better results from a low power solution, especially when using the same power supply. But this is not the design goal of the Extreme.

Now if we can agree on the hypothesis that introducing any type of component into your system can alter your system’s sound, and not only through the signal it puts out, it becomes a lot easier to explain more differences. This applies no matter if it’s a server, a CD transport, an amplifier, a cable, a fuse, or a grounding or ground modulating device. In fact, why don’t we extend the definition of your system to your phone charger, your iMac, your Wi-Fi router or your new refrigerator?

The attached picture shows how incredibly well Ethernet error correction works. It has been designed to deal with the distortion and noise associated with up to 100 meters of unshielded copper cable. (For general interest, shielding mainly affects NEXT, near-end crosstalk.) There is also additional error detection/handling on a different layer, where unrecoverable data is retransmitted. However incredible, all this does significantly increase Ethernet PHY power consumption. Arguably we don’t need its full capabilities for short connections, but it is what it is.

Moving on to Ethernet enabled DACs. Here you actually have a complete computing environment: a CPU and memory running an operating system, a hardware and software networking stack, and endpoint software such as a UPnP renderer or Roon endpoint software. This is quite a bit more than a USB or AES/EBU receiver. It will most definitely draw more power and have higher EMI/RFI emissions.

But from a usability point of view it’s absolutely great, as it allows you to stream directly to your DAC from any other network connected streaming source. However, getting it to sound optimal requires quite a bit of tinkering in your networking environment. The often used selling point that it nullifies the influence of your source is definitely not true. And as of yet we are unconvinced it’s the way to go to obtain maximum sound quality.

1) Short and Easy:

SSD storage is a bottleneck; the Extreme stores music on PCIe storage, removing this bottleneck. PCIe storage is at least 4 times faster and has lower latency than SSD storage, resulting in much lower noise. Both are internal flash storage, but the interface is different.

2) More in depth:

An SSD is a SATA storage device; it is controlled by a SATA controller running off the motherboard chipset. The motherboard chipset typically handles all communication: storage, networking, USB etc. The motherboard chipset sends/receives data to and from the CPU over a bus called DMI, Direct Media Interface. The modern DMI still only has 25% of the bandwidth of a single PCIe slot. This is a well known bottleneck, just google “The DMI Bottleneck” if you want to know more. PCIe storage bypasses the motherboard chipset and connects directly to the CPU, relieving the motherboard chipset of having to time share USB, networking and storage read/write communication. To make matters worse, consumer grade CPUs have a very limited number of communication (PCIe) lanes available. Therefore, to provide multiple PCIe expansion slots, these are usually just multiplexed onto the motherboard chipset, increasing the bottleneck.

This is (one of the reasons) why we use large Xeon CPUs, as these have a lot more PCIe lanes. The PCIe expansion slots are directly connected to the CPUs and do in fact run at full speed. Today’s Xeon CPUs also have integrated disk controllers and can drive 12 PCIe storage drives directly at maximum speed (over 48 lanes) and extremely low latency. We also have 2 CPUs, which doubles the number of lanes and disk controllers, meaning we can access 24 drives at full speed simultaneously. All this greatly reduces processing active times on the motherboard, and by greatly, think 1000s of times.
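
The lane figures above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the 4 lanes per drive is inferred from the 12 drives over 48 lanes quoted above, and the function name is made up for illustration.

```python
# Back-of-the-envelope check of the lane figures quoted above.
# Assumption: each PCIe storage drive uses 4 lanes (48 lanes / 12 drives).
LANES_PER_DRIVE = 48 // 12


def drives_at_full_speed(storage_lanes_per_cpu: int, cpu_count: int) -> int:
    """Number of drives that can each get their own dedicated lanes."""
    return (storage_lanes_per_cpu * cpu_count) // LANES_PER_DRIVE


print(drives_at_full_speed(48, 1))  # single CPU
print(drives_at_full_speed(48, 2))  # dual CPU, as described above
```

With one CPU this gives the 12 drives mentioned, and with two CPUs the 24 drives.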

In the SGM2015 and EVO we used very expensive and sensitive OCXO clocks, so we have plenty of experience with this technology. However, during the development of the Extreme we discovered a way around the problem that is seemingly fixed by an OCXO. In appliances outside of the Extreme, these clocks will still make a positive difference. Our proprietary solution surpasses the effect of all available OCXOs we know. What we hear from OCXO upgrades versus our “clock-less” technology is that they move the sound more towards a CD playback signature rather than towards a vinyl playback signature.

The hole pattern ventilation openings visible in the photos of the Extreme are not there for looks; they have a very large effect on 3) (and a bit on 2) as a bonus). These holes are “waveguides” which attenuate emissions by 81 dB, which is around 10,000 times.
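
For readers who want to check the decibel figure, here is a minimal conversion sketch. It assumes the 81 dB refers to field (voltage) amplitude, which is the reading that matches “around 10,000 times”; read as a power ratio the same 81 dB would be far larger. The function names are ours, purely for illustration.

```python
def db_to_amplitude_ratio(db: float) -> float:
    """Convert attenuation in dB to a field (voltage) amplitude ratio."""
    return 10 ** (db / 20)


def db_to_power_ratio(db: float) -> float:
    """Convert attenuation in dB to a power ratio."""
    return 10 ** (db / 10)


# Amplitude reading: a little over 11,000, i.e. "around 10,000 times".
print(round(db_to_amplitude_ratio(81)))
# Power reading of the same figure, shown for comparison only.
print(f"{db_to_power_ratio(81):.2e}")
```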

We design and build our servers to last. Every component is subject to wear due to exposure to vibration and heat, so we design for vibration resilience and low heat operation. Take a CPU for example: it can operate 24/7 at 70 degrees Celsius (about 160 Fahrenheit), but its performance may start degrading after just 2 years and it may completely fail after 5. You can keep it running cool by using fans, but fans have a vibration signature and create “low” frequency electrical noise which is quite harmful to midrange integrity. Furthermore, fans degrade fairly rapidly, leading to increased vibration and noise. There are several aftermarket passive cooling solutions available, but they have limited cooling performance: they work just fine initially, but performance will degrade faster than desired for our purposes. Do note this is not a problem in the DIY world, where people tend to swap their hardware components frequently. None of this is acceptable for us, so we design our own passive cooling solutions. This comes with its own challenges, interfacing to the CPU for example. Our CPU coolers are CNC machined to a 5 micron tolerance, that is 0.005mm; note that a 100 micron (0.1mm) tolerance is already considered to be very good for CNC machining. We also use heatsinks machined from solid copper, as it cools twice as well (fast) as aluminium. This increases life expectancy by at least 4 and up to 12 times over other solutions. The resulting low operating temperatures also increase sound quality. Our CPUs operate at between 35 and 50 degrees Celsius (95-120 F) depending on environment and load. This also means audio performance will persist over time. A big upside to all this is that if we have a component failure, it will be early on, either in the initial stress test in our factory, or in the first few weeks of usage.

The noise signature of the OS has a huge influence on the sound that comes out of the system. Windows LTSC is the clear leader in our testing, and the gap between the latest from Microsoft and the different flavors of Linux can only be expected to increase in the future. When Microsoft delivers a better kernel and scheduler, we will for sure build an updated OS from these new components. In effect, the OS is absolutely future obsolescence proof for many years to come.


Your network setup does influence sound. Every component connected to it has an effect; even your mobile phone using your Wi-Fi in a way totally unrelated to streaming introduces activity on all network ports and cables. This includes browsing your favourite audio forum. There is a way to reduce this with smart switches/routers, or by using VLANs to segment your network, but this is advanced networking, not typically used in domestic situations.

All network activity causes noise, every data packet travelling your domestic network introduces electrical activity travelling your entire network which is just 1 “subnet”.

This also means your music server will “see” all data packets travelling your network; it will “investigate” every packet to check if it contains data addressed to it.

A domestic switch will simply replicate all data on its input to all its outputs. A smart switch provides you with a degree of control over this, so you can segment your network, reducing network traffic on specific links. Again this is advanced networking; none of the “audiophile” switches support this. Now before you think “gotta have”: smart switches apply processing to the data stream, investigate certain parts of all data packets passing through, use more power, and have a noise signature. So there are pluses and minuses to using this in the first place. Network utilisation and the number of active devices in your network are going to be determining factors in whether this can net out positive or not.

100Mbit networking uses 2 differential data pairs, 1Gbit networking uses 4. Data is transported as a modulated voltage over these lines. Modulating voltage introduces certain types of noise. In a switch the ports are galvanically decoupled by means of a differential transformer of which the center tap is connected to ground, usually through a simple filter network. One of the functions of this is to break “ground current paths”.

Moving to fiber, we have optical links: SFP modules convert electrical signals to light pulses and vice versa. So arguably there is no real benefit in terms of reducing network activity induced noise; in fact there is additional activity inside your appliances from this conversion process. An SFP module can easily consume 1 to 1.5 watts of power, which does not seem like much, but at this level it is a lot. On the plus side there is no path for ground currents or electrical noise, whatever the source, to travel along your fiber links. There are many types and makes of SFP modules; obvious differences can be found in power consumption efficiency, robustness, error correction, quality of optical receivers/transceivers etc. SFP+ (10G) modules can apply a higher degree of error correction, and some even have built in “reclocking or jitter reduction”, and yes, this draws more power, so there are positives and negatives. Industrial versions are built to operate in harsh environments, like abnormal temperatures, heavy vibration, or areas with strong RFI/EMI pollution. What you can get when buying industrial grade is better component quality and tolerance, higher selection grades, more robust PCB mounting and/or layout, better error correction algorithms, better filtering and most of the time lower power consumption. The downside is hefty price tags. We do have a few here.

Your internet router performs quite a bit of processing. It almost always performs something called NAT (Network Address Translation), meaning it forwards traffic from the internet to a different IP range which you use inside your home. There is both a security and a functional aspect to it. Without it, each of your devices would require a unique IP address on the whole internet, and there is a limit to the addresses available; that is why we are, for example, moving from the IPv4 protocol to IPv6, which has a vastly higher number of IP addresses available. The security aspect is that your device cannot be accessed directly from any other device in the world. The router usually also provides DHCP services (assigning a unique address to each device on your local network), can provide DNS caching and often runs firewall software. It also often provides Wi-Fi services. It can be quite a busy device.
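
To put “vastly higher” into numbers, here is a small sketch using Python’s standard ipaddress module. IPv4 addresses are 32 bits and IPv6 addresses 128 bits; the device address used below is a made-up example of a typical home address, not something taken from a real setup.

```python
import ipaddress

# Total address space of each protocol version.
ipv4_total = 2 ** 32
ipv6_total = 2 ** 128
print(ipv4_total)                # about 4.3 billion addresses
print(ipv6_total // ipv4_total)  # how many times larger the IPv6 space is

# A typical home address sits in a private range that NAT hides from the internet.
home_device = ipaddress.ip_address("192.168.1.20")  # hypothetical device
print(home_device.is_private)
```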

By now it must be clear that this is a very complex system with a lot of variables in play. Every network is likely to be unique. With different routers, different switches, different devices using it and different traffic patterns, it is unlikely that there are 2 exactly identically performing network setups anywhere in the world at any given time.

Now how does all of this influence playback quality of the Extreme? Well it does, no way around it. So what we have done is run a whole lot of different network setups and combinations to identify the largest disturbances to sound quality. You can take measures to minimize their influence and get repeatable results up to a degree.

The copper network port of the Extreme will provide you with good and repeatable sound quality in virtually all environments. It will sound largely similar in all environments, even in the presence of heavy RFI/EMI pollution.

The fiber network port of the Extreme provides a somewhat different perspective, it will not very significantly impact the overall sound quality or voicing. The plus side is a certain degree of “isolation”. The down side is additional processing and a slightly higher power consumption.
It tends to net out positive with for example blacker backgrounds, improved clarity and more focus without impacting voicing. The downside of the increased focus is “sharper edges” to images, and some SFP modules can introduce a degree of mechanical quality to the sound, the reclocking SFP+ modules being about the worst at that.

So we have recommendations we make, based on repeatable results in different environments, the recommended SFP modules and FMC are based on that. There are combinations which give an impression of higher resolution but it’s important to note increased noise is often perceived as increased resolution. The fatiguing aspect of this usually goes unnoticed as comparative listening sessions are often of short duration with a few test tracks people skip through quickly to remember enough detail to make a meaningful A/B comparison. It is rarely evaluated long term, being over weeks, listening in different moods/mindsets, at different levels of physical or mental fatigue, at different times of the day with varying levels of power grid pollution, or how do you perceive the difference in the first 30 minutes, and then after a few hours of continuous listening. There are again a lot of variations to evaluating.

Therefore our recommendation is to just use copper networking initially and let the Extreme burn in / settle in your environment. So far it has performed to the full satisfaction of everybody who has bought one and uses it this way. Apply basic voicing measures as you would do with any appliance, like powercords, USB cables, footers etc, to adjust it to your taste. Then, if you feel so inclined, turn to tweaking your network environment. And don’t take anything for granted there, as your results are not guaranteed to mirror those of others.

It is relatively risk free to jump straight to using fiber, when used with the components we have tested long term, in various environments, but it is really optional. It is a relatively minor investment with value for money gains though. The downside is it has a “manual”, if you power cycle the server you sometimes have to power cycle the FMC too, or pull the copper network cable from it, so it generates a link fault resetting the interface. But that is really quite a minor issue.

Now keep in mind, network tweaking and audiophile networking products are relatively new, surely there are gains to be made there. But do be aware of all aspects of performance.

Wireless networking does negatively impact your hifi system. It is airborne high frequency noise.

This is partially why we use a hole pattern instead of slots for ventilation: it shields from high frequency noise.

This is a 2 way street. The Extreme is pretty well shielded from outside RFI noise, though not immune; almost nothing is unless it is explicitly designed to be. But it is much more immune to RFI than, for example, an Intel NUC.

Now take a good look at the chassis openings in your other hifi equipment and consider cabling, which is rarely completely shielded, your in-wall power wiring etc.

Wireless networking transmissions are going to find their way into your system. It’s just part of living in this age and time.

If you can accept it’s part of our lives now, and it provides services you are more than happy to use, why object to using something like a wireless extender, or even just use wireless networking as your main infrastructure? Especially when you have even stronger noise sources to deal with, which can make this a really minor issue.

If you’re asking what, in our experience, impacts sound quality the least in networking, wireless is going to be at the bottom of our list. Not so much because of how it works, and not even because the networking stack is more complex on the software side (that does not come into play when using extenders, although we did completely remove all wireless support from the operating system on the Extreme on purpose), but simply because it sprays the system with RF.

Unfortunately we cannot make foolproof recommendations here as the network setups / environments are different everywhere. Similar to a powercord, interconnect or a footer not having the same effect in every system.

Some examples:
-Most people do prefer using the fiber network input of the Extreme with the Startech components recommended by us. But if you use a 2 meter fiber optic cable to an SMPS powered FMC which plugs into the same circuit, you will likely not be happy. If you have a 10 meter or longer fiber optic cable and the SMPS powered FMC is spaced well away from your system, most people like it. This is an area where powering the FMC with a Linear Power Supply can help. But Linear Power Supplies have their signature too. We have feedback from customers who replaced that SMPS with an LPS and are not happy with the result.
-The much discussed SOTM switch. We have customers loving them; we also have customers who bought and stacked 2 of them and tell us it sounds different but they do not enjoy what it does.
-Then we have customers who bought an audiophile copper network cable and they don’t like fiber at all.
The bottom line is: networking is tweakable and it makes a difference, but there is no universally applicable recommendation. The only consistent factor is that the differences network tweaking makes are less influential with the Extreme compared to other, less purpose-built machines.

We do consider all options as “tweaks”. We have not heard any networking setup make a bigger difference than a high quality USB cable, nor good anti vibration measures, when applied to the Extreme. This does not imply USB cables or anti vibration measures render networking tweaks invalid. We are only mentioning this to put some perspective to this, and we do take pride in the fact that we have managed to reduce networking influence on performance.

We have tested several cards and SFP+ modules with varying degrees of success. One of the issues is the better sounding SFP+ modules do not work with the better sounding network cards, or they don’t work with the QSW-308.

Copper DACs (Direct Attach Cables) are promising performers, but the passive ones only work up to 7 meters, and the active ones up to 15 meters.

Most of the Extreme owners use this:

Shopping list:
1 x Startech ET91000SFP2 Fiber Media Converter
2 x Finisar FTLF1324P2BTL-MC SFP modules
1 x Fiber cable at needed length, 9/125 OS2 low loss quality, LC connectors on both sides

1) connect the Startech ET91000SFP2 Fiber Media Converter to either main switch or ISP router with a copper UTP cable
2) insert 1 FTLF1324P2BTL-MC SFP Module into the Startech ET91000SFP2 FMC SFP slot
3) insert 1 FTLF1324P2BTL-MC SFP Module into the Extreme SFP slot
4) connect Fiber cable to the SFP modules mentioned in Step 2 and 3

All of them are thrilled with the improvement over a regular copper UTP cable between server and switch or ISP router, none of them had any issues getting it to work.

COMPLETELY optional tweaks:
-Replace FMC power supply with a high quality LPS
-Replace Ethernet copper cable from FMC to switch/ISP router with an audiophile version of your choice
-Replace Switch (if used) with an audiophile version
-Replace Switch and/or ISP router powersupplies with a high quality LPS

We highlighted COMPLETELY because most are already so satisfied with the performance at this stage they are not that eager to try and squeeze out more.

The SFP modules dominate the results. Whether SFP (1Gb) or SFP+ (10Gb), Finisar SFP modules sound by far the best to us, with the best colour saturation and more contrast than the others; they also consistently capture more atmosphere/ambiance and create a more 3-dimensional sound. Single mode sounds better than multi mode. Longer range sounds better than shorter. Please note that officially the 40 and 80km range modules need attenuators on shorter distances to reduce laser transmission power. They may also contribute to an overall better sound in some systems when used with relatively short fiber cables.

Connecting the server through a dedicated switch, not shared with other devices, is a very clear step up in sound quality; an FMC (Fibre Media Converter) accomplishes the same thing. This is nothing new, a lot of audiophiles are already using stacked switches. With earlier incarnations of the Extreme server firmware, we preferred the direct fiber connection into the Extreme by a small margin over one via an extra switch. With our latest firmware, however, we currently prefer using copper over fiber.

Additionally, we will soon release a new range of products that will further enhance the quality of the network connection:

Taiko Audio Extreme Switch
Taiko Audio Extreme Network Card
Taiko Audio Extreme Router (to be announced)


-relative to a high quality powercord versus a stock powercord 35%
-relative to good equipment support 25%
-relative to high quality USB cables 30%

Creating a separate subnet for your streaming environment can be beneficial. This is more advanced to set up. The easiest way is to use 2 routers, 1 serving your streaming network, the other your home network. When using a submask of you can create 2 subnets by, for example, placing your streaming devices in 192.168.1.x and your other network devices in 192.168.2.x. You can also go more advanced by means of a managed switch, where you can create separate VLANs, which works out at about the same. The downside is that more network devices have a negative effect again. Due to the complexity of setup and unpredictable results, we do not recommend going here unless you enjoy playing around with this.
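
As a rough illustration of the two address ranges mentioned above, the sketch below uses Python’s standard ipaddress module. It assumes a /24 mask (255.255.255.0), which is our assumption as the mask value is not given above, and the two device addresses are made up.

```python
import ipaddress

# Assumption: a /24 mask (255.255.255.0); the mask value is not given above.
streaming_net = ipaddress.ip_network("192.168.1.0/24")
home_net = ipaddress.ip_network("192.168.2.0/24")

streamer = ipaddress.ip_address("192.168.1.10")  # hypothetical music server
phone = ipaddress.ip_address("192.168.2.57")     # hypothetical phone

# The two ranges do not overlap, so they form two separate subnets.
print(streaming_net.overlaps(home_net))
# The server belongs to the streaming subnet, the phone does not.
print(streamer in streaming_net, phone in streaming_net)
```

Under that assumption, devices in 192.168.1.x and 192.168.2.x can only reach each other through a router, which is what keeps the two groups apart.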


There is quite a bit more to AC power than meets the eye. We have voltage and current distortion creating harmonic distortion on power grids. AC power voltage is a sinusoidal waveform which would ideally be a single 60Hz (US) sine wave. But it is far from ideal: the waveform gets distorted by, for example, current draw, and when it passes through components harmonics are created. The 2nd order harmonic would be 2*60Hz = 120Hz, the 3rd order 180Hz, but it does not stop there. You can easily find up to 50th order harmonics on a powerline, so 3000Hz, and it can go much, much higher than that. All these harmonics are not very usable for our power supplies, but they do still “carry power (voltage and current)” and travel our grids. A fuse is basically a resistor; it is quite sensitive to all these harmonics, and these will heat it up. It could actually blow just from harmonic distortion while the current draw at 60Hz is well below its rating. Being a resistor it will add its own harmonic distortion. All this distortion is going to be audible. You can create a power supply that is less sensitive to this, which usually starts at the transformer selection and/or AC line filtering. A fuse would sit in front of all of this though. You could view it as a vibration source: better damped fuses may generate fewer harmonics, or they may be more resilient to harmonic distortion. And/or these fuses may have a harmonic distortion pattern more favourable to one’s hearing and taste.
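
The harmonic arithmetic above, and the reason harmonics add to fuse heating, can be sketched in a few lines of Python. Heating in a resistor follows the RMS current, and the RMS of components at different frequencies is the square root of the sum of their squares. The current values below are made up purely for illustration and are not measurements.

```python
import math


def harmonic_frequency(fundamental_hz: float, order: int) -> float:
    """Frequency of the n-th order harmonic of a fundamental."""
    return fundamental_hz * order


def total_rms_current(component_currents_a: list[float]) -> float:
    """RMS of a current made up of components at different frequencies."""
    return math.sqrt(sum(i ** 2 for i in component_currents_a))


# 2nd, 3rd and 50th order harmonics of a 60Hz grid.
print(harmonic_frequency(60, 2), harmonic_frequency(60, 3), harmonic_frequency(60, 50))

# Hypothetical example: 3 A at 60Hz plus 4 A RMS of combined harmonic content.
print(total_rms_current([3.0, 4.0]))
```

In this made-up example the fuse sees 5 A RMS even though the 60Hz component alone is only 3 A.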

Everything degrades under stress. Powering on a power supply with a large toroidal transformer and capacitor bank produces two significant current peaks (stress), which can easily go over 100A: the toroidal transformer draws a very steep initial spike, and subsequently the capacitor bank draws a high current over a longer time period while charging, which can be as high as 50A. There are fuses specifically designed to deal with toroidal transformer current surges, but those unfortunately have a higher than normal unfavourable effect on sound quality. So it is good practice to soft start this type of power supply to 1) reduce the transformer current peak, 2) reduce the capacitor bank initial charging current, 3) reduce stress on the fuse, and 4) reduce stress on the capacitor bank, prolonging the life of all these components. Soft starting eliminates the transformer’s initial surge and cuts the magnitude of the capacitor charging current in half in exchange for doubling the charging time. Another side effect is that capacitor charging stress affects sound quality for a considerable time after powering on the supply; soft starting reduces this “warm up” time.

It is these peaks which are the reason why some audiophile fuses can deteriorate early, resulting in a lifeless, undynamic sound. They can blow after just a few power cycles. If we do not soft start this type of power supply, even very rugged fuses only survive a few power cycles, and they audibly deteriorate with each power on cycle. We have in fact tested and verified this, and it was part of the selection process for the fuse we use: it will have a healthy long life and can even handle more extreme surges which can happen during power grid anomalies. Obviously we are not going to test all the audiophile fuse variations out there.

It is important to be aware that audiophile fuses can be deliberately overrated (lower resistance, meaning lower heating and lower distortion, which can actually sound better). Some are not, but blow too fast, causing audiophiles to use a larger value, and some audiophiles deliberately use larger values as that can sound better. This is actually rather dangerous and can cause significant damage in situations where a normal fuse would blow. It can even cause a real fire: if the fuse does not blow while the power supply is delivering more current than it was designed for, the supply will keep heating up till it finally fails, or till something reaches ignition temperature. The fuse also protects against power grid anomalies like repetitive voltage swings, brownouts etc. You just don’t know what’s going on when you change the fuse to a different type/rating.

The fuse we use has been selected in an early stage of the design for low distortion and reliability. It’s not an “audiophile” fuse but an industrial heavy duty type. It will survive numerous power on cycles and not degrade. In audiophile terms it excels in dynamic expression and images large. The Extreme has been voiced around this fuse in what we consider to be neutral. Changing it obviously does have an impact on voicing.

2A @220-240V
4A @110-120V


We hold HQ Player, and its creator Jussi, in very high regard. You can get remarkable results from a whole range of relatively affordable DACs by using HQ Player algorithms rivaling much more costly solutions.

We designed the original SGM 2015 Music Server to do precisely that, and extract top-level performance from affordable DACs, using HQ Player algorithms. The value for money equation is beyond dispute.

The current upper market segment DACs do a much better job at “bit perfect” playback than a few years ago. We are following suit with the Extreme Music Server and we now primarily focus on achieving “bit perfect” playback using Taiko-proprietary hardware, -software, and a fully customized USB driver. As a result, we no longer use HQ Player on the Extreme.

But we are in no way suggesting nor implying HQ Player is an invalid approach.