What now G.722? SILK Speech Codec to rule them all?

by Ruben

Some good news from Skype Journal, which is well worth reading: Skype SILK codec in the IETF standards process.

This IS major news. As Phil Wolff points out, one of the three obstacles for Skype is now being solved. In my opinion, the codec issue has been one of the most difficult for Skype - along with the Joltid issues.

From my perspective, one of the good things about the SILK Speech Codec being put through an IETF track is that the codec is now becoming open and usable for more than Skype users.

This could mean that in a not so distant future we will have SILK for Asterisk, making Skype For Asterisk much more interesting.

What is really worth a small peek is how this will affect the "other" Wideband Audio codec, namely G.722, but also the Wideband Audio ecosystem as a whole.

In what way could, or will, SILK compete with G.722? In one of the VoIP Users Conference calls someone said that G.722 is reasonably easy to implement compared to certain other codecs - but let's not forget that codec implementation is done once per unit/project. As far as I understand the SILK home page, there will be no source code available, but the SILK SDK is easily available. Given that the G.722 patent has expired, its use should be comparable to the relaxed license Skype has in mind for SILK. What will be interesting is whether there will be any 3rd party implementations of SILK. This will of course depend on how many architectures the SDK will be available for; currently it is x86 Windows or ARM Linux.

What about the ecosystem around Wideband Audio - will the ecosystem as a whole care? My guess is that only hard core VoIP people will care - the rest of the ecosystem will be more concerned about the applications. However, what could happen when SILK becomes an IETF recommendation is that we end up with two islands of Wideband Audio: SILK and G.722. I really do hope this will not happen; we are on the verge of general deployment of Wideband Audio. We, as VoIP professionals, simply cannot afford a Wideband Audio "split".

I do believe that for quite some time we will find SILK in non-Skype software only, and not in the major hardware vendors' phone offerings. My guess is that SILK will first be supported by smaller hardware vendors (i.e. "Designed & Made in China"). Of the more well known phone producers, Snom will probably be one of the first.

A short background into SILK

The codec is developed by the Skype Audio team. The bitrate varies from 6 to 40 kilobit/second. In comparison, G.722 operates at 48, 56 and 64 kilobit/second.
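To put these bitrates in perspective, here is a small sketch that converts a codec bitrate into payload bytes per packet. The 20 ms packet interval and the helper name `payload_bytes` are my own assumptions for illustration, not figures from the SILK literature, and RTP/UDP/IP header overhead is ignored.

```python
def payload_bytes(bitrate_kbps: float, frame_ms: float = 20.0) -> float:
    """Codec payload per packet in bytes (packet headers not included).

    frame_ms is an assumed packet interval, not a value from the article.
    """
    bits_per_packet = bitrate_kbps * 1000 * frame_ms / 1000
    return bits_per_packet / 8


# Bitrates quoted in the article, in kilobit/second.
rates_kbps = {
    "SILK (lowest)": 6,
    "SILK (highest)": 40,
    "G.722 mode 48": 48,
    "G.722 mode 56": 56,
    "G.722 mode 64": 64,
}

for name, kbps in rates_kbps.items():
    print(f"{name}: {payload_bytes(kbps):.0f} bytes per 20 ms packet")
```

Under these assumptions the lowest SILK rate needs 15 bytes of payload per packet, while G.722 at 64 kilobit/second needs 160 bytes.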

Given that everything boils down to the bandwidth, it's worth noting that SILK operates at sampling rates of 8, 16 and 24 kHz - the top rate is higher than the 16 kHz used by G.722. We should also note that SILK reaches a Mean Opinion Score (G.107) of 4.49 faster than other Wideband codecs (i.e. at a lower kilobit/second rate) - actually, some of the other Wideband codecs will not reach MOS=4.49 at all. These numbers are taken from official SILK literature.
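As a rough illustration of why the sampling rate matters, the sketch below applies the Nyquist limit: a signal sampled at a given rate can carry audio frequencies up to half that rate. The function name is hypothetical, and real codecs pass somewhat less than this theoretical ceiling.

```python
def max_audio_bandwidth_khz(sample_rate_khz: float) -> float:
    """Theoretical upper audio frequency (Nyquist limit) for a sampling rate."""
    return sample_rate_khz / 2


# Sampling rates quoted for SILK in the article.
for sample_rate in (8, 16, 24):
    limit = max_audio_bandwidth_khz(sample_rate)
    print(f"{sample_rate} kHz sampling -> audio up to {limit:g} kHz")
```

So 8 kHz sampling corresponds to classic narrowband telephony (up to 4 kHz), while 24 kHz sampling allows audio content up to 12 kHz in theory.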

How about packet loss? When we lose packets, the MOS falls. What I have seen so far is that even when the packet loss reaches 10%, the MOS is still much higher than that of comparable Wideband codecs.

Concluding remark

We should embrace SILK even if it's "made by Skype".

Skype is not "evil" and "closed" any more. Skype is opening up (Skype for Asterisk / Skype for SIP) - so releasing their codec into the wild is just the next logical step.

5 comments

Comment from: Michael Graves [Visitor] · http://www.mgraves.org
Can you point me to exactly where you see that source code will be released? Everything that I've found indicates a closed source model.
23/09/09 @ 08:58
Comment from: Ruben [Member]
Hi Michael.

Thanks for the comment.

You got me there, and found a big typo. Thank you very much for spotting this.

What I really wanted to write is that there will be no source code available.

I really do not see Skype going open source with the SILK codec any time soon, but what I really hope for is that, given the IETF track the codec is put through, maybe - just maybe - a 3rd party will do an implementation in the not too distant future.

Actually I have hopes for a 3rd party doing an implementation. The reason is that for a protocol to become an Internet standard there must be
1) at least two independent and interoperable implementations from different code bases (elevates it to a "Draft Standard"),
2) a high degree of technical maturity and a generally held belief that the specified protocol or service provides significant benefit to the Internet community (it becomes an "Internet Standard").

The current IETF document describing SILK has, as far as I understand, not yet reached the first IETF standards track maturity level ("Proposed Standard"). This maturity level states: A Proposed Standard specification is generally stable, has resolved known design choices, is believed to be well-understood, has received significant community review, and appears to enjoy enough community interest to be considered valuable. SILK still has a long way to go before calling itself an Internet Standard.
23/09/09 @ 09:26
Comment from: Michael Graves [Visitor] · http://www.mgraves.org
Yes, that's as I thought. There is potentially a good there there, but it's a yet unrealized potential.

SILK starts out back of the pack in that there are no devices supporting it. Hardware device support is crucial. That alone is what drives G.722, despite its lack of technical sophistication.

We already have several royalty-free standard wideband codecs in G.722, G.722.1 & G.722.1C. Royalty free is not enough on its own.
23/09/09 @ 18:50
Comment from: Travis [Visitor]
Nothing on VoIP with compression can reach MOS 5 in VoIP telephony. Why? Because of bandwidth, money and voice band issues. Are you kidding? If you classify SILK as 5, how do you classify voice media on media players, such as MP3 or even raw quality? Compression in this context means ignoring much of the detail as long as it sounds good or fair to the listener; by definition you are cutting out details, which is why the voice band is used. So how did you get this idea of MOS 5? And you call this a professional idea?
23/02/10 @ 15:05
Comment from: Ruben [Member]
Hi Travis,

I completely agree with you that with a lossy compression scheme, MOS=5 cannot be achieved. My bad for not being clearer.

What I probably meant to say is NOT that SILK is able to reach MOS=5 for the time being - but that it will probably reach MOS=5 in the future - faster than other wide band codecs.

What is true today, based on published data from Skype, is that SILK reaches MOS=4.49 faster than other wide band codecs. I have now amended the article to better suit the current state of things.

Thank you for pointing this error out to me, much appreciated.
23/02/10 @ 15:28
