Planet Xiph

February 06, 2010

Silvia Pfeiffer

View counts on YouTube contradictory

UPDATE (6th February 2010): YouTube have just reacted to my bug and it seems there are some gData links that are more up-to-date than others. You need to go with the “uploads” gData APIs rather than the search or user ones to get accurate data. Glad YouTube told me and it’s documented now!

I am an avid user of YouTube Insight, the metrics tool that YouTube provides freely to everyone who publishes videos through them. YouTube Insight provides graphs on video views, the countries they originate in, demographics of the viewership, how the videos are discovered, engagement metrics, and hotspot analysis. It is a great tool to analyse the success of your videos, determine when to upload the next one, and find out what works and what doesn’t.

However, you cannot rely on the accuracy of the numbers that YouTube Insight displays. In fact, YouTube provides three different means to find out what the current views (and other statistics, but let’s focus on the views) are for your videos:

  • the view count displayed on the video’s watch page
  • the view count displayed in YouTube Insight
  • the view count given in the gData API feed

The shocking reality is: for all videos I have looked at that are less than about a month old and keep getting views, all three numbers are different.

Sometimes they are just off by one or two, which is tolerable and understandable, since the data must be served from a number of load balanced servers or even server clusters and it would be difficult to keep all of these clusters at identical numbers all of the time.

However, for more than 50% of the videos I have looked at, the numbers are off by a substantial amount.

I have undertaken an analysis with random videos, where I have collected the gData views and the watch page views. The Insight data tends to be between these two numbers, but I cannot generally reach that data, so I have left it out of this analysis.

Here are the stats for 36 randomly picked videos, four in each of the 9 view-count classes defined by TubeMogul, showing how far apart the watch page and gData counts were at the time I looked at them:

Class      Video  Watch page  gData API  Age      Diff       Percentage
>1M        1      7,187,174   6,082,419  2 weeks  1,104,755  15.37%
>1M        2      3,196,690   3,080,415  3 weeks  116,275    3.63%
>1M        3      2,247,064   1,992,844  1 week   254,220    11.31%
>1M        4      1,054,278   1,040,591  1 month  13,687     1.30%
100K-500K  5      476,838     148,681    11 days  328,157    68.82%
100K-500K  6      356,561     294,309    2 weeks  62,252     17.46%
100K-500K  7      225,951     195,159    2 weeks  30,792     13.63%
100K-500K  8      113,521     62,241     1 week   51,280     45.17%
10K-100K   9      86,964      46         4 days   86,918     99.95%
10K-100K   10     52,922      43,548     3 weeks  9,374      17.71%
10K-100K   11     34,001      33,045     1 month  956        2.81%
10K-100K   12     15,704      13,653     2 weeks  2,051      13.06%
5K-10K     13     9,144       8,967      1 month  177        1.94%
5K-10K     14     7,265       5,409      1 month  1,856      25.55%
5K-10K     15     6,640       5,896      2 weeks  744        11.20%
5K-10K     16     5,092       3,518      6 days   1,574      30.91%
2.5K-5K    17     4,955       4,928      3 weeks  27         0.54%
2.5K-5K    18     4,341       4,044      4 days   297        6.84%
2.5K-5K    19     3,377       3,306      3 weeks  71         2.10%
2.5K-5K    20     2,734       2,714      1 month  20         0.73%
1K-2.5K    21     2,208       2,169      3 weeks  39         1.77%
1K-2.5K    22     1,851       1,747      2 weeks  104        5.62%
1K-2.5K    23     1,281       1,244      1 week   37         2.89%
1K-2.5K    24     1,034       984        2 weeks  50         4.84%
500-1K     25     999         844        6 days   155        15.52%
500-1K     26     891         790        6 days   101        11.34%
500-1K     27     861         600        3 days   261        30.31%
500-1K     28     645         482        4 days   163        25.27%
100-500    29     460         436        10 days  24         5.22%
100-500    30     291         285        4 days   6          2.06%
100-500    31     256         198        3 days   58         22.66%
100-500    32     196         175        11 days  21         10.71%
0-100      33     88          74         10 days  14         15.91%
0-100      34     64          49         12 days  15         23.44%
0-100      35     46          21         5 days   25         54.35%
0-100      36     31          25         3 days   6          19.35%
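
The last two columns follow directly from the two counts. A minimal Python sketch of that arithmetic, assuming the percentage is taken relative to the watch page count (which is what the figures in the table suggest); the function name `lag` is only for illustration:

```python
def lag(watch_page_views: int, gdata_views: int) -> tuple[int, float]:
    """Return (absolute difference, difference as a percentage of watch page views)."""
    if watch_page_views <= 0:
        raise ValueError("watch_page_views must be positive")
    diff = watch_page_views - gdata_views
    percentage = round(100.0 * diff / watch_page_views, 2)
    return diff, percentage


# Video 9: four days old, gData has barely started counting.
print(lag(86_964, 46))
# Video 20: a month old, gData has almost caught up.
print(lag(2_734, 2_714))
```

Running the same function over every row is a quick way to check a table like this for slips.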

The videos were chosen such that they were no more than a month old, but older than a couple of days. For videos older than about a month, the increase had generally stopped and the metrics had caught up, except where the views were still increasing rapidly, which is an unusual case.

Generally, it seems that the watch page has the right views. In contrast, it seems the gData interface is updated only once every week. It further seems from looking at YouTube channels where I have access to Insight that Insight is updated about every 4 days and that it receives corrected data for the days in which it hadn’t caught up.

Further, it seems that YouTube make no differentiation between channels of partners and general users’ channels – both can have a massive difference between the watch page and gData. Most videos differ by less than 20%, but some have exceptionally high differences above 50% and even up to 99.95%.

The difference is particularly pronounced for videos that show a steep increase in views – the first few days tend to have massive differences. Since these are the days that are particularly interesting to monitor for publishers, having the gData interface lag behind this much is shocking.

Further, videos with a low number of views, in particular fewer than 100, also show a particularly high percentage difference – sometimes an increase in view count isn’t reported at all in the gData API for weeks. It seems that YouTube treats the long tail worse than the rest of its videos. For every video in this class, the absolute difference will be small – obviously less than 100 views. With almost 30% of videos being such videos, it is somewhat understandable that YouTube are not making the effort to update their views regularly. On the other hand, these views may be particularly important to their publishers.

It seems to me that YouTube need to change their approach to updating statistics across the watch pages, Insight and gData.

Firstly, it is important to have the watch page, Insight and gData in sync – otherwise what number would you use in a report? If the gData API for YouTube statistics lags behind the watch page and Insight by even 24 hours, it is useless for indicating trends and for use in reports, and people have to go back to screen scraping to gain information on the actual views of their videos.

Secondly, it would be good to update the statistics daily during the first 3-4 weeks, or as long as the videos are gaining views heavily. This is the important time to track the success of videos, and if neither Insight nor gData is up to date in this time, and can even be almost 100% off, the statistics are actually useless.

Lastly, one has to wonder how accurate the success calculations are for YouTube partners, who rely on YouTube reporting to gain payment for advertising. Since the analysis showed that the inaccuracies extend also into partner channels, one has to hope that the data that is eventually reported through Insight is actually accurate, even if intermittently there are large differences.

Finally, I must say that I was rather disappointed with the way in which this issue has so far been dealt with in the YouTube Forums. The issue of wrongly reported view counts was first reported more than a year ago and has been raised regularly since by various people. Some of the reports were really unfriendly with their demands. Still, I would have expected a serious reply by a YouTube employee about why there are issues and how they are going to be fixed or whether they will be fixed at all. Instead, all I found was a more than nine-month-old mention that YouTube seems to be aware of the issue and working on it – no news since.

Also, I found no other blog posts analysing this issue, so here we are. Please, YouTube, let us know what is going on with Insight, why are the numbers off by this much, and what are you doing to fix it?

NB: I just posted a bug on gData, since we were unable to find any concrete bugs relating to this issue there. I’m actually surprised about this, since so many people reported it in the YouTube Forums!

by silvia at February 06, 2010 02:07 AM

February 03, 2010

Ben Schwartz

No, you can’t do that with H.264

A lot of commercial software comes with H.264 encoders and decoders, and some computers arrive with this software preinstalled. This leads a lot of people to believe that they can legally view and create H.264 videos for whatever purpose they like. Unfortunately for them, it ain’t so.

Maybe the best example comes from the Final Cut Pro license:

To the extent that the Apple Software contains AVC encoding and/or decoding functionality, commercial use of H.264/AVC requires additional licensing and the following provision applies: THE AVC FUNCTIONALITY IN THIS PRODUCT IS LICENSED HEREIN ONLY FOR THE PERSONAL AND NON-COMMERCIAL USE OF A CONSUMER TO (i) ENCODE VIDEO IN COMPLIANCE WITH THE AVC STANDARD (“AVC VIDEO”) AND/OR (ii) DECODE AVC VIDEO THAT WAS ENCODED BY A CONSUMER ENGAGED IN A PERSONAL AND NON-COMMERCIAL ACTIVITY AND/OR AVC VIDEO THAT WAS OBTAINED FROM A VIDEO PROVIDER LICENSED TO PROVIDE AVC VIDEO. INFORMATION REGARDING OTHER USES AND LICENSES MAY BE OBTAINED FROM MPEG LA L.L.C. SEE HTTP://WWW.MPEGLA.COM.

The text could hardly be clearer: you do not have a license for commercial use of H.264. Call it “Final Cut Pro Hobbyist”. Do you post videos on your website that has Google Adwords? Do you edit video on a consulting basis? Do you want to include a video in a package sent to your customers? Do your clients send you video clips as part of your business? Then you’re using the encoder or decoder for commercial purposes, in violation of the license.

Now, you might think “but I’m sticking with MPEG-4, or MPEG-2, so it’s not a problem for me”. No. It’s just as bad. Here’s the relevant section of the license:

13. MPEG-2 Notice. To the extent that the Apple Software contains MPEG-2 functionality, the following provision applies: ANY USE OF THIS PRODUCT OTHER THAN CONSUMER PERSONAL USE IN ANY MANNER THAT COMPLIES WITH THE MPEG-2 STANDARD FOR ENCODING VIDEO INFORMATION FOR PACKAGED MEDIA IS EXPRESSLY PROHIBITED WITHOUT A LICENSE UNDER APPLICABLE PATENTS IN THE MPEG-2 PATENT PORTFOLIO, WHICH LICENSE IS AVAILABLE FROM MPEG LA, L.L.C., 250 STEELE STREET, SUITE 300, DENVER, COLORADO 80206.
14. MPEG-4 Notice. This product is licensed under the MPEG-4 Systems Patent Portfolio License for encoding in compliance with the MPEG-4 Systems Standard, except that an additional license and payment of royalties are necessary for encoding in connection with (i) data stored or replicated in physical media which is paid for on a title by title basis and/or (ii) data which is paid for on a title by title basis and is transmitted to an end user for permanent storage and/or use. Such additional license may be obtained from MPEG LA, LLC. See http://www.mpegla.com for additional details. This product is licensed under the MPEG-4 Visual Patent Portfolio License for the personal and non-commercial use of a consumer for (i) encoding video in compliance with the MPEG-4 Visual Standard (“MPEG-4 Video”) and/or (ii) decoding MPEG-4 video that was encoded by a consumer engaged in a personal and non-commercial activity and/or was obtained from a video provider licensed by MPEG LA to provide MPEG-4 video. No license is granted or shall be implied for any other use. Additional information including that relating to promotional, internal and commercial uses and licensing may be obtained from MPEG LA, LLC.

Noticing a pattern? You have a license to use their software, provided you don’t make any money, your friends are also all correctly licensed, and you only produce content that complies with the MPEG standard. Using video for a commercial purpose? Producing video that isn’t within MPEG’s parameters? Have friends who use unlicensed encoders like x264, ffmpeg, or xvid? Too bad.

This last thing is actually a particularly interesting point. If you encode a video using one of these (open-source) unlicensed encoders, you’re practicing patents without a license, and you can be sued. But hey, maybe you’re just a scofflaw. After all, it’s not like you’re making trouble for anyone else, right? Wrong. If you send a video to a friend who uses a licensed decoder, and they watch it, you’ve caused them to violate their own software license, so they can be sued too.

Oh, and in case you thought this was specific to Apple, here’s the matching piece from the Windows 7 Ultimate License:

18. NOTICE ABOUT THE H.264/AVC VISUAL STANDARD, THE VC-1 VIDEO STANDARD, THE MPEG-4 VISUAL STANDARD AND THE MPEG-2 VIDEO STANDARD. This software includes H.264/AVC, VC-1, MPEG-4 Part 2, and MPEG-2 visual compression technology. MPEG LA, L.L.C. requires this notice:
THIS PRODUCT IS LICENSED UNDER THE AVC, THE VC-1, THE MPEG-4 PART 2 VISUAL, AND THE MPEG-2 VIDEO PATENT PORTFOLIO LICENSES FOR THE PERSONAL AND NON-COMMERCIAL USE OF A CONSUMER TO (i) ENCODE VIDEO IN COMPLIANCE WITH THE ABOVE STANDARDS (“VIDEO STANDARDS”) AND/OR (ii) DECODE AVC, VC-1, MPEG-4 PART 2 AND MPEG-2 VIDEO THAT WAS ENCODED BY A CONSUMER ENGAGED IN A PERSONAL AND NON-COMMERCIAL ACTIVITY OR WAS OBTAINED FROM A VIDEO PROVIDER LICENSED TO PROVIDE SUCH VIDEO. NONE OF THE LICENSES EXTEND TO ANY OTHER PRODUCT REGARDLESS OF WHETHER SUCH PRODUCT IS INCLUDED WITH THIS PRODUCT IN A SINGLE ARTICLE. NO LICENSE IS GRANTED OR SHALL BE IMPLIED FOR ANY OTHER USE. ADDITIONAL INFORMATION MAY BE OBTAINED FROM MPEG LA, L.L.C. SEE WWW.MPEGLA.COM.

Doesn’t seem so Ultimate to me.

My advice: use a codec that doesn’t need a license:

Q. What is the license for Theora?
Theora (and all associated technologies released by the Xiph.org Foundation) is released to the public via a BSD-style license. It is completely free for commercial or noncommercial use. That means that commercial developers may independently write Theora software which is compatible with the specification for no charge and without restrictions of any kind.

by Ben at February 03, 2010 02:40 AM

January 29, 2010

Silvia Pfeiffer

Government Report: “Access to Electronic Media for the Hearing and Vision Impaired”

Today was the last day to provide a submission and input to the Australian Government’s discussion report on “Access to Electronic Media for the Hearing and Vision Impaired: Approaches for Consideration”.

The report explains the Australian Government’s existing regulatory framework for accessibility to audio-visual content on TV, digital TV, DVDs, cinemas, and the Internet, and provides an overview of what it is planning to do over the next 3-5 years.

It is interesting to read that according to the Australian Bureau of Statistics about 2.67 million Australians – one in every eight people – have some form of hearing loss and 284,000 are completely or partially blind. These numbers are expected to keep increasing with an ageing population and obesity-linked diabetes.

For obvious reasons, I was particularly interested in the Internet-related part of the report. It was the second-last section (number five), and to be honest, I was rather disappointed: only 3 pages of the 40-page report concerned themselves with Internet content. Also, the main message was that “at this time the costs involved with providing captions for online content were deemed to represent an undue financial impost on a relatively new and developing service.”

Audio descriptions weren’t even touched with a stick, and both captions and audio descriptions were written off with “a lack of clear online caption production and delivery standard and requirements”. There is obviously a lot of truth to the statements of the report – the Internet audio-visual content industry is still fairly young compared to e.g. TV, and there are a multitude of standards rather than a single clear path.

However, I believe the report neglected to mention the new HTML5 video and audio elements and the opportunity they provide. Maybe HTML5 was excluded because it wasn’t expected to be relevant within the near future. I believe this is a big mistake and governments should pay more attention to what is happening with HTML5 audio and video and the opportunities they open for accessibility.

In the end, I made a submission because I wanted the Australian Government to wake up to the HTML5 efforts and I wanted to correct a mistake they made with claiming MPEG-2 was “not compatible with the delivery of closed audio descriptions”.

I believe a lot more can be done with accessibility for Internet content than just “monitor international developments” and industry partnership with disability representative groups. I therefore proposed to undertake trials in particular with textual audio descriptions to see if they could be produced in a similar manner to captions, which would make their cost come down enormously. Also I suggested actually aiming for WCAG 2.0 conformance within the next 5 years – which for audio-visual content means at minimum captions and audio descriptions.

You can read the report here and my 4 page long submission here.

by silvia at January 29, 2010 12:53 PM

January 27, 2010

Silvia Pfeiffer

The model of a time-linear media resource for HTML5

HTML5 has been criticised for not having a timing model of the media resource in its new media elements. This article spells it out and builds a framework of how we should think about HTML5 media resources. Note: these are my thoughts and nothing official from HTML5 – just conclusions I have drawn from the specs and from discussions I had.

What is a time-linear media resource?

In HTML5 and also in the Media Fragment URI specification we deal only with audio and video resources that represent a single timeline exclusively. Let’s call such a Web resource a time-linear media resource.

The Media Fragment requirements document actually has a very nice picture to describe such resources – replicated here for your convenience:

[Figure: Model of a Media Resource]

The resource can potentially consist of any number of audio, video, text, image or other time-aligned data tracks. All these tracks adhere to a single timeline, which tends to be defined by the main audio or video track, while other tracks have been created to synchronise with these main tracks.

This model matches with the world view of video on YouTube and any other video hosting service. It also matches with video used on any video streaming service.

Background on the choice of “time-linear”

I’ve deliberately chosen the word “time-linear” because we are talking about a single, gap-free, linear timeline here and not multiple timelines that represent the single resource.

The word “linear” is, however, somewhat over-used, since the introduction of digital systems into the world of analog film introduced what is now known as “non-linear video editing”. This term originates from the fact that non-linear video editing systems don’t have to linearly spool through film material to get to an edit point, but can directly access any frame in the footage as easily as any other.

When talking about a time-linear media resource, we are referring to a digital resource and therefore direct access to any frame in the footage is possible. So, a time-linear media resource will still be usable within a non-linear editing process.

As a Web resource, a time-linear media resource is not addressed as a sequence of frames or samples, since these are encoding specific. Rather, the resource is handled abstractly as an object that has track and time dimensions – and possibly spatial dimensions where image or video tracks are concerned. The framerate encoding of the resource itself does not matter and could, in fact, be changed without changing the resource’s time, track and spatial dimensions and thus without changing the resource’s address.

Interactive Multimedia

The term “time-linear” is used to specify the difference between a media resource that follows a single timeline, in contrast to one that deals with multiple timelines, linked together based on conditions, events, user interactions, or other disruptions to make a fully interactive multi-media experience. Thus, media resources in HTML5 and Media Fragments do not qualify as interactive multimedia themselves because they are not regarded as a graph of interlinked media resources, but simply as a single time-linear resource.

In this respect, time-linear media resources are also different from the kind of interactive multi-media experiences that an Adobe Shockwave Flash, Silverlight, or a SMIL file can create. These can go far beyond what current typical video publishing and communication applications on the Web require and go far beyond what the HTML5 media elements were created for. If your application has a need for multiple timelines, it may be necessary to use SMIL, Silverlight, or Adobe Flash to create it.

Note that the fact that the HTML5 media elements are part of the Web, and therefore expose states and integrate with JavaScript, provides Web developers with a certain control over the playback order of a time-linear media resource. The simple functions pause(), play(), and the currentTime attribute allow JavaScript developers to control the current playback offset and whether to stop or start playback. Thus, it is possible to interrupt a playback and present, e.g. an overlay text with a hyperlink, or an additional media resource, or anything else a Web developer can imagine right in the middle of playing back a media resource.

In this way, time-linear media resources can contribute towards an interactive multi-media experience, created by a Web developer through a combination of multiple media resources, image resources, text resources and Web pages. The limitations of this approach are not yet clear at this stage – how far will such a constructed multi-media experience be able to take us and where does it become more complicated than an Adobe Flash, Silverlight, or SMIL experience? The answer to this question will, I believe, become clearer through the next few years of HTML5 usage and further extensions to HTML5 media may well be necessary then.

Proper handling of time-linear media resources in HTML5

At this stage, however, we have already determined several limitations of the existing HTML5 media elements that require resolution without changing the time-linear nature of the resource.

1. Expose structure

Above all, there is a need to expose the above painted structure of a time-linear media resource to the Web page. Right now, when the <video> element links to a video file, it only accesses the main audio and video tracks, decodes them and displays them. The media framework that sits underneath the user agent (UA) and does the actual decoding for the UA might know about other tracks and might even decode, e.g. a caption track and display it by default, but the UA has no means of knowing this happens and controlling this.

We need a means to expose the available tracks inside a time-linear media resource and allow the UA some control over it – e.g. to choose whether to turn on/off a caption track, to choose which video track to display, or to choose which dubbed audio track to play.

I’ll discuss different approaches to exposing the structure in another article. Suffice it to say for now that we recognise the need to expose the tracks.

2. Separate the media resource concept from actual files

A HTML page is a sequence of HTML tags delivered over HTTP to a UA. A HTML page is a Web resource. It can be created dynamically and contain links to other Web resources such as images which complete its presentation.

We have to move to a similar “virtual” view of a media resource. Typically, a video is a single file with a video and an audio track. But also typically, caption and subtitle tracks for such a video file are stored in other files, possibly even on other servers. The caption or subtitle tracks are still in sync with the video file and therefore are actual tracks of that time-linear media resource. There is no reason to treat this differently to when the caption or subtitle track is inside the media file.

When we separate the media resource concept from actual files, we will find it easier to deal with time-linear media resources in HTML5.
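
One way to picture this “virtual” view is as a small data model. The sketch below is only an illustration of the idea and nothing from the HTML5 spec: the class names, fields and URLs are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Track:
    kind: str       # e.g. "video", "audio", "captions"
    language: str   # e.g. "en"; "" where language does not apply
    source: str     # URL of the file that actually carries this track


@dataclass
class MediaResource:
    duration: float                                   # length of the single shared timeline, in seconds
    tracks: List[Track] = field(default_factory=list)

    def sources(self) -> List[str]:
        """Distinct files that together make up this one resource, in track order."""
        seen: List[str] = []
        for track in self.tracks:
            if track.source not in seen:
                seen.append(track.source)
        return seen


# Hypothetical example: audio and video live in one file, captions on another server.
talk = MediaResource(
    duration=120.0,
    tracks=[
        Track("video", "", "http://example.com/talk.ogv"),
        Track("audio", "en", "http://example.com/talk.ogv"),
        Track("captions", "en", "http://captions.example.org/talk.en.srt"),
    ],
)
print(talk.sources())
```

The point of the model is that the caption track is a track of the resource like any other; where its bytes are stored is just an attribute.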

3. Track activation and Display styling

A time-linear media resource, when regarded completely abstractly, can contain all sorts of alternative and additional tracks.

For example, the existing <source> elements inside a video or audio element are currently mostly being used to link to alternative encodings of the main media resource – e.g. either in mpeg4 or ogg format. We can regard these as alternative tracks within the same (virtual) time-linear media resource.

Similarly, the <source> elements have also been suggested to be used for alternate encodings, such as for mobile and Web. Again, these can be regarded as alternative tracks of the same time-linear media resource.

Another example are subtitle tracks for a main media resource, which are currently discussed to be referenced using the <itext> element. These are in principle alternative tracks amongst themselves, but additional to the main media resource. Also, some people are actually interested in displaying two subtitle tracks at the same time to learn translations.

Another example are sign language tracks, which are video tracks that can be regarded as an alternative to the audio tracks for hard-of-hearing users. They are then additional video tracks to the original video track and it is not clear how to display more than one video track. Typically, sign language tracks are displayed as picture-in-picture, but on the Web, where video is usually displayed in a small area, this may not be optimal.

As you can see, when deciding which tracks need to be displayed one needs to analyse the relationships between the tracks. Further, user preferences need to come into play when activating tracks. Finally, the user should be able to interactively activate tracks as well.

Once it is clear what tracks need displaying, there is still the challenge of how to display them. It should be possible to provide default displays for typical track types, and allow Web authors to override these default display styles since they know what actual tracks their resource is dealing with.

While the default display seems to be typically an issue left to the UA to solve, the display overrides are typically dealt with on the Web through CSS approaches. How we solve this is for another time – right now we can just state the need for algorithms for track activation and for default and override styling.

Hypermedia

To make media resources prime citizens on the Web, we have to go beyond simply replicating digital media files. The Web is based on hyperlinks between Web resources, and that includes hyperlinking out of resources (e.g. from any word within a Web page) as well as hyperlinking into resources (e.g. fragment URIs into Web pages).

To turn video and audio into hypervideo and hyperaudio, we need to enable hyperlinking into and out of them.

Hyperlinking into media resources is fortunately already being addressed by the W3C Media Fragments working group, which also regards media resources in the same way as HTML5. The addressing schemes under consideration are the following:

  • temporal fragment URI addressing: address a time offset/region of a media resource
  • spatial fragment URI addressing: address a rectangular region of a media resource (where available)
  • track fragment URI addressing: address one or more tracks of a media resource
  • named fragment URI addressing: address a named region of a media resource
  • a combination of the above addressing schemes
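
For illustration, here is how such addresses could be written out. The sketch uses the `t`, `xywh`, `track` and `id` dimension names from the W3C Media Fragments URI work; treat the exact syntax as an assumption here, since the specification was still under discussion at the time of writing. The helper name `media_fragment` and the example URL are hypothetical.

```python
from typing import Optional, Tuple


def media_fragment(
    url: str,
    t: Optional[Tuple[float, float]] = None,
    xywh: Optional[Tuple[int, int, int, int]] = None,
    track: Optional[str] = None,
    name: Optional[str] = None,
) -> str:
    """Append a media fragment to `url`, combining whichever dimensions were given."""
    parts = []
    if t is not None:
        start, end = t
        parts.append(f"t={start:g},{end:g}")                     # temporal: start and end in seconds
    if xywh is not None:
        parts.append("xywh=" + ",".join(str(v) for v in xywh))   # spatial: x, y, width, height in pixels
    if track is not None:
        parts.append(f"track={track}")                           # track: name of one track
    if name is not None:
        parts.append(f"id={name}")                               # named: a named region
    return url + "#" + "&".join(parts) if parts else url


video = "http://example.com/video.ogv"
print(media_fragment(video, t=(10, 20)))
print(media_fragment(video, xywh=(160, 120, 320, 240)))
print(media_fragment(video, t=(10, 20), track="audio"))
```

The helper does no escaping, so track and region names containing reserved characters would need percent-encoding first.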

With such addressing schemes available, there is still a need to hook up the addressing with the resource. For the temporal and the spatial dimension, resolving the addressing into actual byte ranges is relatively obvious across any media type. However, track addressing and named addressing need to be resolved. Track addressing will become easier when we solve the above stated requirement of exposing the track structure of a media resource. The name definition requires association of an id or name with temporal offsets, spatial areas, or tracks. The addressing schemes will be available soon – whether our media resources can support them is another challenge to solve.

Finally, hyperlinking out of media resources is something that is not generally supported at this stage. Certainly, some types of media resources – QuickTime, Flash, MPEG4, Ogg – support the definition of tracks that can contain HTML marked-up text and thus can also contain hyperlinks. But standardisation in this space has not really happened yet. It seems to be clear that hyperlinks out of media files will come from some type of textual track. But a standard format for such time-aligned text tracks doesn’t yet exist. This is a challenge to be addressed in the near future.

Summary

The Web has always tried to deal with new extensions in the simplest possible manner, providing support for the majority of current use cases and allowing for the few extraordinary use cases to be satisfied by use of JavaScript or embedding of external, more complex objects.

With the new media elements in HTML5, this is no different. So far, the most basic need has been satisfied: that of including simple video and audio files into Web pages. However, many basic requirements are not being satisfied yet: accessibility needs, codec choice, device-independence needs are just some of the core requirements that make it important to extend our view of <audio> and <video> to a broader view of a Web media resource without changing the basic understanding of an audio and video resource.

This post has created the concept of a “media resource”, where we keep the simplicity of a single timeline. At the same time, it has tried to classify the list of shortcomings of the current media elements in a way that will help us address these shortcomings in a Web-conformant manner.

If we accept the need to expose the structure of a media resource, the need to separate the media resource concept from actual files, the need for an approach to track activation, and the need to deal with styling of displayed tracks, we can take the next steps and propose solutions for these.

Further, understanding the structure of a media resource allows us to start addressing the harder questions of how to associate events with a media resource, how to associate a navigable structure with a media resource, or how to turn media resources into hypermedia.

by silvia at January 27, 2010 01:32 AM

January 26, 2010

Silvia Pfeiffer

Tutorial on HTML5 open video at LCA 2010

During last week’s LCA, Jan Gerber, Michael Dale and I gave a 3 hour tutorial on how to publish HTML5 video in an open format.

We basically taught people how to create and publish Ogg Theora video in HTML5 Web pages and how to make them work across browsers, including much of the available tools and libraries. We’re hoping that some people will have learnt enough to include modules in CMSes such as Drupal, Joomla and Wordpress, which will easily support the publishing of Ogg Theora.

I have been asked to share the material that we used. It consists of:

Note that if you would like to walk through the exercises, you should install the following software beforehand:

You might need to look for packages of your favourite OS (e.g. Windows or Mac, Ubuntu or Debian).

The exercises include:

  • creating an Ogg video from an editor
  • transcoding a video using http://firefogg.org/
  • creating a poster image using OggThumb
  • writing a first HTML5 video Web page with Ogg Theora
  • publishing it on a Web Server, with correct MIME type & Duration hint
  • writing a second HTML5 video Web page with Ogg Theora & MP4 to cover Safari/Webkit
  • transcoding using ffmpeg2theora in a script
  • writing a third HTML5 video Web page with Cortado fallback
  • writing a fourth Web page using “Video for Everybody”
  • writing a fifth Web page using “mwEmbed”
  • writing a sixth Web page using firefogg for transcoding before upload
  • and a seventh one with a progress bar
  • encoding srt subtitles into an Ogg Kate track
  • writing an eighth Web page using cortado to display the Ogg Kate track
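
The "transcoding using ffmpeg2theora in a script" exercise could be sketched roughly as follows. This is not the tutorial's own script: the function names, the `*.mp4` pattern, and the quality values 7 and 3 are assumptions chosen for illustration. It only relies on ffmpeg2theora's `--videoquality`, `--audioquality` and `--output` options, and by default it returns the commands without running them.

```python
import subprocess
from pathlib import Path


def theora_command(source, video_quality=7, audio_quality=3):
    """Build the ffmpeg2theora argument list for one input file.

    The quality defaults are illustrative assumptions, not values
    taken from the tutorial material.
    """
    source = Path(source)
    target = source.with_suffix(".ogv")  # talk.mp4 -> talk.ogv
    return [
        "ffmpeg2theora",
        "--videoquality", str(video_quality),
        "--audioquality", str(audio_quality),
        "--output", str(target),
        str(source),
    ]


def transcode_all(directory, pattern="*.mp4", dry_run=True):
    """Return one command per matching file; run them unless dry_run is set."""
    commands = [theora_command(p) for p in sorted(Path(directory).glob(pattern))]
    if not dry_run:
        for command in commands:
            # check=True raises CalledProcessError if a transcode fails.
            subprocess.run(command, check=True)
    return commands
```

Calling `transcode_all("videos", dry_run=False)` would transcode every matching file in a hypothetical `videos` directory, provided ffmpeg2theora is installed and on the PATH.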

For those that would like to see the slides here immediately, a special flash embed:

Enjoy!

by silvia at January 26, 2010 11:21 AM

January 25, 2010

Silvia Pfeiffer

HTML5 video: 25% H.264 reach vs. 95% Ogg Theora reach

Vimeo started an HTML5 beta test last week. They use the H.264 codec, probably because much of their content is already in this format through the Flash player.

But what really surprised me was their claim that roughly 25% of their users will be able to make use of their HTML5 beta test. The statement is that 25% of their users use Safari, Chrome, or IE with Chrome Frame. I wondered how they got to that number and what it means more generally for the level of support for H.264 vs Ogg Theora on the HTML5-based Web.

According to Statcounter’s browser market share statistics, the percentage of browsers that support HTML5 video is roughly 31.1%, as summed up from Firefox 3.5+ (22.57%), Chrome 3.0+ (5.21%), and Safari 4.0+ (3.32%) (Opera’s recent release is not represented yet).

Out of those 31.1%,

8.53% of all browsers support H.264

and

27.78% of all browsers support Ogg Theora.

Given these numbers, Vimeo must assume that roughly 16% of their users have Chrome Frame in IE installed. That would be quite a number, but it may well be that their audience is special.
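
The sums above can be checked with a few lines of Python. This is only a sketch: the shares are the Statcounter figures quoted above, and the browser-to-codec mapping is inferred from how the 8.53% and 27.78% totals add up (Chrome and Safari for H.264, Firefox and Chrome for Ogg Theora). The variable names are my own.

```python
# Browser shares quoted above from Statcounter (percent of all browsers).
shares = {"Firefox 3.5+": 22.57, "Chrome 3.0+": 5.21, "Safari 4.0+": 3.32}

# Native codec support, inferred from the totals given in the article.
h264_browsers = ["Chrome 3.0+", "Safari 4.0+"]
theora_browsers = ["Firefox 3.5+", "Chrome 3.0+"]

# Round to two decimals to hide floating-point noise in the sums.
html5_video = round(sum(shares.values()), 2)
h264 = round(sum(shares[b] for b in h264_browsers), 2)
theora = round(sum(shares[b] for b in theora_browsers), 2)

# Vimeo claims roughly 25% reach; whatever is not covered by native
# H.264 support must come from IE with Chrome Frame.
vimeo_claim = 25.0
implied_chrome_frame = round(vimeo_claim - h264, 2)

print(f"HTML5 video: {html5_video}%")
print(f"H.264: {h264}%  Ogg Theora: {theora}%")
print(f"Implied Chrome Frame share: {implied_chrome_frame}%")
```

The implied Chrome Frame share comes out at 16.47%, which is the "roughly 16%" mentioned above.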

So, how is Ogg Theora support doing in comparison, if we allow such browser plugins to be counted?

With an installation of XiphQT, Safari can be turned into a browser that supports Ogg Theora. The Chrome Frame installation will also turn IE into an Ogg Theora-supporting browser. These could get the browser support for Ogg Theora up to 45%. Compare this to a claimed 48% of MS Silverlight support.

But we can do even better for Ogg Theora. If we use the Java Cortado player as a fallback inside the video element, we can capture all those users that have Java installed, which could be as high as 90%, taking Ogg Theora support potentially up to 95%, almost up to the claimed 99% of Adobe Flash.

I’m sure all these numbers are disputable, but it’s an interesting experiment with statistics and tells us that right now, Ogg Theora has better browser support than H.264.

UPDATE: I was told this article sounds aggressive. By no means am I trying to be aggressive; I am stating the numbers as they are right now, because there is a lot of confusion in the market. People believe they reach a smaller audience if they publish in Ogg Theora compared to H.264. I am trying to correct this view.

by silvia at January 25, 2010 10:02 AM

January 20, 2010

Monty

Why is it so hard to take a good bio pic?

Camilla and I spent a while taking 'headshot' pics because I need one for a bio. Out of all the pics, guess which was the only one that looked decent...

by monty@xiph.org at January 20, 2010 02:12 AM

January 19, 2010

Silvia Pfeiffer

Video Streaming from Linux.conf.au

You probably heard it already: Linux.conf.au is live streaming its video in a Microsoft proprietary format.

Fortunately, there is now a re-broadcast that you can get in an open format from http://stream.v2v.cc:8000/ . It comes from a server in Europe, but relies on transcoding here in New Zealand, so it may not be completely reliable.

UPDATE: A second server is now also available from the US at http://repeater.xiph.org:8000/.

Today, the down under open source / Linux conference linux.conf.au in Wellington started with the announcement that every talk and mini-conf will be live streamed to the Internet and later published online. That’s an awesome achievement!

However, minutes after the announcement, I was very disappointed to find out that the streams are actually provided in a proprietary format and through a proprietary streaming protocol: a Microsoft streaming service that provides Windows media streams.

Why stream an open source conference in a proprietary format with proprietary software? If we cannot use our own technologies for our own conferences, how will we get the rest of the world to use them?

I must say, I am personally embarrassed, because I was part of several audio/video teams of previous LCAs that have managed to record and stream content in open formats and with open media software. I would have helped get this going, but wasn’t aware of the situation.

I am also the main organiser of the FOMS Workshop (Foundations of Open Media Software) that ran the week before LCA and brought some of the core programmers in open media software into Wellington, most of whom are also attending LCA. We have the brains here and should be able to get this going.

Fortunately, the published content will be made available in Ogg Theora/Vorbis. So, it’s only the publicly available stream that I am concerned about.

Speaking with the organisers, I can somewhat understand how this came to be. They took the “easy” way of delegating the video work to an external company. Even though this company is an expert in open source and networking, their media streaming customers are all using Flash or Windows media software, which are current de-facto standards and provide extra features such as DRM. It seems that, apart from linux.conf.au, they have had no requests for streaming Ogg Theora/Vorbis yet. Their existing infrastructure includes CDN distribution, and CDN providers typically don’t provide Ogg Theora/Vorbis support or Icecast streaming.

So, this is actually a problem rooted in setting up streaming through a professional service rather than through the community. At other events, streaming was set up by getting together a group of volunteers who provided streaming reflectors for free. In this way, a community-created CDN is built that can deal with the streams. That there are no professional CDN providers available yet that provide Icecast support is a sign that there is a gap in the market.

But phear not – a few of the FOMS folk got together to fix the situation.

It involved setting up Icecast streams for each room’s video stream. Since there is no access to the raw video stream, there is a need to transcode the video from proprietary codecs to the open Ogg Theora/Vorbis format.

To do this legally, a purchase of the codec libraries from Fluendo was necessary, which cost a whopping EURO 28 and covers all the necessary patent licenses. The glue to get the videos from mms to Icecast streams is a GStreamer pipeline, which I leave others to talk about.

Now that we have all the streams from the conference available as Ogg Theora/Vorbis streams, we can also publish them in HTML5 video elements. Check out this Web page which has all the video streams together on a single page. Note that the connections may be a bit dodgy and some drop-outs may occur.

Further, let me recommend the Multimedia Miniconf at linux.conf.au, which will take place tomorrow, Tuesday 19th January. The Miniconf has decided to add a talk about “How to stream your conference with open codecs” to help educate any potential future conference organisers and point out the software that helps solve these issues.

UPDATE: I should have stated that I didn’t actually do any of the technical work: it was all done by Ralph Giles, Jan Gerber, and Jan Schmidt.

by silvia at January 19, 2010 12:38 AM

January 16, 2010

Cristian Adam

IE <video> tag

I have started hacking a <video> tag implementation for Microsoft Internet Explorer, based on the work of Vladimir Vukićević's IECanvas experiment. Source code is located here.

The AxPlayer will use the DirectShow OggCodecs for the actual video playback. At the moment it doesn't do much: it just displays a gray rectangle where the video should be, as seen below:



Internet Explorer versions 6.0 and 7.0 required the following syntax for Binary Element Behaviour components:
<html xmlns:html5>
<head>
<object id="videoFactory" classid="clsid:7cc95ae6-c1fa-40cc-ab17-3e91da2f77ca"></object>
<?import namespace="html5" implementation="#videoFactory"?>
</head>
<body>

<p>This is a header </p>

<html5:video src="http://videos.mozilla.org/firefox/3.5/meet/meet.ogv">
<p>your browser cannot handle video tag</p>
</html5:video>

<p>This is a footer </p>

</body>
</html>

Internet Explorer 8 has an improved syntax:
<html>
<body>

<p>This is a header </p>

<video src="http://videos.mozilla.org/firefox/3.5/meet/meet.ogv"
xmlns="http://www.w3.org/1999/xhtml">
<p>your browser cannot handle video tag.</p>
</video>

<p>This is a footer </p>

</body>
</html>


The <object> information has been moved into the Windows Registry instead of the HTML code:


The xmlns needs to be there in order to link <video> tag to AxPlayer, but the code looks way better :)


Next steps would be:

  • Actually display a video instead of a gray rectangle

  • Modify the OggCodecs source filter to properly handle videos from the network

  • Implement the <video> and <audio> W3C HTML5 tag specifications

  • Improve the user experience by implementing everything that is needed for Internet Explorer not to display the ActiveX warning popups.

by Cristian Adam (noreply@blogger.com) at January 16, 2010 12:38 AM