VAAPI (non-nvidia) really working yet? - AVS Forum
post #1 of 16, 09-04-2011, 10:51 AM, by GatoViejo (Member, Thread Starter)
With the latest ABI breakage between the nvidia blob and Xorg 1.11, I am once again looking to see if there is a working alternative.

Does anyone have a working combination of hardware and software with hardware assisted video and a non-proprietary driver? I haven't checked for a few years and I just wonder if anything has gotten past the "all promises" stage.
post #2 of 16, 09-04-2011, 11:39 AM, by tux99 (AVS Special Member)
Quote:
Originally Posted by GatoViejo:
Does anyone have a working combination of hardware and software with hardware assisted video and a non-proprietary driver?
Old Cat, AFAIK the only non-proprietary drivers with hardware-assisted video are the Intel drivers and the Broadcom Crystal HD. I did manage to get the Crystal HD working with XBMC quite a while ago, but it wasn't flawless. I don't know how much it has improved since, but the Crystal HD is a niche product in any case.
With regard to Intel GPUs and VA-API I have no experience, but from what I've read it's still a work in progress: some things work, others don't.

Personally I prefer sticking with Nvidia and VDPAU, it just works!

My Linux news / reviews / tips+tricks / downloads web site: http://www.linuxtech.net/
post #3 of 16, 09-10-2011, 07:54 PM, by CityK (Advanced Member)
Quote:
Does anyone have a working combination of hardware and software with hardware assisted video and a non-proprietary driver? I haven't checked for a few years and I just wonder if anything has gotten past the "all promises" stage.
Things appear to be moving along on the ATI front:
http://www.phoronix.com/scan.php?pag...item&px=OTY2OQ ... (be sure to read that post's associated forum thread for some important usage details and "quality" assessments).

I'm not following this closely, so I can't say what further progress has been made. MESA 7.12 is still a ways out (Jan '12). Which likely means -- unless you dig rolling your own kernels, MESA, DDX, blah blah blah (insert associated necessary components here) -- that out of the box acceleration won't be seen by most folks until the mid '12 distro releases.


(Some further quick read background posts:
http://www.phoronix.com/scan.php?pag...item&px=OTY1OQ
http://www.phoronix.com/scan.php?pag...item&px=ODcxOA)
post #4 of 16, 09-11-2011, 02:14 PM, by Mac The Knife (AVS Special Member)
I'm running the open source radeon drivers on an integrated 4290 chipset.

Most (all?) of the MPEG-2 and MPEG-4 hardware-assisted decoding works. But other things, such as deinterlacing, do not, so I'm still using software deinterlacing. I also have a lot of CPU (an AMD 965BE), so it doesn't really matter.

If you really need good OpenGL performance, this is a terrible setup, but in my case I only care about video performance, and that's working fine.
post #5 of 16, 06-06-2012, 03:16 AM, by tux99 (AVS Special Member)
Quote:
Originally Posted by Mac The Knife:

I'm running the open source radeon drivers on an integrated 4290 chipset.

Most (all?) of the MPeg2 and 4 hardware assisted decoding works.

Hi Mac, I realise this is an old thread, but since I came across it I was wondering how hardware-assisted decoding works with the FOSS radeon drivers.

Do the FOSS radeon drivers support VA-API (if so, with what back-end?), or how else does this work?

I'm curious because I have never heard before that the FOSS radeon drivers do hardware video decoding!

post #6 of 16, 06-06-2012, 12:21 PM, by Mac The Knife (AVS Special Member)
Quote:
Originally Posted by tux99:

Hi Mac, I realise this is an old thread but since I came across it I was wondering how hardware assisted decoding works with the FOSS radeon drivers.
Do the FOSS radeon drivers support vaapi (if yes with what back-end?) or how else does this work?
I'm curious because I have never heard before that the FOSS radeon drivers do hardware video decoding!

My understanding is that the FOSS guys used the documentation that AMD released a couple of years ago to add some hardware-assisted decoding directly into the FOSS driver. In other words, no, they don't use VA-API. But some things, such as hardware-assisted deinterlacing, weren't in the document dump, so not everything the hardware can do is available in the FOSS driver.

But I could be all wrong on this. It's hard to find any reliable source of info on the stuff.
post #7 of 16, 06-06-2012, 10:11 PM, by tux99 (AVS Special Member)
Quote:
Originally Posted by Mac The Knife:

My understanding is that the FOSS guys used the documentation that AMD released a couple of years ago to add some hardware assisted decoding directly into the FOSS driver. IOW, no they don't use vaapi. But some things, such as hardware assisted deinterlacing, weren't in the document dump, so everything that's possible in the hardware isn't available in the FOSS driver.
But I could be all wrong on this. It's hard to find any reliable source of info on the stuff.

I think you misunderstood my question (I guess I wasn't clear). What I meant was: how, in practice, do you use this radeon hardware decoding? What software, with what settings, takes advantage of it?

For example if you use mplayer what parameters are you using to take advantage of the hardware decoding?

Or in Mythtv (if that's what you use) what settings do you need to enable/disable for this to work?

And how can you tell it's using hardware decoding?

post #8 of 16, 06-06-2012, 11:23 PM, by Mac The Knife (AVS Special Member)
Quote:
Originally Posted by tux99:

I think you misunderstood my question (I guess I wasn't clear), I mean how in practice do you use this radeon hardware decoding, what software with what settings takes advantage of it?
For example if you use mplayer what parameters are you using to take advantage of the hardware decoding?
Or in Mythtv (if that's what you use) what settings do you need to enable/disable for this to work?
And how can you tell it's using hardware decoding?

Oh. OK.

I'm primarily an SMplayer user, and in the case of SMplayer I think the key is being able to use either the XV or XvMC output driver [I've always used XV] and making sure that the versions of Xorg and the FOSS Radeon driver are simpatico, so that direct rendering is enabled [as verified by the glxinfo command].
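That glxinfo check can be scripted. A minimal sketch, assuming a bash shell; the "direct rendering:" line is standard glxinfo output, but the sample renderer string below is illustrative:

```shell
# check_direct_rendering: report whether captured `glxinfo` output shows
# direct rendering enabled (the precondition for hardware-assisted playback
# discussed above).
check_direct_rendering() {
    if grep -q "direct rendering: Yes" <<<"$1"; then
        echo "enabled"
    else
        echo "disabled"
    fi
}

# Example against a captured snippet (renderer string is illustrative):
sample="direct rendering: Yes
OpenGL renderer string: Gallium 0.4 on AMD RS880"
check_direct_rendering "$sample"   # prints: enabled
```

On a live system you'd feed it the real output: check_direct_rendering "$(glxinfo)".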

VLC can also be set to the XV driver, and Totem (GStreamer) seems to use it by default; at least, I don't know of any way to change it in GStreamer.

As far as knowing when it's using hardware decoding, that's based on my experience with an old PC I used to own that had an ATI 9800 Pro and an AMD 3800+ single-core CPU. With that combo the CPU was pretty weak, so it had to have some GPU decoding working to handle any form of HD. Without direct rendering working, the CPU would max out and drop frames. When the FOSS driver dropped support for the R300 chipset in that 9800 Pro card (due to all the friggin' changes Xorg was making), direct rendering no longer worked, so I was forced to buy a new computer.

Anyway, it just boils down to my experience with that system and having seen what it's like when direct rendering isn't working (a method which I admit is pathetically full of holes, but this seems to be the only way to learn anything about the Linux video stack).
post #9 of 16, 06-09-2012, 12:08 AM, by CityK (Advanced Member)
Quote:
Originally Posted by tux99:

I realise this is an old thread but since I came across it I was wondering how hardware assisted decoding works with the FOSS radeon drivers.
Do the FOSS radeon drivers support vaapi (if yes with what back-end?) or how else does this work?
I'm curious because I have never heard before that the FOSS radeon drivers do hardware video decoding!

Current Status
In a nutshell, currently only MPEG2 decoding is available, via the XvMC & VDPAU state trackers on the Gallium3D R300 & R600 drivers. At the beginning of this year, some groundwork for H.264 decode was laid down in the VDPAU state tracker, but that's unfinished work -- let me get back to this point in a second.

Video decode playback via the state trackers with Gallium is done in the 3D engine via shaders, not with the dedicated UVD engine. AMD's OSS driver developers are looking into getting UVD support, but they may not be able to because of the legal mumbo jumbo surrounding protecting DRM (digital rights/restrictions management)... that said, it is apparent that code review is (or was) being undertaken in this regard. Getting back to the VDPAU state tracker: apparently part of the reason the H.264 implementation hasn't progressed much further (at least as far as I'm aware) is that resources were diverted towards the possibility of adding UVD support. Anyway, the future will tell us more. But in the meantime, back to the story...

A Note on the Mesa Gallium3D drivers

The Gallium3D R300 driver (aka R300g) covers hardware ASICs from R300 through R500 (i.e. up through the X1xxx series) (see http://en.wikipedia.org/wiki/Comparison_of_ATI_graphics_processing_units#Radeon_R300_series for more details)

The Gallium3D R600 driver (aka R600g) covers hardware ASICs (including that in the Llano Fusion APUs) from R600 to NI (see the wikipedia article and work through the R600 (HD 2xxx, HD 3xxx), R700 (HD 4xxx), Evergreen (HD 5xxx), and NI (HD 6xxx) series)

The Gallium3D RadeonSI driver -- it's the new, and apparently still very basic, driver for the Southern Islands ASICs (HD 7xxx series... again, see the wikipedia article for some granularity of the devices this covers)

Points of Notoriety in the Development History
Oct/Nov 2010 -- XvMC state tracker with R600g arrives ... MC and iDCT supported

Apr/May 2011 -- VDPAU state tracker with R600g arrives for MPEG2 decode

June-ish 2011 -- both XvMC and VDPAU state tracker support arrives for R300g

July 2011 -- the development branch (Mesa pipe-video) where all this goodness was becoming available is merged into the master Mesa git repo ... this stuff will be introduced in Mesa 8.0 release (Jan 2012)

Nov 2011 - the 6.14.3 release of the xf86-video-ati (radeon) driver (i.e. the DDX)... adds the XvMC & VDPAU support via the Gallium3D drivers for >R300 hardware

Jan 2012 -- some H.264 "infrastructure" laid down in the VDPAU state tracker .... also sometime that month Mesa 8.0 was released.

To present -- see "Current Status" above

Umm, so how do I use these new toys? Part 1
Roughly nine months ago I wrote:
Quote:
Which likely means -- unless you dig rolling your own kernels, MESA, DDX, blah blah blah (insert associated necessary components here) -- that out of the box acceleration won't be seen by most folks until the mid '12 distro releases.
So here we are, about mid '12, and somewhat surprisingly that prediction actually held up pretty well, I believe.

So, as it stands, you might possibly be able to use the state trackers to accelerate MPEG2 decode if, installed in your system, you have
  1. R300 or R600 class hardware
  2. v6.14.3 or greater of the radeon driver and
  3. v8.0 or greater of the Mesa drivers and
  4. Mesa was built with the respective --enable flags for vdpau and xvmc. If it was, you should have (assuming a 64-bit system; otherwise the regular /lib directory):
    /usr/lib64/libXvMCr600.so
    /usr/lib64/libXvMCsoftpipe.so
    /usr/lib64/vdpau/libvdpau_r600.so
    /usr/lib64/vdpau/libvdpau_softpipe.so
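Item 4 of the checklist can be verified with a short script. A hedged sketch: pass it /usr/lib64 (or /usr/lib on a 32-bit system), and it reports which of the four state-tracker libraries listed above are actually installed:

```shell
# check_tracker_libs: list which Gallium state-tracker libraries from the
# checklist are present under a given lib directory.
check_tracker_libs() {
    local dir=$1 f
    for f in libXvMCr600.so libXvMCsoftpipe.so \
             vdpau/libvdpau_r600.so vdpau/libvdpau_softpipe.so; do
        if [ -e "$dir/$f" ]; then
            echo "found:   $f"
        else
            echo "missing: $f"
        fi
    done
}

check_tracker_libs /usr/lib64
```

If everything shows "missing", your distro's Mesa was built without the vdpau/xvmc flags and you're in roll-your-own territory.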

And while some current distros meet the above software requirements out of the box, or their users can get updated driver packages that provide the necessities [evidently Ubuntu has updated versions in a PPA; Debian provides good-to-go drivers; Arch and Gentoo too, I believe], other distros do not, so you'd have to roll your own in those cases. For example, Fedora (https://bugzilla.redhat.com/show_bug.cgi?id=815569) and openSUSE appear not to have the build flags enabled.

Umm, so how do I use these new toys? Part 2
To test with MPlayer you could:

via XvMC
(Note: apparently the developers of MPlayer2 removed XvMC support, so you'll have to use the original 'mplayer' to test the XvMC state tracker support)

$ mplayer -vo xvmc -vc ffmpeg12mc filename.mpg

if you want to watch an MPEG2-encoded TV channel (as would be the case with ATSC... I'm not sure if any DVB-x (x = T, C, S, T2, S2...) streams are MPEG2 anymore, or if H.264/AVC rules the roost), and provided you have your channels.conf set up, you could very similarly use:

$ mplayer dvb://channelname -vo xvmc -vc ffmpeg12mc


via VDPAU
$ mplayer -vo vdpau -vc ffmpeg12vdpau filename.mpg

$ mplayer dvb://channelname -vo vdpau -vc ffmpeg12vdpau
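The two variants above can be wrapped in one convenience function. A hedged sketch using only the mplayer flags shown above; the function name is my own, and it assumes the original 'mplayer' (not mplayer2, which dropped XvMC):

```shell
# play_mpeg2: try the VDPAU state tracker first, and fall back to the XvMC
# state tracker if VDPAU playback fails.
play_mpeg2() {
    mplayer -vo vdpau -vc ffmpeg12vdpau "$1" && return 0
    echo "vdpau failed, falling back to xvmc" >&2
    mplayer -vo xvmc -vc ffmpeg12mc "$1"
}

# Usage: play_mpeg2 filename.mpg
```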


Hopefully all of the above provides some clarity.
post #10 of 16, 06-09-2012, 12:05 PM, by tux99 (AVS Special Member)
Thanks CityK, that's a very useful, informative post. It's because of posts like this one, and the generally friendly attitude, that I love this Linux HTPC subforum on AVS!

post #11 of 16, 06-11-2012, 05:44 PM, by Rgb (AVS Special Member)
Quote:
Originally Posted by tux99:

Thanks CityK that's a very useful informative post, it's because of posts like this one and the generally friendly attitude that I love this Linux HTPC subforum on AVS!

Yes, thanks for keeping the "science" in AVS ;)
post #12 of 16, 06-12-2012, 06:56 AM, by kenyee (Senior Member)
FWIW, I've never been able to get Sandy Bridge VAAPI working with MythTV 0.25; another person on the MythTV users mailing list had the same issue. You have to configure a new playback setting and choose VAAPI, because it's not enabled by default, and MythTV bombs out and crashes when you try using it. :(
Luckily a 2500K is fast enough to brute force video playback :-)
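Before blaming MythTV, it's worth confirming the VA-API stack itself works, e.g. with the standalone `vainfo` tool. A hedged sketch that checks a captured vainfo listing for a decode profile; the profile/entrypoint names are standard libva identifiers, but the sample listing is illustrative, not from a real Sandy Bridge box:

```shell
# vaapi_has_decoder: check captured `vainfo` output for a profile exposing a
# VLD (full decode) entrypoint.
vaapi_has_decoder() {
    grep -q "$2.*VAEntrypointVLD" <<<"$1"
}

sample="      VAProfileMPEG2Main              : VAEntrypointVLD
      VAProfileH264High               : VAEntrypointVLD"

vaapi_has_decoder "$sample" VAProfileH264High && echo "H.264 decode reported"
```

If vainfo itself fails, the problem is in the driver stack rather than MythTV.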
post #13 of 16, 06-25-2012, 11:46 AM, by Mac The Knife (AVS Special Member)
Ran across this on Gnome Planet (here's the original blog):



The Linux Graphics Stack
Posted on June 16, 2012 by Jasper St. Pierre

This is an introductory overview post for the Linux Graphics Stack, and how it currently all fits together. I initially wrote it for myself after having conversations with people like Owen Taylor, Ray Strode and Adam Jackson about this stack. I had to go back to them every month or so and learn the stuff from the ground up all over again, as I had forgotten every single piece. I asked them for a good high-level overview document so I could stop bothering them. They didn’t know of any. I started this one. It has been reviewed by Adam Jackson and David Airlie, both of whom work on this exact stack.

Also, I want to point out that a large amount of this stack applies only to the free software drivers. That means that a lot of what you read here may not apply for the AMD Catalyst and NVidia proprietary drivers. They may have their own implementations of OpenGL, or an internal fork of mesa. I’m describing the stack that comes with the free radeon, nouveau and Intel drivers.

If you have any questions, or if at any point things were unclear, or if I’m horribly, horribly wrong about something, or if I just botched a sentence so that it’s incomprehensible, please ask me or let me know in the comments section below.

To start us off, I'm going to lay out the entire big stack right here, to let you get a broad overview of where every piece fits. (The original post includes a handy stack diagram, which doesn't reproduce here.) If you do not understand this right away, don't be scared. Feel free to refer back to it throughout the post.

So, to be precise, there are two different paths, depending on the type of rendering you’re doing.

3D rendering with OpenGL

1. Your program starts up, using “OpenGL” to draw.
2. A library, "mesa", implements the OpenGL API. It uses card-specific drivers to translate the API into a hardware-specific form. If the driver uses Gallium internally, there's a shared component that turns the OpenGL API into a common intermediate representation, TGSI. The API is passed through Gallium, and all the device-specific driver does is translate from TGSI into hardware commands.
3. libdrm uses special secret card-specific ioctls to talk to the Linux kernel.
4. The Linux kernel, having special permissions, can allocate memory on and for the card.
5. Back out at the mesa level, mesa uses DRI2 to talk to Xorg to make sure that buffer flips and window positions, etc. are synchronized.
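The five steps above can be observed on a live system. A hedged sketch; glxinfo and xdpyinfo are standard X utilities, but their availability and output format vary, so each check is skipped quietly if its tool is missing:

```shell
# gl_stack_probe: peek at each layer of the 3D path described above.
gl_stack_probe() {
    command -v glxinfo >/dev/null &&
        glxinfo 2>/dev/null | grep "OpenGL renderer string"   # step 2: which Mesa driver
    [ -d /dev/dri ] && ls /dev/dri                            # steps 3-4: kernel DRM nodes
    command -v xdpyinfo >/dev/null &&
        xdpyinfo 2>/dev/null | grep -ci dri2                  # step 5: DRI2 on the X server
    return 0
}
gl_stack_probe
```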

2D rendering with cairo

* Your program starts up, using cairo to draw.
* You draw some circles using a gradient. cairo decomposes the circles into trapezoids, and sends these trapezoids and gradients to the X server using the XRender extension. In the case where the X server doesn’t support the XRender extension, cairo draws locally using libpixman, and uses some other method to send the rendered pixmap to the X server.
* The X server acknowledges the XRender request. Xorg can use multiple specialized drivers to do the drawing.
1. In a software fallback case, or in the case the graphics driver isn’t up to the task, Xorg will use pixman to do the actual drawing, similar to how cairo does it in its case.
2. In a hardware-accelerated case, the Xorg driver will speak libdrm to the kernel, and send textures and commands to the card in the same way.

As to how Xorg gets things on the screen, Xorg itself will set up a framebuffer to draw into using KMS and card-specific drivers.
X Window System, X11, Xlib, Xorg

X11 isn’t just related to graphics; it has an event delivery system, the concept of properties attached to windows, and more. Lots of other non-graphical things are built on top of it (clipboard, drag and drop). Only listed in here for completeness, and as an introduction. I’ll try and post about the entire X Window System, X11 and all its strange design decisions later.

X11
The wire protocol used by the X Window System
Xlib
The reference implementation of the client side of the protocol, plus a host of other utilities for working with the X Window System. Used by toolkits with X support, like GTK+ and Qt; vanishingly rare to see used directly in applications today.
XCB
Sometimes quoted as an alternative to Xlib. It implements a large amount of the X11 protocol. Its API is at a much lower level than Xlib, such that Xlib is actually built on XCB nowadays. Only mentioning it because it’s another acronym.
Xorg
The reference implementation of the server side of the system.

As I’ll try to be very careful to label something correctly. If I say “the X server”, I’m talking about a generic X server: it may be Xorg, it may be Apple’s X server implementation, it may be Kdrive. We don’t know. If I say “X11″, or the “X Window System”, I’m talking about the design of the protocol or system overall. If I say “Xorg”, that means that it’s an implementation detail of Xorg, which is the most used X server, and may not apply to any other X servers at all. If I ever say “X” by itself, it’s a bug.

X11, the protocol, was designed to be extensible, which means that support for new features can be added without creating a new protocol and breaking existing old clients. As an example, xeyes and oclock get their fancy shapes because of the Shape Extension, which provide support for non-rectangular windows. If you’re curious how this magic functionality can just appear out of nowhere, the answer is that it doesn’t: support for the extension has to be added to both the server and client before it can be used. There is functionality in the core protocol itself so that clients can ask the server what extensions it has support for, so it knows what functionality it can or cannot use.
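That "ask the server what extensions it supports" query is exactly what `xdpyinfo` prints. A hedged sketch that checks a captured extension listing for a given name; the sample listing below is illustrative, not from a real server:

```shell
# has_x_extension: check a captured extension listing (one name per line,
# as reported by the server) for a given extension.
has_x_extension() {
    grep -qx "$2" <<<"$1"
}

extensions="BIG-REQUESTS
DRI2
RENDER
SHAPE
XVideo"

has_x_extension "$extensions" SHAPE && echo "non-rectangular windows available"
```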

X11 was also designed to be “network transparent”. Most importantly it means that we cannot rely on the X server and any X client being on the same machine, so any talking between the two must go over the network. In actuality, a modern desktop environment does not work out of the box in this scenario, as a lot of inter-process communication goes through systems other than X11, like DBus. Over the network, it’s quite chatty, and results in a lot of traffic. When the server and client are on the same machine, instead of going over the network, they’ll go over a UNIX socket, and the kernel doesn’t have to copy data around.
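The UNIX-socket-vs-network decision is driven by the DISPLAY variable. A simplified sketch of what Xlib actually does when parsing it (the real parser also handles screen numbers and other transports): an empty hostname means a local UNIX socket, anything else goes over TCP:

```shell
# display_transport: classify how an X client will reach the server for a
# given DISPLAY value.
display_transport() {
    case $1 in
        :*)  echo "unix socket (local, no network copies)" ;;
        *:*) echo "tcp to host ${1%%:*}" ;;
        *)   echo "invalid DISPLAY" ;;
    esac
}

display_transport ":0"          # prints: unix socket (local, no network copies)
display_transport "remote:0"    # prints: tcp to host remote
```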

We’ll get back to the X Window System and its numerous extensions in a little bit.
cairo

cairo is a drawing library used either by applications like Firefox directly, or through libraries like GTK+, to draw vector shapes. GTK+ 3's drawing model is built entirely on cairo. If you've ever used an HTML5 <canvas>, cairo implements practically the same API. Although <canvas> was originally developed by Apple, the vector drawing model is well-known, as it's the PostScript vector drawing model, which has found support in other vector graphics technologies and standards such as PDF, Flash, SVG, Direct2D, Quartz 2D, OpenVG, and lots more than I can possibly give an exhaustive list for.

cairo has support to draw to X11 surfaces through the Xlib backend.

cairo is used by toolkits as well: GTK+ 2 gained optional cairo support in 2.8, and the drawing model in GTK+ 3 requires cairo.
XRender Extension

X11 has a special extension, XRender, which adds support for anti-aliased drawing primitives (X11's existing graphics were aliased), gradients, matrix transforms and more. The original intention was that drivers could have specialized accelerated code paths for doing specific drawing. Unfortunately, it seems to be the case that software rasterization is just as fast, for unintuitive reasons. Oh well. XRender deals in aligned trapezoids: rectangles with an optional slant to the left and right edges. Carl Worth and Keith Packard came up with a "fast" software rasterization method for trapezoids. Trapezoids are also super easy to decompose into two triangles, to align ourselves with fast hardware rendering. cairo includes a wonderful show-traps utility that might give you a bit of insight into the tessellation into trapezoids it does.

Here’s a simple red circle we drew. This is decomposed into two sets of trapezoids – one for the stroke, and one for the fill. Since the diagrams that show-traps gives you by default aren’t very enlightening, I hacked up the utility to give each trapezoid a unique color. Here’s the set of trapezoids for the black stroke.

Psychedelic.
pixman

Both the X server and cairo need to do pixel level manipulation at some point. cairo and Xorg had separate implementations of things like basic rasterization algorithms, pixel-level access for certain kinds of buffers (ARGB32, RGB24, RGB565), gradients, matrices, and a lot more. Now, both the X server and cairo share a low-level library called pixman, which handles these things for it. pixman is not supposed to be a public API, nor is it a drawing API. It’s not really any API at all; it’s just a solution to some code duplication between various parts.
OpenGL, mesa, gallium

Now comes the fun part: modern hardware acceleration. I assume everybody already knows what OpenGL is. It's not a library: there will never be one canonical set of sources to a libGL.so. Each vendor is supposed to provide its own libGL.so. nVidia provides its own implementation of OpenGL and ships its own libGL.so, based on its implementations for Windows and OS X.

If you are running open-source drivers, your libGL.so implementation probably comes from mesa. mesa is many things, but one of the major things it provides that it is most famous for is its OpenGL implementation. It is an open-source implementation of the OpenGL API. mesa itself has multiple backends for which it provides support. It has three CPU-based implementations: swrast (outdated and old, do not use it), softpipe (slow), llvmpipe (potentially fast). mesa also has hardware-specific drivers. Intel supports mesa and has built a number of drivers for their chipsets which are shipped inside mesa. The radeon and nouveau drivers are also supported in mesa, but are built on a different architecture: gallium.
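You can see which backend you're getting, and force a CPU one for comparison. A hedged sketch: LIBGL_ALWAYS_SOFTWARE is a documented Mesa environment variable, but the renderer string you get back varies by Mesa version and hardware:

```shell
# gl_renderer: report which Mesa backend/driver is answering GL queries.
gl_renderer() {
    glxinfo 2>/dev/null | grep "OpenGL renderer string"
}

echo "hardware: $(gl_renderer)"
# Force Mesa to pick a software rasterizer instead of the hardware driver:
echo "software: $(LIBGL_ALWAYS_SOFTWARE=1 gl_renderer)"
```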

gallium isn’t really anything magical. It’s a set of components to make implementing drivers a lot easier. The idea is that there are state trackers that implement some form of API (OpenGL, GLSL, Direct3D), transform that state to an intermediate representation (known as Tungsten Graphics Shader Infrastructure, or TGSI), and then the backends take that intermediate representation and convert it to the operations that would be consumed by the hardware itself.

Sadly, the Intel drivers don’t use Gallium. My coworkers tell me it’s because the Intel driver developers do not like having a layer between Mesa and their driver.
A quick aside: more acronyms

Since there’s a lot of confusing acronyms to cover, and I don’t want to give them each an H3 and write a paragraph for each, I’m just going to list them here. Most of them don’t really matter in today’s world, they’re just an easy reference so you don’t get confused.

GLES
OpenGL has several profiles for different form factors. GLES is one of them, standing for “GL Embedded System” or “GL Embedded Subset”, depending on who you ask. It’s the latest approach to target the embedded market. The iPhone supports GLES 2.0.
GLX
OpenGL has no concept of platform and window systems on its own. Thus, bindings are needed to translate between the differences of OpenGL and something like X11. For instance, putting an OpenGL scene inside an X11 window. GLX is this glue.
WGL
See above, but replace "X11" with "Windows", that is, that one Microsoft operating system.
EGL
EGL and GLES are often confused. EGL is a new platform-agnostic API, developed by Khronos Group (the same group that develops and standardizes OpenGL) that provides facilities to get an OpenGL scene up and running on a platform. Like OpenGL, this is vendor-implemented; it’s an alternative to bindings like WGL/GLX and not just a library on top of them, like GLUT.
fglrx
fglrx is the former name for AMD's proprietary Xorg OpenGL driver, now known as "Catalyst". It stands for "FireGL and Radeon for X". Since it is a proprietary driver, it has its own implementation of libGL.so. I do not know if it is based on mesa. I'm only mentioning it because it's sometimes confused with generic technology like AIGLX or GLX, due to the appearance of the letters "GL" and "X".
DIX, DDX
The X graphics parts of Xorg are made up of two major pieces: DIX, the "Device Independent X" subsystem, and DDX, the "Device Dependent X" subsystem. When we talk about an Xorg driver, the more technically accurate term is a DDX driver.

Xorg Drivers, DRM, DRI

A little while back I mentioned that Xorg has the ability to do accelerated rendering, based on specific pieces of hardware. I’ll also say that this is not implemented by translating from X11 drawing commands into OpenGL calls. If the drivers are implemented in mesa land, how can this work without making Xorg dependent on mesa?

The answer was to create a new piece of infrastructure to be shared between mesa and Xorg. mesa implements the OpenGL parts, Xorg implements the X11 drawing parts, and they both convert to a set of card-specific commands. These commands are then uploaded to the kernel, using something called the "Direct Rendering Manager", or DRM. libdrm uses a set of generic, private ioctls with the kernel to allocate things on the card, and stuffs the commands and textures and things it needs in there. This ioctl interface comes in two forms: Intel's GEM, and Tungsten Graphics's TTM. There is no good distinction between them; they both do the same thing, they're just different competing implementations. Historically, GEM was designed and proudly announced to be a simpler alternative to TTM, but over time it has quietly grown to about the same complexity as TTM. Welp.
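Which kernel driver sits behind libdrm on your box shows up in the module list. A hedged sketch that filters a captured `lsmod` listing down to the DRM-related modules; the sample listing (sizes, use counts) is illustrative:

```shell
# drm_modules: print the DRM-related kernel modules from a captured `lsmod`
# listing, showing which driver (radeon, nouveau, i915) answers the ioctls.
drm_modules() {
    awk '$1 ~ /^(drm|drm_kms_helper|ttm|radeon|nouveau|i915)$/ {print $1}' <<<"$1"
}

sample="Module                  Size  Used by
radeon               1000000  3
ttm                    60000  1 radeon
drm_kms_helper         30000  1 radeon
drm                   200000  5 radeon,ttm,drm_kms_helper
snd_hda_intel          50000  2"

drm_modules "$sample"
```

On a live system: drm_modules "$(lsmod)".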

This means that when you run something like glxgears, it loads mesa. mesa itself loads libdrm, and that talks to the kernel driver directly using GEM/TTM. Yes, glxgears talks to the kernel driver directly to show you some spinning gears, and sternly remind you of the benchmarking contents of said utility.

If you run ls /usr/lib64/libdrm_*, you'll note that there are hardware-specific libraries. For cases when GEM/TTM aren't enough, the mesa and X server drivers will have a set of private ioctls to talk to the kernel, which are encapsulated in here. libdrm itself doesn't actually load these.

The X server needs to know what’s happening here, though, so it can do things like synchronization. This synchronization between your glxgears, the kernel, and the X server is called DRI, or more accurately, DRI2. “DRI” stands for “Direct Rendering Infrastructure”, but it’s sort of a strange acronym. “DRI” refers to both the project that glued mesa and Xorg together (introducing DRM and a bunch of the things I talk about in this article), as well as the DRI protocol and library. DRI 1 wasn’t really that good, so we threw it out and replaced it with DRI 2.
KMS

As a sort of aside, let’s say you’re working on a new X server, or maybe you want to show graphics on a VT without using an X server. How do you do it? You have to configure the actual hardware to be able to put up graphics. Inside of libdrm and the kernel, there’s a special subsystem that does exactly that, called KMS, which stands for “Kernel Mode Setting”. Again, through a set of ioctls, it’s possible to set up a graphics mode, map a framebuffer, and so on, to display directly on a TTY. Before, there were (and still are) hardware-specific ioctls, so a shared library called libkms was created to give a shared API. Eventually, there was a new API at the kernel level, literally called the “dumb ioctls”. With the new dumb ioctls in place, it is recommended to use those and not libkms.

While it’s very low-level, it’s entirely possible to do. Plymouth, the boot splash screen integrated into modern distributions, is a good example of a very simple application that does this to set up graphics without relying on an X server.
The “Expose” model, Redirection, TFP, Compositing, AIGLX

I’ve used the term “compositing window manager” without really describing what it means to composite, or what a window manager does. You see, back in the 80s when the X Window System was designed on UNIX systems, a lot of random other companies like HP, Digital Equipment Corp., Sun Microsystems, and SGI were also developing products based on the X Window System. X11 intentionally didn’t mandate any basic policy on how windows were to be controlled, and delegated that responsibility to a separate process called the “window manager”.

As an example, the popular environment at the time, CDE, had a system called “focus follows mouse”, which focused windows when the user moved their mouse over the window itself. This is different from the more popular model that Windows and Mac OS X use by default, “click to focus”.

As window managers became more and more complex, documents started to appear describing interoperability between the many different environments. These documents, too, don’t mandate any particular policy like “click to focus”.

Additionally, back in the 80s, many systems did not have much memory. They could not store the entire pixel contents of every window. Windows and X11 solve this issue in the same way: window contents are lossy. Namely, a program will be notified when part of its window has been “exposed”.

Imagine a set of windows like this. Now let’s say the user drags GIMP away:

The area in the dark gray has been exposed. An ExposeEvent will get sent to the program that owns the window, and it will have to redraw contents. This is why hung programs in older versions of Windows or Linux would go blank after you dragged a window over them. Consider the fact that in Windows, the desktop itself is just another program without any special privileges, and can hang like all the others, and you have one hell of a bug report.
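The bookkeeping behind an ExposeEvent is just rectangle arithmetic. Here’s a toy version, with rectangles as (x, y, w, h) tuples; the function names are mine for illustration, not anything the X server actually calls them:

```python
def intersect(a, b):
    """Intersection of two (x, y, w, h) rectangles, or None if disjoint."""
    x0, y0 = max(a[0], b[0]), max(a[1], b[1])
    x1 = min(a[0] + a[2], b[0] + b[2])
    y1 = min(a[1] + a[3], b[1] + b[3])
    return (x0, y0, x1 - x0, y1 - y0) if x0 < x1 and y0 < y1 else None

def subtract(below, above):
    """The up-to-four strips of `below` not covered by `above`."""
    hole = intersect(below, above)
    if hole is None:
        return [below]
    bx, by, bw, bh = below
    hx, hy, hw, hh = hole
    out = []
    if by < hy:           out.append((bx, by, bw, hy - by))                 # top
    if hy + hh < by + bh: out.append((bx, hy + hh, bw, by + bh - hy - hh))  # bottom
    if bx < hx:           out.append((bx, hy, hx - bx, hh))                 # left
    if hx + hw < bx + bw: out.append((hx + hw, hy, bx + bw - hx - hw, hh))  # right
    return out

def exposed_on_move(window, occluder_old, occluder_new):
    """Which parts of `window` need Expose events when the window on top
    of it moves: the area that was covered before and isn't covered now."""
    was_covered = intersect(window, occluder_old)
    if was_covered is None:
        return []
    return subtract(was_covered, occluder_new)
```

Drag GIMP fully off the window below it and the whole formerly covered rectangle comes back as one exposed region; drag it halfway and you get the strips.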

So, now that machines have as much memory as they do, we have the opportunity to make X11 windows lossless, by a mechanism called redirection. When we redirect a window, the X server will create backing pixmaps for each window, instead of drawing directly to the backbuffer. This means that the window will be hidden entirely. Something else has to take the opportunity to display the pixels in the memory buffer.

The composite extension allows a compositing window manager, or “compositor”, to set up something called the Composite Overlay Window, or COW. The compositor owns the COW, and can paint to it. When you run Compiz or GNOME Shell, these programs are using OpenGL to display the redirected windows on the screen. The X server can give them the window contents by a GL extension, “Texture from Pixmap”, or TFP. It lets an OpenGL program use an X11 Pixmap as if it were an OpenGL Texture.

Compositing window managers don’t have to use TFP or OpenGL, per se, it’s just the easiest way to do so. They could, if they wanted to, draw the window pixmaps onto the COW normally. I’ve been told that kwin4 uses Qt directly to composite windows.
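Stripped of all the OpenGL machinery, compositing is just a painter’s-algorithm loop over the redirected windows’ pixmaps. A toy sketch, with pixmaps as plain lists of pixel rows, nothing like the real X11 data structures:

```python
def composite(cow_w, cow_h, windows):
    """Paint redirected windows onto the Composite Overlay Window.
    `windows` is a list of (x, y, pixmap) in stacking order, bottom-most
    first; each pixmap is a list of rows of pixel values. Returns the
    COW contents as a 2D list."""
    cow = [[0] * cow_w for _ in range(cow_h)]        # 0 = background
    for wx, wy, pixmap in windows:                   # bottom-most first
        for row, line in enumerate(pixmap):
            for col, pixel in enumerate(line):
                x, y = wx + col, wy + row
                if 0 <= x < cow_w and 0 <= y < cow_h:
                    cow[y][x] = pixel                # later (higher) windows win
    return cow
```

A real compositor does the same loop, except the “pixmaps” are textures living on the GPU (via TFP) and the “paint” is a textured quad per window.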

Compositing window managers grab the pixmap from the X server using TFP, and then render it on to the OpenGL scene at just the right place, giving the illusion that the thing that you think you’re clicking on is the actual X11 window. It may sound silly to describe this as an illusion, but play around with GNOME Shell and adjust the size or position of the window actors (Enter global.get_window_actors().forEach(function(w) { w.scale_x = w.scale_y = 0.5; }) in the looking glass), and you’ll quickly see the illusion break down in front of you, as you realize that when you click, you’re poking straight through the video player, and to the actual window underneath (Change 0.5 to 1.0 in the above snippet to revert the behavior).

With this information, let’s explain one more acronym: AIGLX. AIGLX stands for “Accelerated Indirect GLX”. As X11 is a networked protocol, OpenGL has to work over the network. When OpenGL is used over the network, this is called an “indirect context”, versus a “direct context” where things are on the same machine. The network protocol used in an “indirect context” is fairly incomplete and unstable.

To understand the design decision behind AIGLX, you have to understand the problem that it was trying to solve: making compositing window managers like Compiz fast. While nvidia’s proprietary driver had kernel-level memory management through a custom interface, the open-source stack at this point hadn’t achieved that yet. Pulling a window texture from the X server into the graphics hardware would have meant that it would have to be copied every time the window was updated. Slow. As such, AIGLX was a temporary hack to implement OpenGL in software, avoiding the copy into the graphics hardware. As the scene that compositors like Compiz used wasn’t very complex, it worked well enough.

Despite all the fanfare and Phoronix articles, AIGLX hasn’t been used realistically for a while, as we now have the entire DRI stack which can be used to implement TFP without a copy.

As you can imagine, copying (or more accurately, sampling) the window texture contents so that it can be painted as an OpenGL texture requires copying data. As such, most window managers have a feature to turn redirection off for a window that’s full-screen. It may sound a bit silly to describe this as unredirection, as that’s the initial state a window is in. But while it may be the initial state for a window, with our modern Linux desktops, it’s hardly the usual state. The logic here is that if a window would be covering the COW anyway and compositing features aren’t necessary, it can safely be unredirected. This feature is designed to give high performance to programs like games, which need to run at 60 frames per second.
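The decision itself is simple enough to sketch. The function name and signature here are made up for illustration, but this is roughly the test a window manager performs:

```python
def should_unredirect(window, screen, compositing_effects_active):
    """Decide whether a window can bypass the compositor.
    `window` is (x, y, w, h), `screen` is (w, h). A window qualifies when
    it exactly covers the screen (so it would hide the COW anyway) and no
    compositing effect currently needs its pixels."""
    x, y, w, h = window
    fullscreen = (x, y) == (0, 0) and (w, h) == screen
    return fullscreen and not compositing_effects_active
```

Start an overview animation or any effect that samples the window’s texture, and the window has to be redirected again, which is why games can stutter the moment you open such an effect.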
Wayland

As you can see, we’ve split out quite a large bit of the original infrastructure from X’s initial monolithic behavior. This isn’t the only place where we’ve torn down the monolithic parts of X: a lot of the input device handling has moved into the kernel with evdev, and things like device hotplug support have been moved out to udev.

A large reason that the X Window System has stuck around for so long is just that it was a lot of effort to replace it. With Xorg stripped down from what it initially was, and with a large amount of the functionality required for a modern desktop environment provided solely by extensions, well, how can I say this, X is overdue.

Enter Wayland. Wayland reuses a lot of existing infrastructure that we’ve built up. One of the most controversial things about it is that it lacks any sort of network transparency or drawing protocol. X’s network transparency falls flat in modern times. A large amount of Linux features are hosted in places like DBus, not X, and it’s a shame to see things like Drag and Drop and Clipboard support being, by and large, hacks on top of the X Window System solely for network support.

Wayland can use almost the entire stack as detailed above to get a framebuffer on your monitor up and running. Wayland still has a protocol, but it’s based around UNIX sockets and local resources. The most drastic change is that there is no /usr/bin/wayland binary running like there is a /usr/bin/Xorg. Instead, Wayland follows the modern desktop’s advice and moves all of this into the window manager process instead. These window managers, more accurately called “compositors” in Wayland terms, are in charge of pulling events from the kernel with a system like evdev, setting up a frame buffer using KMS and DRM, and displaying windows on the screen with whatever drawing stack they want, including OpenGL. While this may sound like a lot of code, since these subsystems have moved elsewhere, code to do all of these things would probably be on the order of 2000-3000 SLOC. Consider that the portion of mutter just to implement a sane window focus and stacking policy and synchronize it with the X server is around 4000-5000 SLOC, and maybe you’ll understand my fascination a bit more.

While Wayland does have a library that implementations of both clients and compositors probably should use, it’s simply a reference implementation of the specified Wayland protocol. Somebody could write a Wayland compositor entirely in Python or Ruby and implement the protocol in pure Python, without the help of a libwayland.

Wayland clients talk to the compositor and request a buffer. The compositor will hand them back a buffer that they can draw into, using OpenGL, cairo, or whatever. The compositor is free to do whatever it wants with that buffer – display it normally because it’s awesome, set it on fire because the application is being annoying, or spin it on a cube because we need more YouTube videos of Linux cube spinning.
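Here’s a toy version of that request/reply dance over a local socket. To be clear, the message format below is invented for illustration and is not the real Wayland wire format; a real compositor would also hand back shared memory or a GPU buffer handle rather than a plain byte array:

```python
import socket
import struct
import threading

def compositor(sock):
    """Toy compositor: waits for one made-up 'create_buffer' request
    (opcode 1, width, height) and replies with the buffer size in bytes."""
    opcode, width, height = struct.unpack("III", sock.recv(12))
    assert opcode == 1                                   # our invented opcode
    sock.sendall(struct.pack("I", width * height * 4))   # 4 bytes per pixel

def client(sock):
    """Toy client: asks for a 640x480 buffer and gets something to draw into."""
    sock.sendall(struct.pack("III", 1, 640, 480))
    (size,) = struct.unpack("I", sock.recv(4))
    return bytearray(size)

# Like Wayland, everything is local: the two ends are just a UNIX socket pair.
server_end, client_end = socket.socketpair(socket.AF_UNIX)
t = threading.Thread(target=compositor, args=(server_end,))
t.start()
buf = client(client_end)
t.join()
```

The point of the sketch is the shape of the system: no drawing commands cross the socket, only small messages about buffers, and the pixels themselves stay in local memory the whole time.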

The compositor is also in control of input and event handling. If you tried out the thing with setting the scale of the windows in GNOME Shell above, you may have been confused at first, and then figured out that your mouse corresponded to the untransformed window. This is because we weren’t *actually* affecting the X11 window itself, just changing how it gets displayed. The X server keeps track of where windows are, and it’s up to the compositing window manager to display them where the X server thinks it is, otherwise, confusion happens.

Since a Wayland compositor is in charge of reading from evdev and giving windows events, it probably has a much better idea of where a window is, and can do the transformations internally, meaning that not only can we spin windows on a cube temporarily, we will be able to interact with windows on a cube.
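Since the compositor owns both the drawing and the input routing, hit-testing against the transformed geometry is only a few lines. The window list and scale field here are made up for illustration:

```python
def pick_window(pointer, windows):
    """Route a pointer event to a window, the way a Wayland compositor can.
    `windows` is top-most first, each entry (win_id, (x, y, w, h, scale)).
    Because the same process scales the window for display, it can hit-test
    against the *scaled* geometry and translate the coordinates back into
    the window's own space. (An X11 server only knows the untransformed
    geometry, which is why the GNOME Shell scaling trick breaks input.)"""
    px, py = pointer
    for win_id, (x, y, w, h, scale) in windows:
        if x <= px < x + w * scale and y <= py < y + h * scale:
            return win_id, ((px - x) / scale, (py - y) / scale)
    return None, None
```

With a window scaled to half size, clicking inside its shrunken on-screen area hits it, and clicking just outside falls through to whatever is underneath, exactly what you’d expect and exactly what X11 can’t do.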
Summary

I still hear a lot today that Xorg is very monolithic in its implementation. While this is very true, it’s less true than it was a long while ago. This isn’t due to incompetence on the part of the Xorg developers; a large amount of this is due to baggage that we just have to support, like the hardware-accelerated XRender protocol, or going back even further, non-anti-aliased drawing commands like XPolyFill. While it’s very apparent that X is going to go away in favor of Wayland some time soon, I want to make it clear that a lot of this is happening with the acknowledgement and help of the Xorg and desktop developers. They’re not stubborn, and they’re not incompetent. Hell, for dealing with and implementing a 30-year-old protocol plus history, they’re doing an excellent job, especially with the new architecture.

I also want to give a big shout-out to everybody who worked on the stuff I mentioned in this article, and also want to give my personal thanks to Owen Taylor, Ray Strode and Adam Jackson for being extremely patient and answering all my dumb questions, and to Dave Airlie and Adam Jackson for helping me technically review this article.

While I went into some level of detail in each of these pieces, there’s a lot more here where you could study in a lot more detail if it interests you. To pick just a few examples, you could study the geometry algorithms and theories that cairo exploits to convert arbitrary shapes to trapezoids. Or maybe the fast software rendering algorithm for trapezoids by Carl Worth and Keith Packard and investigate why it’s fast. Consider looking at the design of DRI2, and how it differs from DRI1. Or maybe you’re interested in the hardware itself, and looking at graphics card architecture, and looking at the data sheets to see how you would program one. And if you want to help out in any of these areas, I assume all the projects listed above would be more than happy to have contributions.

I’m planning on writing more of these in the future. A large amount of the stacks used in the Linux and GNOME community today don’t have a good overview document detailing them from a very high level.
Version History

* Published
* Converted CSS-based diagrams to images to make it easier for Planet GNOME readers
* Fixed Looking Glass example line
* Fixed TFP link
* Modified the tone of the policy of window managers section to better convey the point
* Corrected information about GTK+ and cairo integration (GTK+ 2.8, not “near the end of the cycle”)
* Fixed acronym expansion for “fglrx”

This entry was posted in Uncategorized by Jasper St. Pierre.
post #14 of 16 Old 06-25-2012, 11:54 AM
AVS Special Member
Mac The Knife
Join Date: Oct 2003
Location: Phoenix, AZ
Posts: 4,903
Mentioned: 0 Post(s)
Tagged: 0 Thread(s)
Quoted: 0 Post(s)
Liked: 23
Mac The Knife is offline  
post #15 of 16 Old 06-25-2012, 12:20 PM
AVS Special Member
tux99
Join Date: Jan 2005
Location: Europe
Posts: 1,523
Quote:
Originally Posted by Mac The Knife View Post

And more on Catalyst and XvBA:
XBMC's Thoughts On XvBA: AMD Catalyst Has Problems

Yes, I already posted this in the other thread...
http://www.avsforum.com/t/1417224/encouraging-experience-with-xmbcbuntu-on-ati#post_22161833

My Linux news / reviews / tips+tricks / downloads web site: http://www.linuxtech.net/
tux99 is offline  
post #16 of 16 Old 09-27-2012, 07:36 PM
Advanced Member
CityK
Join Date: Nov 2002
Posts: 869
Quote:
Originally Posted by CityK View Post

Current Status
In a nutshell, currently only MPEG2 decoding is available via the XvMC & VDPAU state trackers on Gallium3D R300 & R600. At the beginning of this year, some ground work for H.264 decode was laid down in the VDPAU state tracker, but that's unfinished work -- let me get back to this point in a second.
Video decode playback via the state trackers with Gallium is done in the 3D engine via shaders, not with the dedicated UVD engine. AMD's OSS driver developers are looking into trying to get UVD support, but they may not be able to because of the legal mumbo jumbo surrounding protecting the DRM (digital rights/restrictions management)... however, that said, it is apparent that code review is (or was) undertaken in this regard. Getting back to the point about the VDPAU state tracker, apparently part of the reason why not much further progress with the H.264 implementation has occurred (or at least as far as I'm aware) is the diversion of resources towards the possibility of adding UVD support. Anyway, the future will tell us more.
The future has indeed revealed a little more information. In the ongoing thread "Bridgeman ...." in the Phoronix forums, there are a few interesting bits to be found:
- Tim Writer's (the guy taking much of Bridgeman's former position) comment on XvBA
- and Christian König's commentary about both the state tracker h.264 implementation and UVD support

So, disappointing on one level and pretty much a case of same as it ever was on a couple of others.... until it isn't.
CityK is offline  