<![CDATA[Gizmodo: parallel processing]]> http://tags.gizmodo.com/assets/base/img/thumbs140x140/gizmodo.com.png http://gizmodo.com/tag/parallelprocessing <![CDATA[Giz Explains: Snow Leopard's Grand Central Dispatch]]> You've probably heard about this snow kitty operating system for Macintosh computers. What you might not've heard is exactly how it's supposed to be unleashing the power of all those processor cores crammed inside your computer.

The heart of the matter is that the trick to actually utilizing the full power of multiple processors—or multiple cores within a processor, like the Core 2 Duo you've probably got in your computer if you bought in the last two years—is processing things in parallel. That is, doing lots of stuff side by side. After all, you've got 2, maybe 4 or even 8 processors at your disposal, so to use them as efficiently as possible, you want to pull a problem apart and throw a piece of it at each core, or at least send different problems to different cores. Sounds logical, right? Easy, even.

The rub is that writing software that can actually take advantage of all of that parallel processing at an application level isn't easy, and without software built for it, all that power is wasted. In fact, cracking the nut of parallel processing is one of the major movements in tech right now, since parallelism, while it's been around forever, has been the domain of solving really big problems, not running Excel sheets on your laptop. It's why, for instance, former Intel chair Craig Barrett told me at CES that Intel hires more software engineers than hardware engineers: to push the software paradigm shift that's gotta happen.

A big part of the reason parallel programming is hard for programmers to wrestle with is simply that most of them have never spent any time thinking about parallelism, says James Reinders, Intel's Chief Software Evangelist, who's spent decades working with parallel processing. In the single core world, more speed primarily came from a faster clock speed: all muscle. Multi-core is a different approach. Typically, the way a developer takes advantage of parallelism is by breaking their application down into threads, sub-tasks within a process that run simultaneously or in parallel. And processes are just instances of an application, the things you can see running on your machine by firing up the Task Manager in Windows, or Activity Monitor in OS X. On a multi-core system, different threads can be handled by different processors so multiple threads can be run at once. An app can run a lot faster if it was written to be multi-threaded.

One of the reasons parallel programming is tricky is that some kinds of processes are really hard to do in parallel: they have to be done sequentially. That is, one step in the program is dependent on the result from a previous step, so you can't really run those steps in parallel. And developers tend to run into problems, like a race condition, where two threads try to do something with the same piece of data and the order of events gets screwed up, resulting in corrupted data or a crash.
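
To make the race condition concrete, here's a minimal sketch in Python (our choice of language for illustration, not anything Intel or Apple showed us). Several threads bump the same counter, and a lock makes sure only one of them touches it at a time. The names `add_many` and `run_counter` are made up for this example.

```python
import threading

def add_many(counter, lock, times):
    """Add 1 to the shared counter `times` times, holding the lock for each update."""
    for _ in range(times):
        with lock:  # only one thread at a time may read-modify-write the value
            counter["value"] += 1

def run_counter(num_threads=4, times=10_000):
    """Start several threads that all update one shared counter, then return the total."""
    counter = {"value": 0}
    lock = threading.Lock()
    threads = [
        threading.Thread(target=add_many, args=(counter, lock, times))
        for _ in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for every thread to finish before reading the result
    return counter["value"]

print(run_counter())
```

Take the lock away and the read-modify-write steps from different threads can interleave, so some updates may get lost. That's the "order of events gets screwed up" problem in miniature.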

Snow Leopard's Grand Central Dispatch promises to take a lot of the headache out of parallel programming by managing everything at the OS level, using a system of blocks and queues, so developers don't even have to thread their apps in the traditional way. In the GCD system, a developer tags self-contained units of work as blocks, which are scheduled for execution and placed in a GCD queue. Queues are how GCD manages which tasks run in parallel and in what order, scheduling blocks to run when threads are free to run something.
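
The blocks-and-queues idea can be roughly sketched with Python's standard `concurrent.futures` thread pool. To be clear, this is an analogy and not GCD's actual API: `make_block` and `run_on_queue` are hypothetical names, and a thread pool stands in for a GCD queue.

```python
from concurrent.futures import ThreadPoolExecutor

def make_block(n):
    """Return a self-contained unit of work (a stand-in for a GCD block)."""
    def block():
        return n * n
    return block

def run_on_queue(blocks, workers):
    """Submit every block to a pool-backed queue and collect results in submission order."""
    with ThreadPoolExecutor(max_workers=workers) as queue:
        futures = [queue.submit(block) for block in blocks]
        return [f.result() for f in futures]

blocks = [make_block(n) for n in range(5)]
serial_results = run_on_queue(blocks, workers=1)      # one block at a time, in order
concurrent_results = run_on_queue(blocks, workers=4)  # blocks run as threads free up
print(serial_results, concurrent_results)
```

The point of the design is that the developer only describes the units of work. How many threads exist and which one picks up which block is the queue's problem.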

Reinders says he's "not convinced that parallel programming is harder, it's just different." Still, he's a "big fan of what Apple's doing with Grand Central Dispatch" because "they've made a very approachable, simple interface for developers to take advantage of the fact that Snow Leopard can run things in parallel and they're encouraging apps to take advantage of that."

How Snow Leopard handles parallelism with GCD is a little different than what Intel's doing, however. You might recall Intel just picked up RapidMind, a company that specializes in optimizing applications for parallelism. The difference between these two, at a broad level, represents two of the major approaches to parallelism: task parallelism, like GCD, or data parallelism, like RapidMind. Reinders explained it like this: If you had a million newspapers you want to cut clips out of, GCD would look at cutting from each newspaper as a task, whereas RapidMind's approach would look at it as one cutting operation to be executed in a repetitive manner. For some applications, RapidMind's approach will work better, and for some, GCD's task-based approach will work better. In particular, Reinders says something like GCD works best when a developer can "figure out what the fairly separate things to do are and you don't care where they run or in what order they run" within their app.
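
Here's a small Python sketch of the newspaper example, showing only the shape of the two approaches. Everything in it is an assumption for illustration: `clip` is a hypothetical stand-in for the cutting step, and a plain thread pool is nothing like what GCD or RapidMind actually do under the hood.

```python
from concurrent.futures import ThreadPoolExecutor

def clip(newspaper):
    """Hypothetical 'cutting' step: keep the first word of a newspaper's text."""
    return newspaper.split()[0]

def task_parallel(papers):
    """Task view: each newspaper is its own unit of work, queued separately."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(clip, paper) for paper in papers]
        return [f.result() for f in futures]

def data_parallel(papers):
    """Data view: one operation, applied across the whole collection."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(clip, papers))

newspapers = ["alpha news today", "beta daily report", "gamma evening post"]
print(task_parallel(newspapers))
print(data_parallel(newspapers))
```

On a toy like this the two give the same answer. The difference is in how the work is described: a pile of separate jobs in the first case, one operation over a big batch of data in the second, which is the form that maps more naturally onto hardware built to repeat a single calculation.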

It's also a bit different from Windows' approach to parallelism, which is app oriented, rather than managing things at the OS level, so it essentially leaves everything up to the apps—apps have got to manage their own threads, make sure they're not eating all of your resources. Which for now, isn't much of a headache, but Reinders says that there is a "valid concern on Windows that a mixture of parallel apps won't cooperate with each other as much," so you could wind up with a situation where say, four apps try to use all 16 cores in your machine, when you'd rather they split up, with say one app using eight cores, another using four, and so on. GCD addresses that problem at the system level, so there's more coordination between apps, which may make it slightly more responsive to the user, if it manages tasks correctly.

You might think that the whole parallelism thing is a bit overblown. I mean, who needs a multicore computer to run Microsoft Word, right? Well, even Word benefits from parallelism, Reinders told me. For instance, when you spool off something to the printer and it doesn't freeze, like it used to back in the day. Or spelling and grammar running as you type: it's a separate thread that's run in parallel. If it wasn't, it'd make for a miserable-ass typing experience, or you'd just have to wait until you were totally finished with a document. There's also the general march of software, since we love to have more features all the time: Reinders says his computer might be 100x faster than it was 15 years ago, but applications don't run 100x faster. They've got new features that are constantly added on to make them more powerful or nicer to use. Stuff like pretty graphics, animation and font scaling. In the future, exploiting multiple cores through parallelism might mean stuff like eyeball tracking, or actually good speech recognition.
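
The spell-check-as-you-type idea looks something like this in Python. It's a toy sketch, not how Word actually does it: the dictionary `KNOWN_WORDS` and the functions `spell_checker` and `type_text` are invented for the example.

```python
import queue
import threading

KNOWN_WORDS = {"the", "cat", "sat"}  # hypothetical tiny dictionary

def spell_checker(inbox, misspelled):
    """Background worker: check words as they arrive until a None sentinel shows up."""
    while True:
        word = inbox.get()
        if word is None:
            break
        if word not in KNOWN_WORDS:
            misspelled.append(word)

def type_text(words):
    """Simulate typing: hand each word to the checker thread without waiting on it."""
    inbox = queue.Queue()
    misspelled = []
    worker = threading.Thread(target=spell_checker, args=(inbox, misspelled))
    worker.start()
    for word in words:
        inbox.put(word)  # returns immediately, so the "typing" loop never blocks
    inbox.put(None)      # tell the worker there is nothing more to check
    worker.join()
    return misspelled

print(type_text(["the", "cta", "sat", "ct"]))
```

The typing loop just drops words in a queue and moves on, while the checking happens on its own thread. That separation is what keeps the typing from stalling.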

Reinders actually thinks that the opportunities for parallelism are limitless. "Not having an idea to use parallelism in some cases I sometimes refer to as a 'lack of imagination,'" because someone simply hasn't thought of it, the same way people back in the day thought computers for home use would be glorified electronic cookbooks. They lacked the imagination to predict things like the web. But as programmers move into parallelism, Reinders has "great expectations they're going to imagine things the rest of us," so we could see some amazing things come out of parallelism. But whether that's next week or five years from now, well, we'll see.

Still something you wanna know? Send questions about parallel processing, parallel lines or parallel universes to tips@gizmodo.com, with "Giz Explains" in the subject line.

Grand Central Terminal main concourse image from Wikimedia Commons

]]>
http://gizmodo.com/index.php?op=postcommentfeed&postId=5346616&view=rss&microfeed=true
<![CDATA[Giz Explains: GPGPU Computing, and Why It'll Melt Your Face Off]]> No, I didn't stutter: GPGPU (general-purpose computing on graphics processing units) is what's going to bring hot screaming gaming GPUs to the mainstream, with Windows 7 and Snow Leopard. Finally, everybody's face melts! Here's how.

What a Difference a Letter Makes
GPU sounds—and looks—a lot like CPU, but they're pretty different, and not just 'cause dedicated GPUs like the Radeon HD 4870 here can be massive. GPU stands for graphics processing unit, while CPU stands for central processing unit. Spelled out, you can already see the big differences between the two, but it takes some experts from Nvidia and AMD/ATI to get to the heart of what makes them so distinct.

Traditionally, a GPU does basically one thing: speed up the processing of image data that you end up seeing on your screen. As AMD Stream Computing Director Patricia Harrell told me, they're essentially chains of special purpose hardware designed to accelerate each stage of the geometry pipeline, the process of matching image data or a computer model to the pixels on your screen.

GPUs have a pretty long history (you could go all the way back to the Commodore Amiga, if you wanted to) but we're going to stick to the fairly present. That is, the last 10 years, when Nvidia's Sanford Russell says GPUs started adding cores to distribute the workload. See, graphics calculations (the calculations needed to figure out what pixels to display on your screen as you snipe someone's head off in Team Fortress 2) are particularly suited to being handled in parallel.

An example Nvidia's Russell gave to think about the difference between a traditional CPU and a GPU is this: If you were looking for a word in a book, and handed the task to a CPU, it would start at page 1 and read it all the way to the end, because it's a "serial" processor. It would be fast, but would take time because it has to go in order. A GPU, which is a "parallel" processor, "would tear [the book] into a thousand pieces" and read it all at the same time. Even if each individual word is read more slowly, the book may be read in its entirety quicker, because words are read simultaneously.
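
Russell's book analogy can be sketched in a few lines of Python. This is only an illustration of the idea: the function names are made up, and Python threads stand in for GPU cores, so don't expect this toy to show a GPU-style speedup.

```python
from concurrent.futures import ThreadPoolExecutor

def find_serial(pages, word):
    """Read page by page from the start; return the first page index containing the word."""
    for i, page in enumerate(pages):
        if word in page.split():
            return i
    return None

def find_in_chunk(chunk, offset, word):
    """Search one torn-out piece of the book; return all matching page indices."""
    return [offset + i for i, page in enumerate(chunk) if word in page.split()]

def find_parallel(pages, word, pieces=4):
    """Tear the book into pieces, search them concurrently, and merge the hits."""
    size = max(1, -(-len(pages) // pieces))  # ceiling division, at least one page per piece
    chunks = [(pages[i:i + size], i) for i in range(0, len(pages), size)]
    with ThreadPoolExecutor(max_workers=pieces) as pool:
        results = pool.map(lambda c: find_in_chunk(c[0], c[1], word), chunks)
    hits = [i for part in results for i in part]
    return min(hits) if hits else None

book = ["the quick fox", "jumps over", "the lazy dog", "and naps"]
print(find_serial(book, "lazy"), find_parallel(book, "lazy"))
```

Notice the parallel version has an extra step the serial one doesn't: the hits from each piece have to be merged back together at the end. That bookkeeping is part of what makes parallel code a different kind of programming.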

All those cores in a GPU—800 stream processors in ATI's Radeon 4870—make it really good at performing the same calculation over and over on a whole bunch of data. (Hence a common GPU spec is flops, or floating point operations per second, measured in current hardware in terms of gigaflops and teraflops.) The general-purpose CPU is better at some stuff though, as AMD's Harrell said: general programming, accessing memory randomly, executing steps in order, everyday stuff. It's true, though, that CPUs are sprouting cores, looking more and more like GPUs in some respects, as retiring Intel Chairman Craig Barrett told me.

Explosions Are Cool, But Where's the General Part?
Okay, so the thing about parallel processing—using tons of cores to break stuff up and crunch it all at once—is that applications have to be programmed to take advantage of it. It's not easy, which is why Intel at this point hires more software engineers than hardware ones. So even if the hardware's there, you still need the software to get there, and it's a whole different kind of programming.

Which brings us to OpenCL (Open Computing Language) and, to a lesser extent, CUDA. They're frameworks that make it way easier to use graphics cards for kinds of computing that aren't related to making zombie guts fly in Left 4 Dead. OpenCL is the "open standard for parallel programming of heterogeneous systems" standardized by the Khronos Group—AMD, Apple, IBM, Intel, Nvidia, Samsung and a bunch of others are involved, so it's pretty much an industry-wide thing. In semi-English, it's a cross-platform standard for parallel programming across different kinds of hardware—using both CPU and GPU—that anyone can use for free. CUDA is Nvidia's own architecture for parallel programming on its graphics cards.

OpenCL is a big part of Snow Leopard. Windows 7 will use some graphics card acceleration too (though we're really looking forward to DirectX 11). So graphics card acceleration is going to be a big part of future OSes.

So Uh, What's It Going to Do for Me?
Parallel processing is pretty great for scientists. But what about those regular people? Does it make their stuff go faster? Not everything, and to start, it's not going too far from graphics, since that's still the easiest to parallelize. But converting, decoding and creating videos (stuff you're probably using now more than you did a couple years ago) will improve dramatically soon. Say bye-bye 20-minute renders. Ditto for image editing; there'll be less waiting for effects to propagate with giant images (Photoshop CS4 already uses GPU acceleration). In gaming, beyond straight-up graphical improvements, physics engines can get more complicated and realistic.

If you're just Twittering or checking email, no, GPGPU computing is not going to melt your stone-cold face. But anyone with anything cool on their computer is going to feel the melt eventually.

]]>
http://gizmodo.com/index.php?op=postcommentfeed&postId=5252545&view=rss&microfeed=true
<![CDATA[Why Windows 7 Is Snappier Than Vista]]> Most people will tell you that Windows 7 is snappier than Vista, even though the raw numbers say otherwise. But it's not in your head. Windows 7 is more responsive than Vista. Here's why.

I meant to post this a few days ago, but it fits in really nicely with our benchmark testing to explain what's going on under Windows 7's hood. Microsoft obviously focused a lot on the user experience in Windows 7, so a lot of work went into improving desktop responsiveness: smoothing out the little snags or hang-ups that made people feel like Vista was too slow. Which is apparently a hard thing to do, since a million different things can cause slowdown. But the most frequent cause of hang-ups is a bottleneck caused by one graphics device interface (GDI) application, an app that taps your graphics card, waiting on another GDI app that's being all slow and crappy.

In Vista, this could happen because of the way GDI was designed: a single app could hold a system-wide global lock, so apps running simultaneously constantly jockey for the lock in order to render on the screen, and if one asshole app doesn't let go, it screws every other app waiting in line. So Microsoft re-designed the way this stuff is orchestrated, so multiple apps can "reliably" render at the same time, meaning fewer bottlenecks. Besides improving reliability, the redesign actually improved performance with multiple GDI apps running simultaneously on multi-core processors, so you'll see real benefits from going multi-core, which no doubt makes Intel's Craig Barrett happy.

Oh yes, they also reduced the memory footprint, but anybody running Windows 7 already noticed this. So yes, Windows 7 really is more responsive, even if run-of-the-mill benchmarks can't exactly measure how that is. [Engineering Windows 7]

]]>
http://gizmodo.com/index.php?op=postcommentfeed&postId=5234169&view=rss&microfeed=true
<![CDATA[Nvidia Quadro FX 5800 Claims Most Powerful Graphics Card Ever, Probably Handles Crysis OK]]> Nvidia has released what it describes as "the most powerful professional graphics card in graphics history"—the Quadro FX 5800, which packs up to 240 of Nvidia's CUDA independent graphics cores for shouldering some of the load normally handled by the main processor as well as 4GB of graphics memory, another claimed first. The 5800 is intended mostly for scientific and medical visualizations, as well as crazy complex 3D rendering. One might imagine it would also play most of your video games at a decent FPS. Price? $3500.

New NVIDIA Quadro FX 5800 Graphics Card Featuring CUDA Massively Parallel
Processing Architecture; Offers Most Robust Performance and Features to
Date for Oil and Gas Exploration, Medical Imaging, and Styling and Design
Applications

SANTA CLARA, Calif., Nov. 10 /PRNewswire/ — Professionals searching
for oil, diagnosing illness or styling the next high-performance luxury
vehicle all have one thing in common, the need for advanced visual
computing solutions. NVIDIA Corporation, the world leader in visual
computing technologies, today unveiled the most powerful professional
graphics card in graphics history — the NVIDIA(R) Quadro(R) FX 5800.

"The size and complexity of data is growing at an exponential rate. The
challenge for today's professional is to make sense of the mountain of data
by distilling it into a form they can comprehend, analyze and use to make
impactful decisions," said Jeff Brown, general manager, Professional
Solutions, NVIDIA. "At stake can be billions of investment dollars, or even
people's lives. The Quadro FX 5800 has advanced features to allow massive
datasets to be viewed beyond traditional 3D enabling professionals to make
fast and accurate decisions."

The Quadro FX 5800 graphics card offers unprecedented performance and
scalability to rapidly visualize and interpret massive datasets that until
now were unattainable on a workstation graphics board. Offering up to 240
CUDA(TM) programmable parallel cores and the industry's first 4GB of
graphics memory, the Quadro FX 5800 graphics card is ideally suited for oil
and gas exploration, medical imaging, styling and design, and scientific
visualization. Other advanced features of the Quadro FX 5800 graphics card
include:

— Interactive 4D modeling with time lapse capabilities
— Massive memory bandwidth of up to 102 GB per second
— Fill rates that exceed 52 billion texels per second and geometry
performance of 300 million triangles per second
— Support for next-generation OpenGL and Microsoft DirectX 10
applications
— Advanced multi-system and multi-device visualization environments with
Quadro G-Sync II

"Landmark's recently launched GeoProbe(R) R5000 software empowers
geoscientists with an unprecedented ability to visualize large-scale
regional datasets at full resolution from a standard Linux(R) workstation,"
said Nicholas Purday, manager of Geophysical Technologies at Landmark. "The
NVIDIA Quadro FX 5800 graphics card has a more powerful GPU and superior
triangle performance, which make it possible for the GeoProbe application
to quickly render large surfaces, and allow us to move many
computing-intensive processes to the graphics card, significantly enhancing
the overall user experience." Landmark is an industry leading software and
technology services brand of Halliburton, one of the world's largest
providers of products and services to the energy industry.

"The advanced textured graphics capabilities of the Quadro FX 5800 are
enabling CyberHeart to provide 3D radiosurgical target visualization and
definition tools for the purpose of treating cardiac arrhythmias," said
Thilaka Sumanaweera, CTO, CyberHeart. "Our applications are processing very
large data sets acquired by the state-of-the-art 64-slice CT scanners using
respiratory- and cardiac-gating. The Quadro FX cards provide us with the
extreme bandwidth necessary to support our cutting-edge technology, and
essentially, save lives." CyberHeart, Inc., is a medical device company
developing a non-invasive radiosurgical system for cardiac applications.

The Quadro FX 5800 GPU features true 10-bit color enabling billions
rather than millions of color variations for rich, vivid image quality with
the broadest dynamic range. Professionals now benefit from viewing their
models with higher degrees of precision and realism never before possible.

"Our customers are making important decisions about future products on
the basis of RTT-powered 3D real-time models," said Ludwig Fuchs, cofounder
and CEO of RTT. "The new Quadro FX 5800 will be the platform of choice to
bring that arena to the next level. Higher levels of realism, physical
correctness and large models are now made possible through a double number
of cores and a generous frame buffer." Realtime Technology AG is a leading
supplier of real-time visualization technology and virtual prototyping
solutions to the automotive, aerospace, industrial and consumer goods
design industries.

Pricing and Availability

NVIDIA Quadro solutions are widely available through leading PC
manufacturers and workstation system integrators and NVIDIA channel
partners PNY Technologies (US and EMEA), Leadtek (APAC) and Elsa (Japan).
The Quadro FX 5800 graphics board has an MSRP of $3499 USD. For more
information about the full lineup of NVIDIA professional solutions please
visit http://www.nvidia.com/quadro.

]]>
http://gizmodo.com/index.php?op=postcommentfeed&postId=5081976&view=rss&microfeed=true
<![CDATA[Windows 7 Getting (Kinda) Optimized for Parallel Processing]]> Besides looking a lot like Vista—and we mean a lot—Microsoft has said Windows 7 uses a lot of the same foundation, too, to keep upgrade migraines to a minimum. The problem is that its core ain't so suited to parallel computing, one of rival Snow Leopard's few headline features. So they're actually implementing some deep-level tweaks to bring it up to speed and make it more parallel processing friendly.

It's actually a significant process, since as Craig Mundie, Microsoft's Chief Research and Strategy Officer, admits, "Win32 was never designed for highly concurrent, asynchronous processing" and "parallelism requires adjustments at every level of the stack." The first steps toward the larger project of re-arranging tasks and runtimes in different layers to take advantage of multiple cores will happen in Windows 7 though, such as an updated scheduler. There will be other adjustments along these lines as well, though we probably won't know everything until October.

So while it's unlikely that Windows 7 will be as deeply in tune with parallel processing as Snow Leopard looks to be, it'll definitely be able to use a SWAT team of cores better than your Vista box will, and set the stage for Windows 8 to have a solid parallel processing foundation. [ZD Net]

]]>
http://gizmodo.com/index.php?op=postcommentfeed&postId=5056961&view=rss&microfeed=true
<![CDATA[Wooden Squirrel Cage Machine Obsesses Over Your Thoughts For You]]> Columbia professor Douglas Irving Repetto designed this crazy looking project which allows humans to write obsessive thoughts on scraps of paper, deposit them in one of seven squirrel cages, and spin them round and round to let the machine obsess for them. Made with grape arbor, glue, rubber bands and a laser cutter, the apparatus utilizes “parallel processing to the age-old problem of broken human minds.” Yeah, I'm not sure I quite get it either, but it sure is pretty. Check out Repetto's site to see a video of his Distributed Squirrel Cage for Parallel Processing in action! [Douglas Repetto via MAKE]

]]>
http://gizmodo.com/index.php?op=postcommentfeed&postId=5051531&view=rss&microfeed=true
<![CDATA[Microsoft: DirectX 11 To Use GPU For Parallel Processing]]> DirectX 11 is coming, and it looks pretty awesome. Sure, you get advancements in shading and better support for multi-core machines, but what's really got our heads turning is the concept of letting programmers use the GPU in your video card to do some of the heavy lifting, meaning your graphics chip becomes a second, parallel processor. While the idea itself isn't new, this is the first we've heard of DirectX using such technology and we're sure it'll have PC gaming fanboys drooling when it rolls out, whenever that happens to be. [Joystiq]

]]>
http://gizmodo.com/index.php?op=postcommentfeed&postId=5028013&view=rss&microfeed=true
<![CDATA[Giz Explains: Mac OS 10.6 Snow Leopard Parallel Processing and GPU Computing]]> As you've probably heard, the next version of OS X, Snow Leopard, will not wow us with a crazy circus of features like Time Machine and Boot Camp. So why would Apple spend a year programming an OS that they can't boast has over 300 new features? Here's a quick rundown of how Apple is totally rebuilding OS X to take advantage of Core 2 Duos, graphics cards and parallel processing, in order to deliver serious performance gains. And yes, that is a big deal.

This is not going to be a super technical breakdown of parallel computing for the super nerdy, just a rough overview for my mom. Basically, parallel processing is what it sounds like: Multiple computations or processes or um, just "things," are carried out or done simultaneously, in parallel (at the same time!). Multi-core processors like Intel's ubiquitous Core 2 Duo have quickly become mainstream. They're really good at doing several things at once, since each processor core can crunch away on something: more cores, more simultaneous Captain Crunching, more faster. A brilliant consumer taste of this was actually Rosetta on OS X. On a dual-core system, one core would be "translating" the code from the PPC version, while the other ran the program (roughly speaking).

Sounds gravy right? Well, as Steve alluded in his explanation of Snow Leopard, parallel programs ain't easy to write—they're harder than sequential ones for sure, 'cause it requires the kind of math that can be broken up into little parts you can solve independently and then put back together again. Artificial intelligence, for instance, is not cakey for this. On the other hand, something like tomography—a technique for creating 3D images—totally is, because it's highly vectorizable. Or video stuff (cause you can easily divvy up the chores), videogame graphics and physics, generally.

No surprise that modern graphics cards are actually really good at parallel processing, 'cause of the way they're architected and because they usually have a buttload of cores. Nvidia's latest high-end GeForce card, the GTX 280, has 240. (It's why they're suitable for cheap supercomputers.) Nvidia, for instance, showed me some of the insane physics jujitsu the GTX 280 can pull off, and it and ATI both have crazy new graphics cards (Tesla 10P and FireStream 9250, respectively) built for "general purpose" supercomputing. Sony's Cell is sorta like this with multiple cores, but none of these are very good general processors the way stuff is designed now. (You don't see any computers running on an ATI Radeon CPU, or Cell handling the main workload on Toshiba's new laptops, do you?)

You'll note that part of Snow Leopard's feature list is OpenCL, an easy way for developers to tap the parallel processing power of graphics cards, in addition to being optimized for multiple cores courtesy of its "Grand Central" tech set. So Snow Leopard is pretty much all about parallel processing. (Microsoft hasn't been overly vocal about Windows and parallel computing.)

From what Apple has said, and the whole "Grand Central" deal (it "takes full advantage by making all of Mac OS X multicore aware and optimizing it for allocating tasks across multiple cores and processors"), it's clear that Apple is totally re-architecting Snow Leopard around parallel processing, with Grand Central acting much like the real one, organizing, assigning and scheduling a whole bunch of tasks/trains along a bunch of different paths/tracks. It's a major undertaking (Intel and Microsoft are throwing a ton of money at parallel computing themselves) and we're pretty curious about how Apple is going to make parallel programming easier for programmers in a way supposedly no one's done before.

Something we missed, or you still wanna know? Send any questions about processors, prostates, Bananas or anything else to tips@gizmodo.com, with "Giz Explains" in the subject line.

]]>
http://gizmodo.com/index.php?op=postcommentfeed&postId=5017615&view=rss&microfeed=true
<![CDATA[Steve Jobs Explains OS X Snow Leopard in Three Easy Steps]]> The NY Times has a good interview with Steve Jobs in which Apple's CEO lets fly with very quotable, very understandable quotes about OS X 10.6. We already heard the details, but it was still hard to wrap our head around why Apple would make an operating system without many visible features and just go and change architecture around. He explains that they're doing it because programmers don't know WTF is going on with parallel processing.

1.

The way the processor industry is going is to add more and more cores, but nobody knows how to program those things. I mean, two, yeah; four, not really; eight, forget it.

Jobs claims that Apple's made a "breakthrough" in parallel programming called Grand Central, which he alluded to in his keynote yesterday. He didn't, however, go into details about how it works and why it's going to revolutionize dividing up tasks across multiple processors in ways that other operating systems haven't yet.

What's also interesting is the ability to bring the GPU (your graphics card) into the processing role to help out your CPU. Apple's calling this newly proposed standard OpenCL (Open Computing Language).

2.

Basically it lets you use graphics processors to do computation. It’s way beyond what Nvidia or anyone else has, and it’s really simple.

It's vaguely similar to the way that Photoshop CS 4 will use your graphics card to help process image manipulation and help out in rendering 3D models as well.

Will there be more features like Time Machine? Not according to Jobs.
3.

“We’ve added over a thousand features to Mac OS X in the last five years,” he said Monday in an interview after his presentation. “We’re going to hit the pause button on new features.”

Seems to us that Snow Leopard won't be heavy on the features, but it will increase processing speeds for people who are heavy on the processing in their daily computing and have more than just a few cores—a place we're all heading to in the next few years. [NYT]

]]>
http://gizmodo.com/index.php?op=postcommentfeed&postId=5015116&view=rss&microfeed=true