2 Replies Latest reply: Mar 27, 2013 9:30 PM by Grant Tao

The computing power of Vivante's GC2000 GPU in the i.MX6Q

Grant Tao Level 1

Vivante provides an OpenCL driver for the GC2000 GPU in Freescale's quad-core
application processor, the i.MX6Q. From the information the OpenCL driver reports, let's figure out the GC2000's structure, what data types it can handle, and how powerful it is.

1. Important figures
The most important information we got from the output file is:
CL_PLATFORM_PROFILE: EMBEDDED_PROFILE
CL_PLATFORM_VERSION: OpenCL 1.1
Number of Available Computing Devices: 1
Computing Device Parameters:
CL_DEVICE_NAME: Vivante OpenCL Device
CL_DEVICE_VENDOR: Vivante Corporation
CL_DEVICE_TYPE: GPU
CL_DEVICE_MAX_COMPUTE_UNITS: 4
CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS: 3
CL_DEVICE_MAX_CLOCK_FREQUENCY: 500 MHz
CL_DEVICE_IMAGE_SUPPORT: Yes
So the number of computing devices shows that the OpenCL driver provided by Freescale supports only GPU computing; the four ARM Cortex-A9 CPU cores are not exposed as OpenCL devices. The GC2000 has 4 shaders (compute units), the clock frequency is 500 MHz, and it also supports OpenCL image processing. That is really good news.
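
To work with these figures programmatically, the driver output can be parsed into a dictionary. The following is a minimal sketch in Python, assuming the dump is plain text with `CL_...: value` pairs as in the listing above. The function name `parse_cl_dump` and the `SAMPLE` string are made up for illustration; they are not part of the Vivante driver or its tools.

```python
import re

# Sample lines copied from the listing above; a real dump would be read from a file.
SAMPLE = """\
CL_PLATFORM_PROFILE: EMBEDDED_PROFILE
CL_PLATFORM_VERSION: OpenCL 1.1
CL_DEVICE_TYPE: GPU
CL_DEVICE_MAX_COMPUTE_UNITS: 4
CL_DEVICE_MAX_CLOCK_FREQUENCY: 500 MHz
CL_DEVICE_IMAGE_SUPPORT: Yes
"""

def parse_cl_dump(text):
    """Return {parameter name: value string} for every CL_* entry in text."""
    pairs = re.findall(r"(CL_[A-Z_]+):[ \t]*(.+)", text)
    return {name: value.strip() for name, value in pairs}

info = parse_cl_dump(SAMPLE)
compute_units = int(info["CL_DEVICE_MAX_COMPUTE_UNITS"])
# The clock value carries a unit suffix ("500 MHz"), so keep only the number.
clock_mhz = int(info["CL_DEVICE_MAX_CLOCK_FREQUENCY"].split()[0])
print(compute_units, "compute units at", clock_mhz, "MHz")
```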


2. Data types the GC2000 can handle
The following parameters describe the data types it can handle:
CL_DEVICE_PREFERRED_VECTOR_WIDTH_CHAR: 4
CL_DEVICE_PREFERRED_VECTOR_WIDTH_SHORT: 4
CL_DEVICE_PREFERRED_VECTOR_WIDTH_INT: 4
CL_DEVICE_PREFERRED_VECTOR_WIDTH_LONG: 0
CL_DEVICE_PREFERRED_VECTOR_WIDTH_FLOAT: 4
CL_DEVICE_PREFERRED_VECTOR_WIDTH_DOUBLE: 0
CL_DEVICE_NATIVE_VECTOR_WIDTH_CHAR: 4
CL_DEVICE_NATIVE_VECTOR_WIDTH_SHORT: 4
CL_DEVICE_NATIVE_VECTOR_WIDTH_INT: 4
CL_DEVICE_NATIVE_VECTOR_WIDTH_LONG: 0
CL_DEVICE_NATIVE_VECTOR_WIDTH_FLOAT: 4
CL_DEVICE_NATIVE_VECTOR_WIDTH_DOUBLE: 0
These parameters show that the GC2000 can handle char (8-bit), short (16-bit), int
(32-bit), and float (32-bit), but not long or double.
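
The same reading of the table can be written down as a short Python sketch. It assumes, as the paragraph above does, that a native vector width of 0 means the type is not supported. The dictionary names are illustrative; the element sizes are the standard OpenCL C scalar sizes.

```python
# Native vector widths copied from the listing above.
NATIVE_VECTOR_WIDTH = {
    "char": 4, "short": 4, "int": 4, "long": 0, "float": 4, "double": 0,
}

# Scalar sizes in bytes as defined by OpenCL C.
ELEMENT_BYTES = {
    "char": 1, "short": 2, "int": 4, "long": 8, "float": 4, "double": 8,
}

# Assumption: a reported width of 0 means "type not supported".
supported = [t for t, w in NATIVE_VECTOR_WIDTH.items() if w > 0]
unsupported = [t for t, w in NATIVE_VECTOR_WIDTH.items() if w == 0]

# Width in bits of one native vector of each supported type.
vector_bits = {
    t: NATIVE_VECTOR_WIDTH[t] * ELEMENT_BYTES[t] * 8 for t in supported
}

for t in supported:
    print(f"{t}{NATIVE_VECTOR_WIDTH[t]}: {vector_bits[t]} bits per vector")
print("not supported:", ", ".join(unsupported))
```

Under these assumptions the widest native vectors are int4 and float4 at 128 bits each.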

3. GPU memory parameters
The following parameters describe the memory and cache used by the GPU:
CL_DEVICE_ADDRESS_BITS: 32
CL_DEVICE_GLOBAL_MEM_SIZE: 96 MByte
CL_DEVICE_MAX_MEM_ALLOC_SIZE: 48 MByte
CL_DEVICE_GLOBAL_MEM_CACHE_TYPE: Read/Write
CL_DEVICE_GLOBAL_MEM_CACHELINE_SIZE: 64
CL_DEVICE_GLOBAL_MEM_CACHE_SIZE: 4096
CL_DEVICE_LOCAL_MEM_SIZE: 1 KByte
CL_DEVICE_LOCAL_MEM_TYPE: Global
CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE: 4 KByte
CL_DEVICE_MAX_CONSTANT_ARGS: 9

You can see that the local memory of the GC2000 is extremely small, only 1 KByte,
and the global memory cache is only 4096 bytes with 64-byte cache lines, which seems ridiculous. These
limitations force the GPU to fetch most of its data directly from global memory. Because Freescale
uses a dual 64-bit AXI bus architecture, this severely constrains the performance of this
GPU.
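
To put these sizes in perspective, here is a small back-of-the-envelope calculation in Python using the values from the listing above. It assumes that "KByte" and "MByte" mean 1024 and 1024 * 1024 bytes, and that the cache and cache line sizes are reported in bytes; the variable names are illustrative.

```python
# Values copied from the listing above (assuming binary KByte/MByte).
LOCAL_MEM_BYTES = 1 * 1024
CACHE_BYTES = 4096
CACHELINE_BYTES = 64
MAX_ALLOC_BYTES = 48 * 1024 * 1024

FLOAT_BYTES = 4    # one 32-bit float
FLOAT4_BYTES = 16  # one float4 vector

# How much data fits in local memory.
floats_in_local = LOCAL_MEM_BYTES // FLOAT_BYTES
float4_in_local = LOCAL_MEM_BYTES // FLOAT4_BYTES

# How many cache lines the global memory cache holds.
cache_lines = CACHE_BYTES // CACHELINE_BYTES

# How many floats fit in the largest single allocation.
floats_in_max_alloc = MAX_ALLOC_BYTES // FLOAT_BYTES

print("floats in local memory:", floats_in_local)
print("float4 vectors in local memory:", float4_in_local)
print("global cache lines:", cache_lines)
print("floats in the largest buffer:", floats_in_max_alloc)
```

With only 256 floats of local memory and 64 cache lines, a kernel working on a buffer of millions of floats has very little on-chip storage to reuse data from.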

In the next discussion, I will show you the theoretical and actual computing power of this tiny GPU.