summaryrefslogtreecommitdiff
path: root/drivers/misc/habanalabs/habanalabs.h
AgeCommit message (Collapse)Author
2019-09-05habanalabs: stop using the acronym KMDOded Gabbay
We want to stop using the acronym KMD. Therefore, replace all locations (except for register names we can't modify) where KMD is written to other terms such as "Linux kernel driver" or "Host kernel driver", etc. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
2019-09-05habanalabs: add uapi to retrieve aggregate H/W eventsOded Gabbay
Add a new opcode to INFO IOCTL to retrieve aggregate H/W events. i.e. the events counters are NOT cleared upon device reset, but count from the loading of the driver. Add the code to support it in the device event handling function. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
2019-09-05habanalabs: add uapi to retrieve device utilizationOded Gabbay
Users and sysadmins usually want to know what is the device utilization as a level 0 indication if they are efficiently using the device. Add a new opcode to the INFO IOCTL that will return the device utilization over the last period of 100-1000ms. The return value is 0-100, representing as percentage the total utilization rate. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
2019-09-05habanalabs: Expose devices after initialization is doneTomer Tayar
The char devices are currently exposed to user before the device and driver initialization are done. This patch moves the cdev and device adding to the system to the end of the initialization sequence, while keeping the creation of the structures at the beginning to allow the usage of dev_*(). Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-09-05habanalabs: create two char devices per ASICOded Gabbay
This patch changes the driver to create two char devices for each ASIC it discovers. This is done to allow system/monitoring applications to query the device for stats, information, idle state and more, while also allowing the deep-learning application to send work to the ASIC. One char device is the original device, hlX. IOCTL calls through this device file can perform any task on the device (compute, memory, queries). The open function for this device will fail if it was called before but the file-descriptor it created was not completely released yet (the release callback function is not called from the kernel until all instances of that FD are closed). The driver needs to keep this behavior to support backward compatibility with existing userspace, which count that the open will fail if the device is "occupied". The second char device is called "hl_controlDx", where x is the same index of the main device with a minor number of the original char device + 1. Applications that open this device can only call the INFO IOCTL. There is no limitation on the number of applications opening this device. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05habanalabs: maintain a list of file private data objectsOded Gabbay
This patch adds a new list to the driver's device structure. The list will keep the file private data structures that the driver creates when a user process opens the device. This change is needed because it is useless to try to count how many FD are open. Instead, track our own private data structure per open file and once it is released, remove it from the list. As long as the list is not empty, it means we have a user that can do something with our device. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05habanalabs: rename user_ctx as compute_ctxOded Gabbay
This patch renames the "user_ctx" field in the device structure to "compute_ctx". This better reflects the meaning of this context. In addition, we also check in the ctx_fini() that the debug mode should be disabled only if the context being destroyed is the compute context. This has no effect right now as we only have a single process and a single context, but this makes the code more ready for multiple process support. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05habanalabs: add handle field to context structureOded Gabbay
This patch adds a field to the context's structure that will hold a unique handle for the context. This will be needed when the user will create the context. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05habanalabs: cap simulator timeoutOded Gabbay
In the driver timeout functions, we give the simulator a factor of 10 in the timeout. This was necessary when the requested timeout is small but if it was a few seconds, this can result in a very large timeout which is unnecessary. This patch caps the maximum timeout of the simulator to 10 seconds, which is our largest timeout in the code. That is more then enough for anything the simulator is doing. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
2019-08-12habanalabs: fix endianness handling for internal QMAN submissionOded Gabbay
The PQs of internal H/W queues (QMANs) can be located in different memory areas for different ASICs. Therefore, when writing PQEs, we need to use the correct function according to the location of the PQ. e.g. if the PQ is located in the device's memory (SRAM or DRAM), we need to use memcpy_toio() so it would work in architectures that have separate address ranges for IO memory. This patch makes the code that writes the PQE to be ASIC-specific so we can handle this properly per ASIC. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Tested-by: Ben Segal <bpsegal20@gmail.com>
2019-07-29habanalabs: fix host memory polling in BE architectureBen Segal
This patch fix a bug in the host memory polling macro. The bug is that the memory being polled can be written by the device, which always writes it in LE. However, if the host is running Linux in BE mode, we need to convert the value that was written by the device before matching it to the required value that the caller has given to the macro. Signed-off-by: Ben Segal <bpsegal20@gmail.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-07-01habanalabs: Add busy engines bitmask to HW idle IOCTLTomer Tayar
The information which is currently provided as a response to the "HL_INFO_HW_IDLE" IOCTL is merely a general boolean value. This patch extends it and provides also a bitmask that indicates which of the device engines are busy. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-07-01habanalabs: Add debugfs node for engines statusTomer Tayar
Command submissions sent to the device are composed of command buffers which are targeted to different device engines, like DMA and compute entities. When a command submission gets stuck, knowing in which engine the stuck is, is crucial for debugging. This patch adds a debugfs node that exports this information, by displaying the engines' various registers that assemble their idle/busy status. The information retrieval is based on the is_device_idle ASIC function. The printout in this function, of the first detected busy engine, is removed because it becomes redundant in the presence of the more elaborated info of the new debugfs node. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-05-29habanalabs: add MMU mappings for Goya CPUOded Gabbay
This patch adds the necessary MMU mappings for the Goya CPU to access the device DRAM and the host memory. The first 256MB of the device DRAM is being mapped. That's where the F/W is running. The 2MB area located on the host memory for the purpose of communication between the driver and the device CPU is also being mapped. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-05-17habanalabs: don't limit packet size for device CPUOded Gabbay
This patch removes a limitation on the maximum packet size that is read by the device CPU as that limitation is not needed. Therefore, the patch also removes an elaborate calculation that is based on this limitation which is also not needed now. Instead, use a fixed value for the memory pool size of the packets. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-05-13habanalabs: increase PCI ELBI timeout for PalladiumOmer Shpigelman
This patch increases the timeout for PCI ELBI configuration to support low frequency Palladium images. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-05-12habanalabs: pass device pointer to asic-specific functionOded Gabbay
This patch adds a new parameter that is passed to the add_end_of_cb_packets() asic-specific function. The parameter is the pointer to the driver's device structure. The function needs this pointer for future ASICs. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-05-09habanalabs: change polling functions to macrosOded Gabbay
This patch changes two polling functions to macros, in order to make their API the same as the standard readl_poll_timeout so we would be able to define the "condition for exit" when calling these macros. This will simplify the code as it will eliminate the need to check both for timeout and for the (cond) in the calling function. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-05-04habanalabs: force user to set device debug modeOded Gabbay
This patch adds the implementation of the HL_DEBUG_OP_SET_MODE opcode in the DEBUG IOCTL. It forces the user who wants to debug the device to set the device into debug mode before he can configure the debug engines. The patch also makes sure to disable debug mode upon user releasing FD, in case the user forgot to disable debug mode. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-05-05habanalabs: minor documentation and prints fixesOmer Shpigelman
This patch fixes comments on various structure members and some spelling errors in log messages. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-05-24habanalabs: halt debug engines on user process closeOmer Shpigelman
This patch fix a potential bug where a user's process has closed unexpectedly without disabling the debug engines. In that case, the debug engines might continue running but because the user's MMU mappings are going away, we will get page fault errors. This behavior is also opposed to the general rule where nothing runs on the device after the user process closes. The patch stops the debug H/W engines upon process termination and thus makes sure nothing runs on the device after the process goes away. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-04-30habanalabs: increase timeout if working with simulatorDalit Ben Zoor
Where there is a spike in the CPU consumption, it may cause random failures in the C/I since the KMD timeout for CPU and/or QMAN0 jobs expires and it stops communicating to the simulator. This commit fixes it by increasing timeout on polling functions if working with simulator. Signed-off-by: Dalit Ben Zoor <dbenzoor@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-05-01habanalabs: remove redundant member from parser structDalit Ben Zoor
use_virt_addr member was used for telling whether to treat the addresses in the CB as virtual during parsing. We disabled it only when calling the parser from the driver memset device function, and since this call had been removed, it should always be enabled. Signed-off-by: Dalit Ben Zoor <dbenzoor@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-05-01habanalabs: Manipulate DMA addresses in ASIC functionsTomer Tayar
Routing device accesses to the host memory requires the usage of a base offset, which is canceled by the iATU just before leaving the device. The value of the base offset might be distinctive between different ASIC types. The manipulation of the addresses is currently used throughout the driver code, and one should be aware to it whenever providing a host memory address to the device. This patch removes this manipulation from the driver common code, and moves it to the ASIC specific functions that are responsible for host memory allocation/mapping. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-05-01habanalabs: rename functions to improve code readabilityOded Gabbay
This patch renames four functions in the ASIC-specific functions section, so it will be easier to differentiate them from the generic kernel functions with the same name. This will help in future code reviews, to make sure we don't use the kernel functions directly. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-04-28habanalabs: Use single pool for CPU accessible host memoryTomer Tayar
The device's CPU accessible memory on host is managed in a dedicated pool, except for 2 regions - Primary Queue (PQ) and Event Queue (EQ) - which are allocated from generic DMA pools. Due to address length limitations of the CPU, the addresses of all these memory regions must have the same MSBs starting at bit 40. This patch modifies the allocation of the PQ and EQ to be also from the dedicated pool, to ensure compliance with the limitation. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-04-28habanalabs: return old dram bar address upon changeOded Gabbay
This patch changes the ASIC interface function that changes the DRAM bar window. The change is to return the old address that the DRAM bar pointed to instead of an error code. This simplifies the code that use this function (mainly in debugfs) to restore the bar to the old setting. This is also needed for easier support in future ASICs. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-04-25habanalabs: rename restore to ctx_switch when appropriateOded Gabbay
This patch only does renaming of certain variables and structure members, and their accompanied comments. This is done to better reflect the actions these variables and members represent. There is no functional change in this patch. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-04-22habanalabs: use ASIC functions interface for rreg/wregOded Gabbay
This patch slightly changes the macros of RREG32 and WREG32, which are used when reading or writing from registers. Instead of directly calling a function in the common code from these macros, the new code calls a function from the ASIC functions interface. This change allows us to share much more code between real ASICs and simulators, which in turn reduces the maintenance burden and the chances for forgetting to port code between the ASIC files. The patch also implements the hl_poll_timeout macro, instead of calling the generic readl_poll_timeout macro. This is required to allow use of this macro in the simulator files. As a result from this change, more functions in goya.c are shared with the simulator and therefore, should not be defined as static. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-04-10habanalabs: Cancel pr_fmt() definition dependency on includes orderTomer Tayar
pr_fmt() should be defined before including linux/printk.h, either directly or indirectly, in order to avoid redefinition of the macro. Currently the macro definition is in habanalabs.h, which is included in many files, and that makes the addition/reorder of includes to be prone to compilation errors. This patch cancels this dependency by defining the macro only in the few source files that use it. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-04-04habanalabs: ASIC_AUTO_DETECT enum value is redundantOded Gabbay
This patch removes the enum value of ASIC_AUTO_DETECT because we can use the validity of the pdev variable to know whether we have a real device or a simulator. For a real device, we detect the asic type from the device ID while for a simulator, the simulator code calls create_hdev() with the specified ASIC type. Set ASIC_INVALID as the first option in the enum to make sure that no other enum value will receive the value 0 (which indicates a non-existing entry in the simulator array). Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-04-02habanalabs: refactoring in goya.cOded Gabbay
This patch does some refactoring in goya.c to make code more reusable between goya code and the goya simulator code (which is not upstreamed). In addition, the patch removes some dead functions from goya.c which are not used by the current upstream code Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-04-01habanalabs: add new IOCTL for debug, tracing and profilingOmer Shpigelman
Habanalabs ASICs use the ARM coresight infrastructure to support debug, tracing and profiling of neural networks topologies. Because the coresight is configured using register writes and reads, and some of the registers hold sensitive information (e.g. the address in the device's DRAM where the trace data is written to), the user must go through the kernel driver to configure this mechanism. This patch implements the common code of the IOCTL and calls the ASIC-specific function for the actual H/W configuration. The IOCTL supports configuration of seven coresight components: ETR, ETF, STM, FUNNEL, BMON, SPMU and TIMESTAMP The user specifies which component he wishes to configure and provides a pointer to a structure (located in its process space) that contains the relevant configuration. The common code copies the relevant data from the user-space to kernel space and then calls the ASIC-specific function to do the H/W configuration. After the configuration is done, which is usually composed of several IOCTL calls depending on what the user wanted to trace, the user can start executing the topology. The trace data will be written to the user's area in the device's DRAM. After the tracing operation is complete, and user will call the IOCTL again to disable the tracing operation. The user also need to read values from registers for some of the components (e.g. the size of the trace data in the device's DRAM). In that case, the user will provide a pointer to an "output" structure in user-space, which the IOCTL code will fill according the to selected component. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-03-24habanalabs: add device status option to INFO IOCTLDalit Ben Zoor
This patch adds a new opcode to INFO IOCTL that returns the device status. This will allow users to query the device status in order to avoid sending command submissions while device is in reset. Signed-off-by: Dalit Ben Zoor <dbenzoor@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-03-07habanalabs: keep track of the device's dma maskOded Gabbay
This patch refactors the code that is responsible to set the DMA mask for the device. Upon each change of the dma mask, the driver will save the new value that was set. This is needed in order to make sure we don't try to increase the mask a second time, in case we failed in the first time. This is especially relevant for Power machines, as that may cause a change in configuration of the TVT which will break the device. Goya will first try to set the device's dma mask to 39 bits, so that the memory that is allocated on the host machine for communication with the device's cpu will be in a bus address which is lower then 39 bits. Later, Goya will try to increase that mask to 48 bits, but only if setting the mask to 39 bits was successful. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-02-24habanalabs: add MMU shadow mappingOmer Shpigelman
This patch adds shadow mapping to the MMU module. The shadow mapping allows traversing the page table in host memory rather reading each PTE from the device memory. It brings better performance and avoids reading from invalid device address upon PCI errors. Only at the end of map/unmap flow, writings to the device are performed in order to sync the H/W page tables with the shadow ones. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-03-07habanalabs: Add a printout with the name of a busy engineTomer Tayar
Print the name of a busy engine when checking if a device is idle. The change is done mainly to help a user to pinpoint problems in his topology's recipe. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-03-05habanalabs: Move PCI code into common fileTomer Tayar
Move duplicated PCI-related code from ASIC-specific files into the common pci.c file. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-03-04habanalabs: Move device CPU code into common fileTomer Tayar
This patch moves the code that is responsible of the communication vs. the F/W to a dedicated file. This will allow us to share the code between different ASICs. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-03-03habanalabs: perform accounting for active CSOded Gabbay
This patch adds accounting for active CS. Active means that the CS was submitted to the H/W queues and was not completed yet. This is necessary to support suspend operation. Because the device will be reset upon suspend, we can only suspend after all active CS have been completed. Hence, we need to perform accounting on their number. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-03-05habanalabs: fix MMU number of pages calculationOmer Shpigelman
The requested allocation size is 64bit, hence the number of requested pages and the total requested size should 64bit as well. This patch fixes all places where these are treated as 32bit. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-02-28habanalabs: disable CPU access on timeoutsOded Gabbay
This patch provides a workaround for a bug in the F/W where the response time for a request from KMD may take more then 100ms. This could cause the queue between KMD and the F/W to get out of sync. The WA is to: 1. Increase the timeout of ALL requests to 1s. 2. In case a request isn't answered in time, mark the state as "cpu_disabled" and prevent sending further requests from KMD to the F/W. This will eventually lead to a heartbeat failure and hard reset of the device. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-28habanalabs: add MMU DRAM default page mappingOmer Shpigelman
This patch provides a workaround for a H/W bug in Goya, where access to RAZWI from TPC can cause PCI completion timeout. The WA is to use the device MMU to map any unmapped DRAM memory to a default page in the DRAM. That way, the TPC will never reach RAZWI upon accessing a bad address in the DRAM. When a DRAM page is mapped by the user, its default mapping is overwritten. Once that page is unmapped, the MMU driver will map that page to the default page. To help debugging, the driver will set the default page area to 0x99 on device initialization. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-27habanalabs: make functions static or declare themOded Gabbay
This patch fixes the below sparse warnings by either making the functions static or by adding a declaration in the relevant header file. In addition, the patch removes goya_mmap completely as it doesn't add any additional benefit. Fixes the following sparse warnings: drivers/misc/habanalabs/habanalabs_drv.c:24:1: warning: symbol 'hl_devs_idr' was not declared. Should it be static? drivers/misc/habanalabs/habanalabs_drv.c:25:1: warning: symbol 'hl_devs_idr_lock' was not declared. Should it be static? drivers/misc/habanalabs/memory.c:1451:5: warning: symbol 'hl_vm_ctx_init_with_ranges' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:396:5: warning: symbol 'goya_send_pci_access_msg' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:417:5: warning: symbol 'goya_pci_bars_map' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:557:6: warning: symbol 'goya_reset_link_through_bridge' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:774:5: warning: symbol 'goya_early_fini' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:857:6: warning: symbol 'goya_late_fini' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:971:5: warning: symbol 'goya_sw_fini' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:1233:5: warning: symbol 'goya_init_cpu_queues' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:2914:5: warning: symbol 'goya_suspend' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:2939:5: warning: symbol 'goya_resume' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:2952:5: warning: symbol 'goya_mmap' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:2957:5: warning: symbol 'goya_cb_mmap' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:2973:6: warning: symbol 'goya_ring_doorbell' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:3063:6: warning: symbol 'goya_flush_pq_write' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:3068:6: warning: symbol 'goya_dma_alloc_coherent' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:3074:6: warning: symbol 'goya_dma_free_coherent' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:3080:6: warning: symbol 'goya_get_int_queue_base' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:3138:5: warning: symbol 'goya_send_job_on_qman0' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:3295:5: warning: symbol 'goya_test_queue' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:3417:6: warning: symbol 'goya_dma_pool_zalloc' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:3426:6: warning: symbol 'goya_dma_pool_free' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:3432:6: warning: symbol 'goya_cpu_accessible_dma_pool_alloc' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:3448:6: warning: symbol 'goya_cpu_accessible_dma_pool_free' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:3458:5: warning: symbol 'goya_dma_map_sg' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:3467:6: warning: symbol 'goya_dma_unmap_sg' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:3473:5: warning: symbol 'goya_get_dma_desc_list_size' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:4210:5: warning: symbol 'goya_parse_cb_no_mmu' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:4261:5: warning: symbol 'goya_parse_cb_no_ext_quque' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:4294:5: warning: symbol 'goya_cs_parser' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:4307:6: warning: symbol 'goya_add_end_of_cb_packets' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:4334:5: warning: symbol 'goya_context_switch' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:4426:6: warning: symbol 'goya_restore_phase_topology' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:4460:5: warning: symbol 'goya_debugfs_read32' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:4510:5: warning: symbol 'goya_debugfs_write32' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:4738:6: warning: symbol 'goya_handle_eqe' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:4836:6: warning: symbol 'goya_get_events_stat' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:5075:5: warning: symbol 'goya_send_heartbeat' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:5253:5: warning: symbol 'goya_get_eeprom_data' was not declared. Should it be static? Reported-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-27habanalabs: allow memory allocations larger than 4GBOded Gabbay
This patch increase the size field in the uapi structure of the Memory IOCTL from 32-bit to 64-bit. This is to allow the user to allocate and/or map memory in chunks that are larger then 4GB. Goya's device memory (DRAM) can be up to 16GB, and for certain topologies, the user may want an allocation that is larger than 4GB. This change doesn't break current user-space because there was a "pad" field in the uapi structure right after the size field. Changing the size field to be 64-bit and removing the pad field maintains compatibility with current user-space. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-18habanalabs: add debugfs supportOded Gabbay
This patch adds debugfs support to the driver. It allows the user-space to display information that is contained in the internal structures of the driver, such as: - active command submissions - active user virtual memory mappings - number of allocated command buffers It also enables the user to perform reads and writes through Goya's PCI bars. Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-18habanalabs: implement INFO IOCTLOded Gabbay
This patch implements the INFO IOCTL. That IOCTL is used by the user to query information that is relevant/needed by the user in order to submit deep learning jobs to Goya. The information is divided into several categories, such as H/W IP, Events that happened, DDR usage and more. Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-18habanalabs: add virtual memory and MMU modulesOmer Shpigelman
This patch adds the Virtual Memory and MMU modules. Goya has an internal MMU which provides process isolation on the internal DDR. The internal MMU also performs translations for transactions that go from Goya to the Host. The driver is responsible for allocating and freeing memory on the DDR upon user request. It also provides an interface to map and unmap DDR and Host memory to the device address space. The MMU in Goya supports 3-level and 4-level page tables. With 3-level, the size of each page is 2MB, while with 4-level the size of each page is 4KB. In the DDR, the physical pages are always 2MB. Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-18habanalabs: add command submission moduleOded Gabbay
This patch adds the main flow for the user to submit work to the device. Each work is described by a command submission object (CS). The CS contains 3 arrays of command buffers: One for execution, and two for context-switch (store and restore). For each CB, the user specifies on which queue to put that CB. In case of an internal queue, the entry doesn't contain a pointer to the CB but the address in the on-chip memory that the CB resides at. The driver parses some of the CBs to enforce security restrictions. The user receives a sequence number that represents the CS object. The user can then query the driver regarding the status of the CS, using that sequence number. In case the CS doesn't finish before the timeout expires, the driver will perform a soft-reset of the device. Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-18habanalabs: add device reset supportOded Gabbay
This patch adds support for doing various on-the-fly reset of Goya. The driver supports two types of resets: 1. soft-reset 2. hard-reset Soft-reset is done when the device detects a timeout of a command submission that was given to the device. The soft-reset process only resets the engines that are relevant for the submission of compute jobs, i.e. the DMA channels, the TPCs and the MME. The purpose is to bring the device as fast as possible to a working state. Hard-reset is done in several cases: 1. After soft-reset is done but the device is not responding 2. When fatal errors occur inside the device, e.g. ECC error 3. When the driver is removed Hard-reset performs a reset of the entire chip except for the PCI controller and the PLLs. It is a much longer process then soft-reset but it helps to recover the device without the need to reboot the Host. After hard-reset, the driver will restore the max power attribute and in case of manual power management, the frequencies that were set. This patch also adds two entries to the sysfs, which allows the root user to initiate a soft or hard reset. Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>