Diffstat (limited to 'Documentation')
143 files changed, 8711 insertions, 5385 deletions
diff --git a/Documentation/00-INDEX b/Documentation/00-INDEX index f08ca9535733..8b0563633442 100644 --- a/Documentation/00-INDEX +++ b/Documentation/00-INDEX @@ -12,6 +12,8 @@ Following translations are available on the WWW: 00-INDEX - this file. +ABI/ + - info on kernel <-> userspace ABI and relative interface stability. BUG-HUNTING - brute force method of doing binary search of patches to find bug. Changes @@ -25,37 +27,57 @@ DMA-mapping.txt DocBook/ - directory with DocBook templates etc. for kernel documentation. HOWTO - - The process and procedures of how to do Linux kernel development. + - the process and procedures of how to do Linux kernel development. IO-mapping.txt - how to access I/O mapped memory from within device drivers. IPMI.txt - info on Linux Intelligent Platform Management Interface (IPMI) Driver. IRQ-affinity.txt - how to select which CPU(s) handle which interrupt events on SMP. +IRQ.txt + - description of what an IRQ is. ManagementStyle - how to (attempt to) manage kernel hackers. MSI-HOWTO.txt - the Message Signaled Interrupts (MSI) Driver Guide HOWTO and FAQ. +PCIEBUS-HOWTO.txt + - a guide describing the PCI Express Port Bus driver. RCU/ - directory with info on RCU (read-copy update). README.DAC960 - info on Mylex DAC960/DAC1100 PCI RAID Controller Driver for Linux. +README.cycladesZ + - info on Cyclades-Z firmware loading. SAK.txt - info on Secure Attention Keys. +SecurityBugs + - procedure for reporting security bugs found in the kernel. +SubmitChecklist + - Linux kernel patch submission checklist. SubmittingDrivers - procedure to get a new driver source included into the kernel tree. SubmittingPatches - procedure to get a source patch included into the kernel tree. VGA-softcursor.txt - how to change your VGA cursor from a blinking underscore. +accounting/ + - documentation on accounting and taskstats. +aoe/ + - description of AoE (ATA over Ethernet) along with config examples. 
applying-patches.txt - description of various trees and how to apply their patches. arm/ - directory with info about Linux on the ARM architecture. +atomic_ops.txt + - semantics and behavior of atomic and bitmask operations. +auxdisplay/ + - misc. LCD driver documentation (cfag12864b, ks0108). basic_profiling.txt - basic instructions for those who wants to profile Linux kernel. binfmt_misc.txt - info on the kernel support for extra binary formats. +blackfin/ + - directory with documentation for the Blackfin arch. block/ - info on the Block I/O (BIO) layer. cachetlb.txt @@ -68,16 +90,32 @@ cli-sti-removal.txt - cli()/sti() removal guide. computone.txt - info on Computone Intelliport II/Plus Multiport Serial Driver. +connector/ + - docs on the netlink based userspace<->kernel space communication mod. +console/ + - documentation on Linux console drivers. cpqarray.txt - info on using Compaq's SMART2 Intelligent Disk Array Controllers. cpu-freq/ - info on CPU frequency and voltage scaling. +cpu-hotplug.txt + - document describing CPU hotplug support in the Linux kernel. +cpu-load.txt + - document describing how CPU load statistics are collected. +cpusets.txt + - documents the cpusets feature; assign CPUs and Mem to a set of tasks. +cputopology.txt + - documentation on how CPU topology info is exported via sysfs. cris/ - directory with info about Linux on CRIS architecture. crypto/ - directory with info on the Crypto API. +dcdbas.txt + - information on the Dell Systems Management Base Driver. debugging-modules.txt - some notes on debugging modules after Linux 2.6.3. +dell_rbu.txt + - document demonstrating the use of the Dell Remote BIOS Update driver. device-mapper/ - directory with info on Device Mapper. devices.txt @@ -86,32 +124,52 @@ digiepca.txt - info on Digi Intl. {PC,PCI,EISA}Xx and Xem series cards. dnotify.txt - info about directory notification in Linux. +dontdiff + - file containing a list of files that should never be diff'ed. 
driver-model/ - directory with info about Linux driver model. +drivers/ + - directory with driver documentation (currently only EDAC). dvb/ - info on Linux Digital Video Broadcast (DVB) subsystem. early-userspace/ - info about initramfs, klibc, and userspace early during boot. +ecryptfs.txt + - docs on eCryptfs: stacked cryptographic filesystem for Linux. eisa.txt - info on EISA bus support. exception.txt - how Linux v2.2 handles exceptions without verify_area etc. +fault-injection/ + - dir with docs about the fault injection capabilities infrastructure. fb/ - directory with info on the frame buffer graphics abstraction layer. +feature-removal-schedule.txt + - list of files and features that are going to be removed. filesystems/ - directory with info on the various filesystems that Linux supports. firmware_class/ - request_firmware() hotplug interface info. floppy.txt - notes and driver options for the floppy disk driver. +fujitsu/ + - Fujitsu FR-V Linux documentation. +gpio.txt + - overview of GPIO (General Purpose Input/Output) access conventions. hayes-esp.txt - info on using the Hayes ESP serial driver. highuid.txt - notes on the change from 16 bit to 32 bit user/group IDs. hpet.txt - High Precision Event Timer Driver for Linux. +hrtimer/ + - info on the timer_stats debugging facility for timer (ab)use. +hrtimers/ + - info on the hrtimers subsystem for high-resolution kernel timers. hw_random.txt - info on Linux support for random number generator in i8xx chipsets. +hwmon/ + - directory with docs on various hardware monitoring drivers. i2c/ - directory with info about the I2C bus/protocol (2 wire, kHz speed). i2o/ @@ -122,16 +180,22 @@ ia64/ - directory with info about Linux on Intel 64 bit architecture. ide.txt - important info for users of ATA devices (IDE/EIDE disks and CD-ROMS). +infiniband/ + - directory with documents concerning Linux InfiniBand support. initrd.txt - how to use the RAM disk as an initial/temporary root filesystem. 
input/ - info on Linux input device support. io_ordering.txt - info on ordering I/O writes to memory-mapped addresses. +ioctl/ + - directory with documents describing various IOCTL calls. ioctl-number.txt - how to implement and register device/driver ioctl calls. iostats.txt - info on I/O statistics Linux kernel provides. +irqflags-tracing.txt + - how to use the irq-flags tracing feature. isapnp.txt - info on Linux ISA Plug & Play support. isdn/ @@ -140,26 +204,40 @@ java.txt - info on the in-kernel binary support for Java(tm). kbuild/ - directory with info about the kernel build process. -kdumpt.txt - - mini HowTo on getting the crash dump code to work. +kdump/ + - directory with mini HowTo on getting the crash dump code to work. kernel-doc-nano-HOWTO.txt - mini HowTo on generation and location of kernel documentation files. kernel-docs.txt - listing of various WWW + books that document kernel internals. kernel-parameters.txt - summary listing of command line / boot prompt args for the kernel. +keys-request-key.txt + - description of the kernel key request service. +keys.txt + - description of the kernel key retention service. kobject.txt - info of the kobject infrastructure of the Linux kernel. +kprobes.txt + - documents the kernel probes debugging feature. +kref.txt + - docs on adding reference counters (krefs) to kernel objects. laptop-mode.txt - - How to conserve battery power using laptop-mode. + - how to conserve battery power using laptop-mode. ldm.txt - a brief description of LDM (Windows Dynamic Disks). +leds-class.txt + - documents LED handling under Linux. +local_ops.txt + - semantics and behavior of local atomic operations. +lockdep-design.txt + - documentation on the runtime locking correctness validator. locks.txt - info on file locking implementations, flock() vs. fcntl(), etc. logo.gif - - Full colour GIF image of Linux logo (penguin). + - full colour GIF image of Linux logo (penguin - Tux). 
logo.txt - - Info on creator of above logo & site to get additional images from. + - info on creator of above logo & site to get additional images from. m68k/ - directory with info about Linux on Motorola 68k architecture. magic-number.txt @@ -170,6 +248,8 @@ mca.txt - info on supporting Micro Channel Architecture (e.g. PS/2) systems. md.txt - info on boot arguments for the multiple devices driver. +memory-barriers.txt + - info on Linux kernel memory barriers. memory.txt - info on typical Linux memory problems. mips/ @@ -177,9 +257,11 @@ mips/ mono.txt - how to execute Mono-based .NET binaries with the help of BINFMT_MISC. moxa-smartio - - info on installing/using Moxa multiport serial driver. + - file with info on installing/using Moxa multiport serial driver. mtrr.txt - how to use PPro Memory Type Range Registers to increase performance. +mutex-design.txt + - info on the generic mutex subsystem. nbd.txt - info on a TCP implementation of a network block device. netlabel/ @@ -190,6 +272,8 @@ nfsroot.txt - short guide on setting up a diskless box with NFS root filesystem. nmi_watchdog.txt - info on NMI watchdog for SMP systems. +nommu-mmap.txt + - documentation about no-mmu memory mapping support. numastat.txt - info on how to read Numa policy hit/miss statistics in sysfs. oops-tracing.txt @@ -202,8 +286,16 @@ parport.txt - how to use the parallel-port driver. parport-lowlevel.txt - description and usage of the low level parallel port functions. +pci-error-recovery.txt + - info on PCI error recovery. pci.txt - info on the PCI subsystem for device driver authors. +pcieaer-howto.txt + - the PCI Express Advanced Error Reporting Driver Guide HOWTO. +pcmcia/ + - info on the Linux PCMCIA driver. +pi-futex.txt + - documentation on lightweight PI-futexes. pm.txt - info on Linux power management support. pnp.txt @@ -214,18 +306,32 @@ powerpc/ - directory with info on using Linux with the PowerPC. preempt-locking.txt - info on locking under a preemptive kernel. 
+prio_tree.txt + - info on radix-priority-search-tree use for indexing vmas. ramdisk.txt - short guide on how to set up and use the RAM disk. +rbtree.txt + - info on what red-black trees are and what they are for. riscom8.txt - notes on using the RISCom/8 multi-port serial driver. +robust-futex-ABI.txt + - documentation of the robust futex ABI. +robust-futexes.txt + - a description of what robust futexes are. rocket.txt - info on the Comtrol RocketPort multiport serial driver. rpc-cache.txt - introduction to the caching mechanisms in the sunrpc layer. +rt-mutex-design.txt + - description of the RealTime mutex implementation design. +rt-mutex.txt + - desc. of RT-mutex subsystem with PI (Priority Inheritance) support. rtc.txt - notes on how to use the Real Time Clock (aka CMOS clock) driver. s390/ - directory with info on using Linux on the IBM S390. +sched-arch.txt + - CPU Scheduler implementation hints for architecture specific code. sched-coding.txt - reference for various scheduler-related methods in the O(1) scheduler. sched-design.txt @@ -240,22 +346,32 @@ serial/ - directory with info on the low level serial API. serial-console.txt - how to set up Linux with a serial line console as the default. +sgi-ioc4.txt + - description of the SGI IOC4 PCI (multi function) device. sgi-visws.txt - short blurb on the SGI Visual Workstations. sh/ - directory with info on porting Linux to a new architecture. +sharedsubtree.txt + - a description of shared subtrees for namespaces. smart-config.txt - description of the Smart Config makefile feature. smp.txt - a few notes on symmetric multi-processing. +sony-laptop.txt + - Sony Notebook Control Driver (SNC) Readme. sonypi.txt - info on Linux Sony Programmable I/O Device support. sound/ - directory with info on sound card support. sparc/ - directory with info on using Linux on Sparc architecture. +sparse.txt + - info on how to obtain and use the sparse tool for typechecking. 
specialix.txt - info on hardware/driver for specialix IO8+ multiport serial card. +spi/ + - overview of Linux kernel Serial Peripheral Interface (SPI) support. spinlocks.txt - info on using spinlocks to provide exclusive access in kernel. stable_api_nonsense.txt @@ -274,24 +390,32 @@ sysrq.txt - info on the magic SysRq key. telephony/ - directory with info on telephony (e.g. voice over IP) support. +thinkpad-acpi.txt + - information on the (IBM and Lenovo) ThinkPad ACPI Extras driver. time_interpolators.txt - info on time interpolators. tipar.txt - information about Parallel link cable for Texas Instruments handhelds. tty.txt - guide to the locking policies of the tty layer. -unicode.txt - - info on the Unicode character/font mapping used in Linux. uml/ - directory with information about User Mode Linux. +unicode.txt + - info on the Unicode character/font mapping used in Linux. +unshare.txt + - description of the Linux unshare system call. usb/ - directory with info regarding the Universal Serial Bus. +video-output.txt + - sysfs class driver interface to enable/disable a video output device. video4linux/ - directory with info regarding video/TV/radio cards and linux. vm/ - directory with info on the Linux vm code. voyager.txt - guide to running Linux on the Voyager architecture. +w1/ + - directory with documents regarding the 1-wire (w1) subsystem. watchdog/ - how to auto-reboot Linux if it has "fallen and can't get up". 
;-) x86_64/ diff --git a/Documentation/ABI/removed/raw1394_legacy_isochronous b/Documentation/ABI/removed/raw1394_legacy_isochronous new file mode 100644 index 000000000000..1b629622d883 --- /dev/null +++ b/Documentation/ABI/removed/raw1394_legacy_isochronous @@ -0,0 +1,16 @@ +What: legacy isochronous ABI of raw1394 (1st generation iso ABI) +Date: June 2007 (scheduled), removed in kernel v2.6.23 +Contact: linux1394-devel@lists.sourceforge.net +Description: + The two request types RAW1394_REQ_ISO_SEND, RAW1394_REQ_ISO_LISTEN have + been deprecated for quite some time. They are very inefficient as they + come with high interrupt load and several layers of callbacks for each + packet. Because of these deficiencies, the video1394 and dv1394 drivers + and the 3rd-generation isochronous ABI in raw1394 (rawiso) were created. + +Users: + libraw1394 users via the long deprecated API raw1394_iso_write, + raw1394_start_iso_write, raw1394_start_iso_rcv, raw1394_stop_iso_rcv + + libdc1394, which optionally uses these old libraw1394 calls + alternatively to the more efficient video1394 ABI diff --git a/Documentation/ABI/testing/sysfs-bus-usb b/Documentation/ABI/testing/sysfs-bus-usb index f9937add033d..9734577d1711 100644 --- a/Documentation/ABI/testing/sysfs-bus-usb +++ b/Documentation/ABI/testing/sysfs-bus-usb @@ -39,3 +39,16 @@ Description: If you want to suspend a device immediately but leave it free to wake up in response to I/O requests, you should write "0" to power/autosuspend. + +What: /sys/bus/usb/devices/.../power/persist +Date: May 2007 +KernelVersion: 2.6.23 +Contact: Alan Stern <stern@rowland.harvard.edu> +Description: + If CONFIG_USB_PERSIST is set, then each USB device directory + will contain a file named power/persist. The file holds a + boolean value (0 or 1) indicating whether or not the + "USB-Persist" facility is enabled for the device. Since the + facility is inherently dangerous, it is disabled by default + for all devices except hubs. 
For more information, see + Documentation/usb/persist.txt. diff --git a/Documentation/CodingStyle b/Documentation/CodingStyle index b49b92edb396..7f1730f1a1ae 100644 --- a/Documentation/CodingStyle +++ b/Documentation/CodingStyle @@ -218,6 +218,18 @@ no space after the prefix increment & decrement unary operators: and no space around the '.' and "->" structure member operators. +Do not leave trailing whitespace at the ends of lines. Some editors with +"smart" indentation will insert whitespace at the beginning of new lines as +appropriate, so you can start typing the next line of code right away. +However, some such editors do not remove the whitespace if you end up not +putting a line of code there, such as if you leave a blank line. As a result, +you end up with lines containing trailing whitespace. + +Git will warn you about patches that introduce trailing whitespace, and can +optionally strip the trailing whitespace for you; however, if applying a series +of patches, this may make later patches in the series fail by changing their +context lines. + Chapter 4: Naming @@ -621,12 +633,27 @@ covers RTL which is used frequently with assembly language in the kernel. Kernel developers like to be seen as literate. Do mind the spelling of kernel messages to make a good impression. Do not use crippled -words like "dont" and use "do not" or "don't" instead. +words like "dont"; use "do not" or "don't" instead. Make the messages +concise, clear, and unambiguous. Kernel messages do not have to be terminated with a period. Printing numbers in parentheses (%d) adds no value and should be avoided. +There are a number of driver model diagnostic macros in <linux/device.h> +which you should use to make sure messages are matched to the right device +and driver, and are tagged with the right level: dev_err(), dev_warn(), +dev_info(), and so forth. For messages that aren't associated with a +particular device, <linux/kernel.h> defines pr_debug() and pr_info(). 
+ +Coming up with good debugging messages can be quite a challenge; and once +you have them, they can be a huge help for remote troubleshooting. Such +messages should be compiled out when the DEBUG symbol is not defined (that +is, by default they are not included). When you use dev_dbg() or pr_debug(), +that's automatic. Many subsystems have Kconfig options to turn on -DDEBUG. +A related convention uses VERBOSE_DEBUG to add dev_vdbg() messages to the +ones already enabled by DEBUG. + Chapter 14: Allocating memory @@ -726,6 +753,33 @@ need them. Feel free to peruse that header file to see what else is already defined that you shouldn't reproduce in your code. + Chapter 18: Editor modelines and other cruft + +Some editors can interpret configuration information embedded in source files, +indicated with special markers. For example, emacs interprets lines marked +like this: + +-*- mode: c -*- + +Or like this: + +/* +Local Variables: +compile-command: "gcc -DMAGIC_DEBUG_FLAG foo.c" +End: +*/ + +Vim interprets markers that look like this: + +/* vim:set sw=8 noet */ + +Do not include any of these in source files. People have their own personal +editor configurations, and your source files should not override them. This +includes markers for indentation and mode configuration. People may use their +own custom mode, or may have some other magic method for making indentation +work correctly. + + Appendix I: References @@ -751,4 +805,5 @@ Kernel CodingStyle, by greg@kroah.com at OLS 2002: http://www.kroah.com/linux/talks/ols_2002_kernel_codingstyle_talk/html/ -- -Last updated on 2006-December-06. +Last updated on 2007-July-13. + diff --git a/Documentation/DMA-mapping.txt b/Documentation/DMA-mapping.txt index 028614cdd062..e07f2530326b 100644 --- a/Documentation/DMA-mapping.txt +++ b/Documentation/DMA-mapping.txt @@ -664,109 +664,6 @@ It is that simple. Well, not for some odd devices. See the next section for information about that. 
- DAC Addressing for Address Space Hungry Devices - -There exists a class of devices which do not mesh well with the PCI -DMA mapping API. By definition these "mappings" are a finite -resource. The number of total available mappings per bus is platform -specific, but there will always be a reasonable amount. - -What is "reasonable"? Reasonable means that networking and block I/O -devices need not worry about using too many mappings. - -As an example of a problematic device, consider compute cluster cards. -They can potentially need to access gigabytes of memory at once via -DMA. Dynamic mappings are unsuitable for this kind of access pattern. - -To this end we've provided a small API by which a device driver -may use DAC cycles to directly address all of physical memory. -Not all platforms support this, but most do. It is easy to determine -whether the platform will work properly at probe time. - -First, understand that there may be a SEVERE performance penalty for -using these interfaces on some platforms. Therefore, you MUST only -use these interfaces if it is absolutely required. %99 of devices can -use the normal APIs without any problems. - -Note that for streaming type mappings you must either use these -interfaces, or the dynamic mapping interfaces above. You may not mix -usage of both for the same device. Such an act is illegal and is -guaranteed to put a banana in your tailpipe. - -However, consistent mappings may in fact be used in conjunction with -these interfaces. Remember that, as defined, consistent mappings are -always going to be SAC addressable. - -The first thing your driver needs to do is query the PCI platform -layer if it is capable of handling your devices DAC addressing -capabilities: - - int pci_dac_dma_supported(struct pci_dev *hwdev, u64 mask); - -You may not use the following interfaces if this routine fails. - -Next, DMA addresses using this API are kept track of using the -dma64_addr_t type. 
It is guaranteed to be big enough to hold any -DAC address the platform layer will give to you from the following -routines. If you have consistent mappings as well, you still -use plain dma_addr_t to keep track of those. - -All mappings obtained here will be direct. The mappings are not -translated, and this is the purpose of this dialect of the DMA API. - -All routines work with page/offset pairs. This is the _ONLY_ way to -portably refer to any piece of memory. If you have a cpu pointer -(which may be validly DMA'd too) you may easily obtain the page -and offset using something like this: - - struct page *page = virt_to_page(ptr); - unsigned long offset = offset_in_page(ptr); - -Here are the interfaces: - - dma64_addr_t pci_dac_page_to_dma(struct pci_dev *pdev, - struct page *page, - unsigned long offset, - int direction); - -The DAC address for the tuple PAGE/OFFSET are returned. The direction -argument is the same as for pci_{map,unmap}_single(). The same rules -for cpu/device access apply here as for the streaming mapping -interfaces. To reiterate: - - The cpu may touch the buffer before pci_dac_page_to_dma. - The device may touch the buffer after pci_dac_page_to_dma - is made, but the cpu may NOT. - -When the DMA transfer is complete, invoke: - - void pci_dac_dma_sync_single_for_cpu(struct pci_dev *pdev, - dma64_addr_t dma_addr, - size_t len, int direction); - -This must be done before the CPU looks at the buffer again. -This interface behaves identically to pci_dma_sync_{single,sg}_for_cpu(). - -And likewise, if you wish to let the device get back at the buffer after -the cpu has read/written it, invoke: - - void pci_dac_dma_sync_single_for_device(struct pci_dev *pdev, - dma64_addr_t dma_addr, - size_t len, int direction); - -before letting the device access the DMA area again. 
- -If you need to get back to the PAGE/OFFSET tuple from a dma64_addr_t -the following interfaces are provided: - - struct page *pci_dac_dma_to_page(struct pci_dev *pdev, - dma64_addr_t dma_addr); - unsigned long pci_dac_dma_to_offset(struct pci_dev *pdev, - dma64_addr_t dma_addr); - -This is possible with the DAC interfaces purely because they are -not translated in any way. - Optimizing Unmap State Space Consumption On many platforms, pci_unmap_{single,page}() is simply a nop. diff --git a/Documentation/DocBook/Makefile b/Documentation/DocBook/Makefile index 6fd1646d3204..08687e45e19d 100644 --- a/Documentation/DocBook/Makefile +++ b/Documentation/DocBook/Makefile @@ -15,11 +15,11 @@ DOCBOOKS := wanbook.xml z8530book.xml mcabook.xml videobook.xml \ ### # The build process is as follows (targets): -# (xmldocs) -# file.tmpl --> file.xml +--> file.ps (psdocs) -# +--> file.pdf (pdfdocs) -# +--> DIR=file (htmldocs) -# +--> man/ (mandocs) +# (xmldocs) [by docproc] +# file.tmpl --> file.xml +--> file.ps (psdocs) [by db2ps or xmlto] +# +--> file.pdf (pdfdocs) [by db2pdf or xmlto] +# +--> DIR=file (htmldocs) [by xmlto] +# +--> man/ (mandocs) [by xmlto] # for PDF and PS output you can choose between xmlto and docbook-utils tools diff --git a/Documentation/DocBook/kernel-api.tmpl b/Documentation/DocBook/kernel-api.tmpl index 38f88b6ae405..eb42bf9847cb 100644 --- a/Documentation/DocBook/kernel-api.tmpl +++ b/Documentation/DocBook/kernel-api.tmpl @@ -139,8 +139,10 @@ X!Ilib/string.c !Elib/cmdline.c </sect1> - <sect1><title>CRC Functions</title> + <sect1 id="crc"><title>CRC Functions</title> +!Elib/crc7.c !Elib/crc16.c +!Elib/crc-itu-t.c !Elib/crc32.c !Elib/crc-ccitt.c </sect1> @@ -157,7 +159,6 @@ X!Ilib/string.c !Earch/i386/lib/usercopy.c </sect1> <sect1><title>More Memory Management Functions</title> -!Iinclude/linux/rmap.h !Emm/readahead.c !Emm/filemap.c !Emm/memory.c @@ -406,6 +407,10 @@ X!Edrivers/pnp/system.c !Edrivers/pnp/manager.c !Edrivers/pnp/support.c </sect1> + 
<sect1><title>Userspace IO devices</title> +!Edrivers/uio/uio.c +!Iinclude/linux/uio_driver.h + </sect1> </chapter> <chapter id="blkdev"> @@ -643,4 +648,70 @@ X!Idrivers/video/console/fonts.c !Edrivers/spi/spi.c </chapter> + <chapter id="i2c"> + <title>I<superscript>2</superscript>C and SMBus Subsystem</title> + + <para> + I<superscript>2</superscript>C (or without fancy typography, "I2C") + is an acronym for the "Inter-IC" bus, a simple bus protocol which is + widely used where low data rate communications suffice. + Since it's also a licensed trademark, some vendors use another + name (such as "Two-Wire Interface", TWI) for the same bus. + I2C only needs two signals (SCL for clock, SDA for data), conserving + board real estate and minimizing signal quality issues. + Most I2C devices use seven bit addresses, and bus speeds of up + to 400 kHz; there's a high speed extension (3.4 MHz) that's not yet + found wide use. + I2C is a multi-master bus; open drain signaling is used to + arbitrate between masters, as well as to handshake and to + synchronize clocks from slower clients. + </para> + + <para> + The Linux I2C programming interfaces support only the master + side of bus interactions, not the slave side. + The programming interface is structured around two kinds of driver, + and two kinds of device. + An I2C "Adapter Driver" abstracts the controller hardware; it binds + to a physical device (perhaps a PCI device or platform_device) and + exposes a <structname>struct i2c_adapter</structname> representing + each I2C bus segment it manages. + On each I2C bus segment will be I2C devices represented by a + <structname>struct i2c_client</structname>. Those devices will + be bound to a <structname>struct i2c_driver</structname>, + which should follow the standard Linux driver model. + (At this writing, a legacy model is more widely used.) 
+ There are functions to perform various I2C protocol operations; at + this writing all such functions are usable only from task context. + </para> + + <para> + The System Management Bus (SMBus) is a sibling protocol. Most SMBus + systems are also I2C conformant. The electrical constraints are + tighter for SMBus, and it standardizes particular protocol messages + and idioms. Controllers that support I2C can also support most + SMBus operations, but SMBus controllers don't support all the protocol + options that an I2C controller will. + There are functions to perform various SMBus protocol operations, + either using I2C primitives or by issuing SMBus commands to + i2c_adapter devices which don't support those I2C operations. + </para> + +!Iinclude/linux/i2c.h +!Fdrivers/i2c/i2c-boardinfo.c i2c_register_board_info +!Edrivers/i2c/i2c-core.c + </chapter> + + <chapter id="splice"> + <title>splice API</title> + <para>) + splice is a method for moving blocks of data around inside the + kernel, without continually transferring it between the kernel + and user space. 
+ </para> +!Iinclude/linux/splice.h +!Ffs/splice.c + </chapter> + + </book> diff --git a/Documentation/DocBook/procfs-guide.tmpl b/Documentation/DocBook/procfs-guide.tmpl index 45cad23efefa..2de84dc195a8 100644 --- a/Documentation/DocBook/procfs-guide.tmpl +++ b/Documentation/DocBook/procfs-guide.tmpl @@ -352,49 +352,93 @@ entry->write_proc = write_proc_foo; <funcsynopsis> <funcprototype> <funcdef>int <function>read_func</function></funcdef> - <paramdef>char* <parameter>page</parameter></paramdef> + <paramdef>char* <parameter>buffer</parameter></paramdef> <paramdef>char** <parameter>start</parameter></paramdef> <paramdef>off_t <parameter>off</parameter></paramdef> <paramdef>int <parameter>count</parameter></paramdef> - <paramdef>int* <parameter>eof</parameter></paramdef> + <paramdef>int* <parameter>peof</parameter></paramdef> <paramdef>void* <parameter>data</parameter></paramdef> </funcprototype> </funcsynopsis> <para> The read function should write its information into the - <parameter>page</parameter>. For proper use, the function - should start writing at an offset of - <parameter>off</parameter> in <parameter>page</parameter> and - write at most <parameter>count</parameter> bytes, but because - most read functions are quite simple and only return a small - amount of information, these two parameters are usually - ignored (it breaks pagers like <literal>more</literal> and - <literal>less</literal>, but <literal>cat</literal> still - works). + <parameter>buffer</parameter>, which will be exactly + <literal>PAGE_SIZE</literal> bytes long. </para> <para> - If the <parameter>off</parameter> and - <parameter>count</parameter> parameters are properly used, - <parameter>eof</parameter> should be used to signal that the + The parameter + <parameter>peof</parameter> should be used to signal that the end of the file has been reached by writing <literal>1</literal> to the memory location - <parameter>eof</parameter> points to. + <parameter>peof</parameter> points to. 
</para> <para> - The parameter <parameter>start</parameter> doesn't seem to be - used anywhere in the kernel. The <parameter>data</parameter> + The <parameter>data</parameter> parameter can be used to create a single call back function for several files, see <xref linkend="usingdata"/>. </para> <para> - The <function>read_func</function> function must return the - number of bytes written into the <parameter>page</parameter>. + The rest of the parameters and the return value are described + by a comment in <filename>fs/proc/generic.c</filename> as follows: </para> + <blockquote> + <para> + You have three ways to return data: + </para> + <orderedlist> + <listitem> + <para> + Leave <literal>*start = NULL</literal>. (This is the default.) + Put the data of the requested offset at that + offset within the buffer. Return the number (<literal>n</literal>) + of bytes there are from the beginning of the + buffer up to the last byte of data. If the + number of supplied bytes (<literal>= n - offset</literal>) is + greater than zero and you didn't signal eof + and the reader is prepared to take more data + you will be called again with the requested + offset advanced by the number of bytes + absorbed. This interface is useful for files + no larger than the buffer. + </para> + </listitem> + <listitem> + <para> + Set <literal>*start</literal> to an unsigned long value less than + the buffer address but greater than zero. + Put the data of the requested offset at the + beginning of the buffer. Return the number of + bytes of data placed there. If this number is + greater than zero and you didn't signal eof + and the reader is prepared to take more data + you will be called again with the requested + offset advanced by <literal>*start</literal>. This interface is + useful when you have a large file consisting + of a series of blocks which you want to count + and return as wholes. 
+ (Hack by Paul.Russell@rustcorp.com.au) + </para> + </listitem> + <listitem> + <para> + Set <literal>*start</literal> to an address within the buffer. + Put the data of the requested offset at <literal>*start</literal>. + Return the number of bytes of data placed there. + If this number is greater than zero and you + didn't signal eof and the reader is prepared to + take more data you will be called again with the + requested offset advanced by the number of bytes + absorbed. + </para> + </listitem> + </orderedlist> + </blockquote> + <para> <xref linkend="example"/> shows how to use a read call back function. diff --git a/Documentation/DocBook/uio-howto.tmpl b/Documentation/DocBook/uio-howto.tmpl new file mode 100644 index 000000000000..e3bb29a8d8dd --- /dev/null +++ b/Documentation/DocBook/uio-howto.tmpl @@ -0,0 +1,611 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN" +"http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd" []> + +<book id="index"> +<bookinfo> +<title>The Userspace I/O HOWTO</title> + +<author> + <firstname>Hans-Jürgen</firstname> + <surname>Koch</surname> + <authorblurb><para>Linux developer, Linutronix</para></authorblurb> + <affiliation> + <orgname> + <ulink url="http://www.linutronix.de">Linutronix</ulink> + </orgname> + + <address> + <email>hjk@linutronix.de</email> + </address> + </affiliation> +</author> + +<pubdate>2006-12-11</pubdate> + +<abstract> + <para>This HOWTO describes concept and usage of Linux kernel's + Userspace I/O system.</para> +</abstract> + +<revhistory> + <revision> + <revnumber>0.3</revnumber> + <date>2007-04-29</date> + <authorinitials>hjk</authorinitials> + <revremark>Added section about userspace drivers.</revremark> + </revision> + <revision> + <revnumber>0.2</revnumber> + <date>2007-02-13</date> + <authorinitials>hjk</authorinitials> + <revremark>Update after multiple mappings were added.</revremark> + </revision> + <revision> + <revnumber>0.1</revnumber> + 
<date>2006-12-11</date> + <authorinitials>hjk</authorinitials> + <revremark>First draft.</revremark> + </revision> +</revhistory> +</bookinfo> + +<chapter id="aboutthisdoc"> +<?dbhtml filename="about.html"?> +<title>About this document</title> + +<sect1 id="copyright"> +<?dbhtml filename="copyright.html"?> +<title>Copyright and License</title> +<para> + Copyright (c) 2006 by Hans-Jürgen Koch.</para> +<para> +This documentation is Free Software licensed under the terms of the +GPL version 2. +</para> +</sect1> + +<sect1 id="translations"> +<?dbhtml filename="translations.html"?> +<title>Translations</title> + +<para>If you know of any translations for this document, or you are +interested in translating it, please email me +<email>hjk@linutronix.de</email>. +</para> +</sect1> + +<sect1 id="preface"> +<title>Preface</title> + <para> + For many types of devices, creating a Linux kernel driver is + overkill. All that is really needed is some way to handle an + interrupt and provide access to the memory space of the + device. The logic of controlling the device does not + necessarily have to be within the kernel, as the device does + not need to take advantage of any of other resources that the + kernel provides. One such common class of devices that are + like this are for industrial I/O cards. + </para> + <para> + To address this situation, the userspace I/O system (UIO) was + designed. For typical industrial I/O cards, only a very small + kernel module is needed. The main part of the driver will run in + user space. This simplifies development and reduces the risk of + serious bugs within a kernel module. 
+ </para> +</sect1> + +<sect1 id="thanks"> +<title>Acknowledgments</title> + <para>I'd like to thank Thomas Gleixner and Benedikt Spranger of + Linutronix, who have not only written most of the UIO code, but also + helped greatly writing this HOWTO by giving me all kinds of background + information.</para> +</sect1> + +<sect1 id="feedback"> +<title>Feedback</title> + <para>Find something wrong with this document? (Or perhaps something + right?) I would love to hear from you. Please email me at + <email>hjk@linutronix.de</email>.</para> +</sect1> +</chapter> + +<chapter id="about"> +<?dbhtml filename="about.html"?> +<title>About UIO</title> + +<para>If you use UIO for your card's driver, here's what you get:</para> + +<itemizedlist> +<listitem> + <para>only one small kernel module to write and maintain.</para> +</listitem> +<listitem> + <para>develop the main part of your driver in user space, + with all the tools and libraries you're used to.</para> +</listitem> +<listitem> + <para>bugs in your driver won't crash the kernel.</para> +</listitem> +<listitem> + <para>updates of your driver can take place without recompiling + the kernel.</para> +</listitem> +<listitem> + <para>if you need to keep some parts of your driver closed source, + you can do so without violating the GPL license on the kernel.</para> +</listitem> +</itemizedlist> + +<sect1 id="how_uio_works"> +<title>How UIO works</title> + <para> + Each UIO device is accessed through a device file and several + sysfs attribute files. The device file will be called + <filename>/dev/uio0</filename> for the first device, and + <filename>/dev/uio1</filename>, <filename>/dev/uio2</filename> + and so on for subsequent devices. + </para> + + <para><filename>/dev/uioX</filename> is used to access the + address space of the card. Just use + <function>mmap()</function> to access registers or RAM + locations of your card. + </para> + + <para> + Interrupts are handled by reading from + <filename>/dev/uioX</filename>. 
A blocking + <function>read()</function> from + <filename>/dev/uioX</filename> will return as soon as an + interrupt occurs. You can also use + <function>select()</function> on + <filename>/dev/uioX</filename> to wait for an interrupt. The + integer value read from <filename>/dev/uioX</filename> + represents the total interrupt count. You can use this number + to figure out if you missed some interrupts. + </para> + + <para> + To handle interrupts properly, your custom kernel module can + provide its own interrupt handler. It will automatically be + called by the built-in handler. + </para> + + <para> + For cards that don't generate interrupts but need to be + polled, there is the possibility to set up a timer that + triggers the interrupt handler at configurable time intervals. + See <filename>drivers/uio/uio_dummy.c</filename> for an + example of this technique. + </para> + + <para> + Each driver provides attributes that are used to read or write + variables. These attributes are accessible through sysfs + files. A custom kernel driver module can add its own + attributes to the device owned by the uio driver, but not added + to the UIO device itself at this time. This might change in the + future if it would be found to be useful. + </para> + + <para> + The following standard attributes are provided by the UIO + framework: + </para> +<itemizedlist> +<listitem> + <para> + <filename>name</filename>: The name of your device. It is + recommended to use the name of your kernel module for this. + </para> +</listitem> +<listitem> + <para> + <filename>version</filename>: A version string defined by your + driver. This allows the user space part of your driver to deal + with different versions of the kernel module. + </para> +</listitem> +<listitem> + <para> + <filename>event</filename>: The total number of interrupts + handled by the driver since the last time the device node was + read. 
+ </para> +</listitem> +</itemizedlist> +<para> + These attributes appear under the + <filename>/sys/class/uio/uioX</filename> directory. Please + note that this directory might be a symlink, and not a real + directory. Any userspace code that accesses it must be able + to handle this. +</para> +<para> + Each UIO device can make one or more memory regions available for + memory mapping. This is necessary because some industrial I/O cards + require access to more than one PCI memory region in a driver. +</para> +<para> + Each mapping has its own directory in sysfs, the first mapping + appears as <filename>/sys/class/uio/uioX/maps/map0/</filename>. + Subsequent mappings create directories <filename>map1/</filename>, + <filename>map2/</filename>, and so on. These directories will only + appear if the size of the mapping is not 0. +</para> +<para> + Each <filename>mapX/</filename> directory contains two read-only files + that show start address and size of the memory: +</para> +<itemizedlist> +<listitem> + <para> + <filename>addr</filename>: The address of memory that can be mapped. + </para> +</listitem> +<listitem> + <para> + <filename>size</filename>: The size, in bytes, of the memory + pointed to by addr. + </para> +</listitem> +</itemizedlist> + +<para> + From userspace, the different mappings are distinguished by adjusting + the <varname>offset</varname> parameter of the + <function>mmap()</function> call. To map the memory of mapping N, you + have to use N times the page size as your offset: +</para> +<programlisting format="linespecific"> +offset = N * getpagesize(); +</programlisting> + +</sect1> +</chapter> + +<chapter id="using-uio_dummy" xreflabel="Using uio_dummy"> +<?dbhtml filename="using-uio_dummy.html"?> +<title>Using uio_dummy</title> + <para> + Well, there is no real use for uio_dummy. 
Its only purpose is + to test most parts of the UIO system (everything except + hardware interrupts), and to serve as an example for the + kernel module that you will have to write yourself. + </para> + +<sect1 id="what_uio_dummy_does"> +<title>What uio_dummy does</title> + <para> + The kernel module <filename>uio_dummy.ko</filename> creates a + device that uses a timer to generate periodic interrupts. The + interrupt handler does nothing but increment a counter. The + driver adds two custom attributes, <varname>count</varname> + and <varname>freq</varname>, that appear under + <filename>/sys/devices/platform/uio_dummy/</filename>. + </para> + + <para> + The attribute <varname>count</varname> can be read and + written. The associated file + <filename>/sys/devices/platform/uio_dummy/count</filename> + appears as a normal text file and contains the total number of + timer interrupts. If you look at it (e.g. using + <function>cat</function>), you'll notice it is slowly counting + up. + </para> + + <para> + The attribute <varname>freq</varname> can be read and written. + The content of + <filename>/sys/devices/platform/uio_dummy/freq</filename> + represents the number of system timer ticks between two timer + interrupts. The default value of <varname>freq</varname> is + the value of the kernel variable <varname>HZ</varname>, which + gives you an interval of one second. Lower values will + increase the frequency. Try the following: + </para> +<programlisting format="linespecific"> +cd /sys/devices/platform/uio_dummy/ +echo 100 > freq +</programlisting> + <para> + Use <function>cat count</function> to see how the interrupt + frequency changes. + </para> +</sect1> +</chapter> + +<chapter id="custom_kernel_module" xreflabel="Writing your own kernel module"> +<?dbhtml filename="custom_kernel_module.html"?> +<title>Writing your own kernel module</title> + <para> + Please have a look at <filename>uio_dummy.c</filename> as an + example. 
The following paragraphs explain the different + sections of this file. + </para> + +<sect1 id="uio_info"> +<title>struct uio_info</title> + <para> + This structure tells the framework the details of your driver. + Some of the members are required, others are optional. + </para> + +<itemizedlist> +<listitem><para> +<varname>char *name</varname>: Required. The name of your driver as +it will appear in sysfs. I recommend using the name of your module for this. +</para></listitem> + +<listitem><para> +<varname>char *version</varname>: Required. This string appears in +<filename>/sys/class/uio/uioX/version</filename>. +</para></listitem> + +<listitem><para> +<varname>struct uio_mem mem[ MAX_UIO_MAPS ]</varname>: Required if you +have memory that can be mapped with <function>mmap()</function>. For each +mapping you need to fill one of the <varname>uio_mem</varname> structures. +See the description below for details. +</para></listitem> + +<listitem><para> +<varname>long irq</varname>: Required. If your hardware generates an +interrupt, it's your module's task to determine the irq number during +initialization. If you don't have a hardware generated interrupt but +want to trigger the interrupt handler in some other way, set +<varname>irq</varname> to <varname>UIO_IRQ_CUSTOM</varname>. The +uio_dummy module does this as it triggers the event mechanism in a timer +routine. If you had no interrupt at all, you could set +<varname>irq</varname> to <varname>UIO_IRQ_NONE</varname>, though this +rarely makes sense. +</para></listitem> + +<listitem><para> +<varname>unsigned long irq_flags</varname>: Required if you've set +<varname>irq</varname> to a hardware interrupt number. The flags given +here will be used in the call to <function>request_irq()</function>. +</para></listitem> + +<listitem><para> +<varname>int (*mmap)(struct uio_info *info, struct vm_area_struct +*vma)</varname>: Optional. If you need a special +<function>mmap()</function> function, you can set it here. 
If this +pointer is not NULL, your <function>mmap()</function> will be called +instead of the built-in one. +</para></listitem> + +<listitem><para> +<varname>int (*open)(struct uio_info *info, struct inode *inode) +</varname>: Optional. You might want to have your own +<function>open()</function>, e.g. to enable interrupts only when your +device is actually used. +</para></listitem> + +<listitem><para> +<varname>int (*release)(struct uio_info *info, struct inode *inode) +</varname>: Optional. If you define your own +<function>open()</function>, you will probably also want a custom +<function>release()</function> function. +</para></listitem> +</itemizedlist> + +<para> +Usually, your device will have one or more memory regions that can be mapped +to user space. For each region, you have to set up a +<varname>struct uio_mem</varname> in the <varname>mem[]</varname> array. +Here's a description of the fields of <varname>struct uio_mem</varname>: +</para> + +<itemizedlist> +<listitem><para> +<varname>int memtype</varname>: Required if the mapping is used. Set this to +<varname>UIO_MEM_PHYS</varname> if you have physical memory on your +card to be mapped. Use <varname>UIO_MEM_LOGICAL</varname> for logical +memory (e.g. allocated with <function>kmalloc()</function>). There's also +<varname>UIO_MEM_VIRTUAL</varname> for virtual memory. +</para></listitem> + +<listitem><para> +<varname>unsigned long addr</varname>: Required if the mapping is used. +Fill in the address of your memory block. This address is the one that +appears in sysfs. +</para></listitem> + +<listitem><para> +<varname>unsigned long size</varname>: Fill in the size of the +memory block that <varname>addr</varname> points to. If <varname>size</varname> +is zero, the mapping is considered unused. Note that you +<emphasis>must</emphasis> initialize <varname>size</varname> with zero for +all unused mappings. 
+</para></listitem> + +<listitem><para> +<varname>void *internal_addr</varname>: If you have to access this memory +region from within your kernel module, you will want to map it internally by +using something like <function>ioremap()</function>. Addresses +returned by this function cannot be mapped to user space, so you must not +store it in <varname>addr</varname>. Use <varname>internal_addr</varname> +instead to remember such an address. +</para></listitem> +</itemizedlist> + +<para> +Please do not touch the <varname>kobj</varname> element of +<varname>struct uio_mem</varname>! It is used by the UIO framework +to set up sysfs files for this mapping. Simply leave it alone. +</para> +</sect1> + +<sect1 id="adding_irq_handler"> +<title>Adding an interrupt handler</title> + <para> + What you need to do in your interrupt handler depends on your + hardware and on how you want to handle it. You should try to + keep the amount of code in your kernel interrupt handler low. + If your hardware requires no action that you + <emphasis>have</emphasis> to perform after each interrupt, + then your handler can be empty.</para> <para>If, on the other + hand, your hardware <emphasis>needs</emphasis> some action to + be performed after each interrupt, then you + <emphasis>must</emphasis> do it in your kernel module. Note + that you cannot rely on the userspace part of your driver. Your + userspace program can terminate at any time, possibly leaving + your hardware in a state where proper interrupt handling is + still required. + </para> + + <para> + There might also be applications where you want to read data + from your hardware at each interrupt and buffer it in a piece + of kernel memory you've allocated for that purpose. With this + technique you could avoid loss of data if your userspace + program misses an interrupt. + </para> + + <para> + A note on shared interrupts: Your driver should support + interrupt sharing whenever this is possible. 
It is possible if + and only if your driver can detect whether your hardware has + triggered the interrupt or not. This is usually done by looking + at an interrupt status register. If your driver sees that the + IRQ bit is actually set, it will perform its actions, and the + handler returns IRQ_HANDLED. If the driver detects that it was + not your hardware that caused the interrupt, it will do nothing + and return IRQ_NONE, allowing the kernel to call the next + possible interrupt handler. + </para> + + <para> + If you decide not to support shared interrupts, your card + won't work in computers with no free interrupts. As this + frequently happens on the PC platform, you can save yourself a + lot of trouble by supporting interrupt sharing. + </para> +</sect1> + +</chapter> + +<chapter id="userspace_driver" xreflabel="Writing a driver in user space"> +<?dbhtml filename="userspace_driver.html"?> +<title>Writing a driver in userspace</title> + <para> + Once you have a working kernel module for your hardware, you can + write the userspace part of your driver. You don't need any special + libraries, your driver can be written in any reasonable language, + you can use floating point numbers and so on. In short, you can + use all the tools and libraries you'd normally use for writing a + userspace application. + </para> + +<sect1 id="getting_uio_information"> +<title>Getting information about your UIO device</title> + <para> + Information about all UIO devices is available in sysfs. The + first thing you should do in your driver is check + <varname>name</varname> and <varname>version</varname> to + make sure you're talking to the right device and that its kernel + driver has the version you expect. + </para> + <para> + You should also make sure that the memory mapping you need + exists and has the size you expect. + </para> + <para> + There is a tool called <varname>lsuio</varname> that lists + UIO devices and their attributes. 
It is available here: + </para> + <para> + <ulink url="http://www.osadl.org/projects/downloads/UIO/user/"> + http://www.osadl.org/projects/downloads/UIO/user/</ulink> + </para> + <para> + With <varname>lsuio</varname> you can quickly check if your + kernel module is loaded and which attributes it exports. + Have a look at the manpage for details. + </para> + <para> + The source code of <varname>lsuio</varname> can serve as an + example for getting information about a UIO device. + The file <filename>uio_helper.c</filename> contains a lot of + functions you could use in your userspace driver code. + </para> +</sect1> + +<sect1 id="mmap_device_memory"> +<title>mmap() device memory</title> + <para> + After you have made sure you've got the right device with the + memory mappings you need, all you have to do is to call + <function>mmap()</function> to map the device's memory + to userspace. + </para> + <para> + The parameter <varname>offset</varname> of the + <function>mmap()</function> call has a special meaning + for UIO devices: It is used to select which mapping of + your device you want to map. To map the memory of + mapping N, you have to use N times the page size as + your offset: + </para> +<programlisting format="linespecific"> + offset = N * getpagesize(); +</programlisting> + <para> + N starts from zero, so if you've got only one memory + range to map, set <varname>offset = 0</varname>. + A drawback of this technique is that memory is always + mapped beginning with its start address. + </para> +</sect1> + +<sect1 id="wait_for_interrupts"> +<title>Waiting for interrupts</title> + <para> + After you have successfully mapped your device's memory, you + can access it like an ordinary array. Usually, you will + perform some initialization. After that, your hardware + starts working and will generate an interrupt as soon + as it's finished, has some data available, or needs your + attention because an error occurred. 
+ </para> + <para> + <filename>/dev/uioX</filename> is a read-only file. A + <function>read()</function> will always block until an + interrupt occurs. There is only one legal value for the + <varname>count</varname> parameter of + <function>read()</function>, and that is the size of a + signed 32 bit integer (4). Any other value for + <varname>count</varname> causes <function>read()</function> + to fail. The signed 32 bit integer read is the interrupt + count of your device. If the value is one more than the value + you read the last time, everything is OK. If the difference + is greater than one, you missed interrupts. + </para> + <para> + You can also use <function>select()</function> on + <filename>/dev/uioX</filename>. + </para> +</sect1> + +</chapter> + +<appendix id="app1"> +<title>Further information</title> +<itemizedlist> + <listitem><para> + <ulink url="http://www.osadl.org"> + OSADL homepage.</ulink> + </para></listitem> + <listitem><para> + <ulink url="http://www.linutronix.de"> + Linutronix homepage.</ulink> + </para></listitem> +</itemizedlist> +</appendix> + +</book> diff --git a/Documentation/HOWTO b/Documentation/HOWTO index ced9207bedcf..f8cc3f8ed152 100644 --- a/Documentation/HOWTO +++ b/Documentation/HOWTO @@ -249,6 +249,9 @@ process is as follows: release a new -rc kernel every week. - Process continues until the kernel is considered "ready", the process should last around 6 weeks. + - A list of known regressions present in each -rc release is + tracked at the following URI: + http://kernelnewbies.org/known_regressions It is worth mentioning what Andrew Morton wrote on the linux-kernel mailing list about kernel releases: @@ -322,39 +325,34 @@ kernel releases as described above. 
Here is a list of some of the different kernel trees available: git trees: - Kbuild development tree, Sam Ravnborg <sam@ravnborg.org> - kernel.org:/pub/scm/linux/kernel/git/sam/kbuild.git + git.kernel.org:/pub/scm/linux/kernel/git/sam/kbuild.git - ACPI development tree, Len Brown <len.brown@intel.com> - kernel.org:/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6.git + git.kernel.org:/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6.git - Block development tree, Jens Axboe <axboe@suse.de> - kernel.org:/pub/scm/linux/kernel/git/axboe/linux-2.6-block.git + git.kernel.org:/pub/scm/linux/kernel/git/axboe/linux-2.6-block.git - DRM development tree, Dave Airlie <airlied@linux.ie> - kernel.org:/pub/scm/linux/kernel/git/airlied/drm-2.6.git + git.kernel.org:/pub/scm/linux/kernel/git/airlied/drm-2.6.git - ia64 development tree, Tony Luck <tony.luck@intel.com> - kernel.org:/pub/scm/linux/kernel/git/aegl/linux-2.6.git - - - ieee1394 development tree, Jody McIntyre <scjody@modernduck.com> - kernel.org:/pub/scm/linux/kernel/git/scjody/ieee1394.git + git.kernel.org:/pub/scm/linux/kernel/git/aegl/linux-2.6.git - infiniband, Roland Dreier <rolandd@cisco.com> - kernel.org:/pub/scm/linux/kernel/git/roland/infiniband.git + git.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband.git - libata, Jeff Garzik <jgarzik@pobox.com> - kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev.git + git.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev.git - network drivers, Jeff Garzik <jgarzik@pobox.com> - kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6.git + git.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6.git - pcmcia, Dominik Brodowski <linux@dominikbrodowski.net> - kernel.org:/pub/scm/linux/kernel/git/brodo/pcmcia-2.6.git + git.kernel.org:/pub/scm/linux/kernel/git/brodo/pcmcia-2.6.git - SCSI, James Bottomley <James.Bottomley@SteelEye.com> - kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6.git - - Other git kernel trees can be found listed at http://kernel.org/git + 
git.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6.git quilt trees: - USB, PCI, Driver Core, and I2C, Greg Kroah-Hartman <gregkh@suse.de> @@ -362,6 +360,9 @@ Here is a list of some of the different kernel trees available: - x86-64, partly i386, Andi Kleen <ak@suse.de> ftp.firstfloor.org:/pub/ak/x86_64/quilt/ + Other kernel trees can be found listed at http://git.kernel.org/ and in + the MAINTAINERS file. + Bug Reporting ------------- diff --git a/Documentation/RCU/checklist.txt b/Documentation/RCU/checklist.txt index f4dffadbcb00..42b01bc2e1b4 100644 --- a/Documentation/RCU/checklist.txt +++ b/Documentation/RCU/checklist.txt @@ -222,7 +222,15 @@ over a rather long period of time, but improvements are always welcome! deadlock as soon as the RCU callback happens to interrupt that acquisition's critical section. -13. SRCU (srcu_read_lock(), srcu_read_unlock(), and synchronize_srcu()) +13. RCU callbacks can be and are executed in parallel. In many cases, + the callback code simply wrappers around kfree(), so that this + is not an issue (or, more accurately, to the extent that it is + an issue, the memory-allocator locking handles it). However, + if the callbacks do manipulate a shared data structure, they + must use whatever locking or other synchronization is required + to safely access and/or modify that data structure. + +14. SRCU (srcu_read_lock(), srcu_read_unlock(), and synchronize_srcu()) may only be invoked from process context. Unlike other forms of RCU, it -is- permissible to block in an SRCU read-side critical section (demarked by srcu_read_lock() and srcu_read_unlock()), diff --git a/Documentation/SM501.txt b/Documentation/SM501.txt new file mode 100644 index 000000000000..3a1bd95d3767 --- /dev/null +++ b/Documentation/SM501.txt @@ -0,0 +1,66 @@ + SM501 Driver + ============ + +Copyright 2006, 2007 Simtec Electronics + +Core +---- + +The core driver in drivers/mfd provides common services for the +drivers which manage the specific hardware blocks. 
These services +include locking for common registers, clock control and resource +management. + +The core registers drivers for both PCI and generic bus based +chips via the platform device and driver system. + +On detection of a device, the core initialises the chip (which may +be specified by the platform data) and then exports the selected +peripheral set as platform devices for the specific drivers. + +The core re-uses the platform device system as the platform device +system provides enough features to support the drivers without the +need to create a new bus-type and the associated code to go with it. + + +Resources +--------- + +Each peripheral has a view of the device which is implicitly narrowed to +the specific set of resources that peripheral requires in order to +function correctly. + +The centralised memory allocation allows the driver to ensure that the +maximum possible resource allocation can be made to the video subsystem +as this is by-far the most resource-sensitive of the on-chip functions. + +The primary issue with memory allocation is that of moving the video +buffers once a display mode is chosen. Indeed when a video mode change +occurs the memory footprint of the video subsystem changes. + +Since video memory is difficult to move without changing the display +(unless sufficient contiguous memory can be provided for the old and new +modes simultaneously) the video driver fully utilises the memory area +given to it by aligning fb0 to the start of the area and fb1 to the end +of it. Any memory left over in the middle is used for the acceleration +functions, which are transient and thus their location is less critical +as it can be moved. + + +Configuration +------------- + +The platform device driver uses a set of platform data to pass +configurations through to the core and the subsidiary drivers +so that there can be support for more than one system carrying +an SM501 built into a single kernel image. 
+ +The PCI driver assumes that the PCI card behaves as per the Silicon +Motion reference design. + +There is an erratum (AB-5) affecting the selection of the +M1XCLK and M1CLK frequencies. These two clocks +must be sourced from the same PLL, although they can then +be divided down individually. If this is not set, then SM501 may +lock and hang the whole system. The driver will refuse to +attach if the PLL selection is different. diff --git a/Documentation/SubmitChecklist b/Documentation/SubmitChecklist index 6ebffb57e3db..19e7f65c269f 100644 --- a/Documentation/SubmitChecklist +++ b/Documentation/SubmitChecklist @@ -1,4 +1,4 @@ -Linux Kernel patch sumbittal checklist +Linux Kernel patch submission checklist ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Here are some basic things that developers should do if they want to see their @@ -9,7 +9,6 @@ Documentation/SubmittingPatches and elsewhere regarding submitting Linux kernel patches. - 1: Builds cleanly with applicable or modified CONFIG options =y, =m, and =n. No gcc warnings/errors, no linker warnings/errors. diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches index d91125ab6f49..3f9a7912e69b 100644 --- a/Documentation/SubmittingPatches +++ b/Documentation/SubmittingPatches @@ -340,8 +340,32 @@ now, but you can do this to mark internal company procedures or just point out some special detail about the sign-off. +13) When to use Acked-by: -13) The canonical patch format +The Signed-off-by: tag indicates that the signer was involved in the +development of the patch, or that he/she was in the patch's delivery path. + +If a person was not directly involved in the preparation or handling of a +patch but wishes to signify and record their approval of it then they can +arrange to have an Acked-by: line added to the patch's changelog. + +Acked-by: is often used by the maintainer of the affected code when that +maintainer neither contributed to nor forwarded the patch. 
+ +Acked-by: is not as formal as Signed-off-by:. It is a record that the acker +has at least reviewed the patch and has indicated acceptance. Hence patch +mergers will sometimes manually convert an acker's "yep, looks good to me" +into an Acked-by:. + +Acked-by: does not necessarily indicate acknowledgement of the entire patch. +For example, if a patch affects multiple subsystems and has an Acked-by: from +one subsystem maintainer then this usually indicates acknowledgement of just +the part which affects that maintainer's code. Judgement should be used here. + When in doubt people should refer to the original discussion in the mailing +list archives. + + +14) The canonical patch format The canonical patch subject line is: @@ -440,9 +464,25 @@ section Linus Computer Science 101. Nuff said. If your code deviates too much from this, it is likely to be rejected without further review, and without comment. +One significant exception is when moving code from one file to +another; in this case you should not modify the moved code at all in +the same patch which moves it. This clearly delineates the act of +moving the code and your changes. This greatly aids review of the +actual differences and allows tools to better track the history of +the code itself. + Check your patches with the patch style checker prior to submission -(scripts/checkpatch.pl). You should be able to justify all -violations that remain in your patch. +(scripts/checkpatch.pl). The style checker should be viewed as +a guide, not as the final word. If your code looks better with +a violation then it's probably best left alone. + +The checker reports at three levels: + - ERROR: things that are very likely to be wrong + - WARNING: things requiring careful review + - CHECK: things requiring thought + +You should be able to justify all violations that remain in your +patch. 
diff --git a/Documentation/accounting/getdelays.c b/Documentation/accounting/getdelays.c
index 71acc28ed0d1..24c5aade8998 100644
--- a/Documentation/accounting/getdelays.c
+++ b/Documentation/accounting/getdelays.c
@@ -49,6 +49,7 @@ char name[100];
 int dbg;
 int print_delays;
 int print_io_accounting;
+int print_task_context_switch_counts;
 __u64 stime, utime;

 #define PRINTF(fmt, arg...) { \
@@ -195,7 +196,7 @@ void print_delayacct(struct taskstats *t)
 	       "IO %15s%15s\n"
 	       " %15llu%15llu\n"
 	       "MEM %15s%15s\n"
-	       " %15llu%15llu\n\n",
+	       " %15llu%15llu\n"
 	       "count", "real total", "virtual total", "delay total",
 	       t->cpu_count, t->cpu_run_real_total, t->cpu_run_virtual_total,
 	       t->cpu_delay_total,
@@ -204,6 +205,14 @@ void print_delayacct(struct taskstats *t)
 	       "count", "delay total", t->swapin_count, t->swapin_delay_total);
 }

+void task_context_switch_counts(struct taskstats *t)
+{
+	printf("\n\nTask %15s%15s\n"
+	       " %15lu%15lu\n",
+	       "voluntary", "nonvoluntary",
+	       t->nvcsw, t->nivcsw);
+}
+
 void print_ioacct(struct taskstats *t)
 {
 	printf("%s: read=%llu, write=%llu, cancelled_write=%llu\n",
@@ -235,7 +244,7 @@ int main(int argc, char *argv[])
 	struct msgtemplate msg;

 	while (1) {
-		c = getopt(argc, argv, "diw:r:m:t:p:vl");
+		c = getopt(argc, argv, "qdiw:r:m:t:p:vl");
 		if (c < 0)
 			break;

@@ -248,6 +257,10 @@ int main(int argc, char *argv[])
 			printf("printing IO accounting\n");
 			print_io_accounting = 1;
 			break;
+		case 'q':
+			printf("printing task/process context switch rates\n");
+			print_task_context_switch_counts = 1;
+			break;
 		case 'w':
 			logfile = strdup(optarg);
 			printf("write to file %s\n", logfile);
@@ -389,6 +402,8 @@ int main(int argc, char *argv[])
 						print_delayacct((struct taskstats *) NLA_DATA(na));
 					if (print_io_accounting)
 						print_ioacct((struct taskstats *) NLA_DATA(na));
+					if (print_task_context_switch_counts)
+						task_context_switch_counts((struct taskstats *) NLA_DATA(na));
 					if (fd) {
 						if (write(fd, NLA_DATA(na), na->nla_len) < 0) {
 							err(1,"write error\n");
diff --git a/Documentation/accounting/taskstats-struct.txt b/Documentation/accounting/taskstats-struct.txt
index 661c797eaf79..8aa7529f8258 100644
--- a/Documentation/accounting/taskstats-struct.txt
+++ b/Documentation/accounting/taskstats-struct.txt
@@ -22,6 +22,8 @@ There are three different groups of fields in the struct taskstats:
     /* Extended accounting fields end */
    Their values are collected if CONFIG_TASK_XACCT is set.

+4) Per-task and per-thread context switch count statistics
+
 Future extension should add fields to the end of the taskstats struct, and
 should not change the relative position of each field within the struct.

@@ -158,4 +160,8 @@ struct taskstats {

 	/* Extended accounting fields end */

+4) Per-task and per-thread statistics
+	__u64 nvcsw; /* Context voluntary switch counter */
+	__u64 nivcsw; /* Context involuntary switch counter */
+
 }
diff --git a/Documentation/atomic_ops.txt b/Documentation/atomic_ops.txt
index 2a63d5662a93..05851e9982ed 100644
--- a/Documentation/atomic_ops.txt
+++ b/Documentation/atomic_ops.txt
@@ -149,7 +149,7 @@ defined which accomplish this:
 	void smp_mb__before_atomic_dec(void);
 	void smp_mb__after_atomic_dec(void);
 	void smp_mb__before_atomic_inc(void);
-	void smp_mb__after_atomic_dec(void);
+	void smp_mb__after_atomic_inc(void);

 For example, smp_mb__before_atomic_dec() can be used like so:

diff --git a/Documentation/blackfin/kgdb.txt b/Documentation/blackfin/kgdb.txt
new file mode 100644
index 000000000000..84f6a484ae9a
--- /dev/null
+++ b/Documentation/blackfin/kgdb.txt
@@ -0,0 +1,155 @@
+			A Simple Guide to Configure KGDB
+
+			Sonic Zhang <sonic.zhang@analog.com>
+				Aug. 24th 2006
+
+
+This KGDB patch enables the kernel developer to do source level debugging on
+the kernel for the Blackfin architecture. The debugging works over either the
+ethernet interface or one of the uarts. Both software breakpoints and
+hardware breakpoints are supported in this version.
+http://docs.blackfin.uclinux.org/doku.php?id=kgdb
+
+
+2 known issues:
+1. This bug:
+   http://blackfin.uclinux.org/tracker/index.php?func=detail&aid=544&group_id=18&atid=145
+   The GDB client for Blackfin uClinux causes incorrect values of local
+   variables to be displayed when the user breaks the running of kernel in GDB.
+2. Because of a hardware bug in Blackfin 533 v1.0.3:
+   05000067 - Watchpoints (Hardware Breakpoints) are not supported
+   Hardware breakpoints cannot be set properly.
+
+
+Debug over Ethernet:
+
+1. Compile and install the cross platform version of gdb for blackfin, which
+   can be found at $(BINROOT)/bfin-elf-gdb.
+
+2. Apply this patch to the 2.6.x kernel. Select the menuconfig option under
+   "Kernel hacking" -> "Kernel debugging" -> "KGDB: kernel debug with remote gdb".
+   With this selected, option "Full Symbolic/Source Debugging support" and
+   "Compile the kernel with frame pointers" are also selected.
+
+3. Select option "KGDB: connect over (Ethernet)". Add "kgdboe=@target-IP/,@host-IP/" to
+   the option "Compiled-in Kernel Boot Parameter" under "Kernel hacking".
+
+4. Connect minicom to the serial port and boot the kernel image.
+
+5. Configure the IP "/> ifconfig eth0 target-IP"
+
+6. Start GDB client "bfin-elf-gdb vmlinux".
+
+7. Connect to the target "(gdb) target remote udp:target-IP:6443".
+
+8. Set software breakpoint "(gdb) break sys_open".
+
+9. Continue "(gdb) c".
+
+10. Run ls in the target console "/> ls".
+
+11. Breakpoint hits. "Breakpoint 1: sys_open(..."
+
+12. Display local variables and function paramters.
+    (*) This operation gives wrong results, see known issue 1.
+
+13. Single stepping "(gdb) si".
+
+14. Remove breakpoint 1. "(gdb) del 1"
+
+15. Set hardware breakpoint "(gdb) hbreak sys_open".
+
+16. Continue "(gdb) c".
+
+17. Run ls in the target console "/> ls".
+
+18. Hardware breakpoint hits. "Breakpoint 1: sys_open(...".
+    (*) This hardware breakpoint will not be hit, see known issue 2.
+
+19. Continue "(gdb) c".
+
+20. Interrupt the target in GDB "Ctrl+C".
+
+21. Detach from the target "(gdb) detach".
+
+22. Exit GDB "(gdb) quit".
+
+
+Debug over the UART:
+
+1. Compile and install the cross platform version of gdb for blackfin, which
+   can be found at $(BINROOT)/bfin-elf-gdb.
+
+2. Apply this patch to the 2.6.x kernel. Select the menuconfig option under
+   "Kernel hacking" -> "Kernel debugging" -> "KGDB: kernel debug with remote gdb".
+   With this selected, option "Full Symbolic/Source Debugging support" and
+   "Compile the kernel with frame pointers" are also selected.
+
+3. Select option "KGDB: connect over (UART)". Set "KGDB: UART port number" to be
+   a different one from the console. Don't forget to change the mode of
+   blackfin serial driver to PIO. Otherwise kgdb works incorrectly on UART.
+
+4. If you want connect to kgdb when the kernel boots, enable
+   "KGDB: Wait for gdb connection early"
+
+5. Compile kernel.
+
+6. Connect minicom to the serial port of the console and boot the kernel image.
+
+7. Start GDB client "bfin-elf-gdb vmlinux".
+
+8. Set the baud rate in GDB "(gdb) set remotebaud 57600".
+
+9. Connect to the target on the second serial port "(gdb) target remote /dev/ttyS1".
+
+10. Set software breakpoint "(gdb) break sys_open".
+
+11. Continue "(gdb) c".
+
+12. Run ls in the target console "/> ls".
+
+13. A breakpoint is hit. "Breakpoint 1: sys_open(..."
+
+14. All other operations are the same as that in KGDB over Ethernet.
+
+
+Debug over the same UART as console:
+
+1. Compile and install the cross platform version of gdb for blackfin, which
+   can be found at $(BINROOT)/bfin-elf-gdb.
+
+2. Apply this patch to the 2.6.x kernel. Select the menuconfig option under
+   "Kernel hacking" -> "Kernel debugging" -> "KGDB: kernel debug with remote gdb".
+   With this selected, option "Full Symbolic/Source Debugging support" and
+   "Compile the kernel with frame pointers" are also selected.
+
+3. Select option "KGDB: connect over UART". Set "KGDB: UART port number" to console.
+   Don't forget to change the mode of blackfin serial driver to PIO.
+   Otherwise kgdb works incorrectly on UART.
+
+4. If you want connect to kgdb when the kernel boots, enable
+   "KGDB: Wait for gdb connection early"
+
+5. Connect minicom to the serial port and boot the kernel image.
+
+6. (Optional) Ask target to wait for gdb connection by entering Ctrl+A. In minicom, you should enter Ctrl+A+A.
+
+7. Start GDB client "bfin-elf-gdb vmlinux".
+
+8. Set the baud rate in GDB "(gdb) set remotebaud 57600".
+
+9. Connect to the target "(gdb) target remote /dev/ttyS0".
+
+10. Set software breakpoint "(gdb) break sys_open".
+
+11. Continue "(gdb) c". Then enter Ctrl+C twice to stop GDB connection.
+
+12. Run ls in the target console "/> ls". Dummy string can be seen on the console.
+
+13. Then connect the gdb to target again. "(gdb) target remote /dev/ttyS0".
+    Now you will find a breakpoint is hit. "Breakpoint 1: sys_open(..."
+
+14. All other operations are the same as that in KGDB over Ethernet. The only
+    difference is that after continue command in GDB, please stop GDB
+    connection by 2 "Ctrl+C"s and connect again after breakpoints are hit or
+    Ctrl+A is entered.
diff --git a/Documentation/block/barrier.txt b/Documentation/block/barrier.txt
index a272c3db8094..7d279f2f5bb2 100644
--- a/Documentation/block/barrier.txt
+++ b/Documentation/block/barrier.txt
@@ -82,23 +82,12 @@ including draining and flushing.
 typedef void (prepare_flush_fn)(request_queue_t *q, struct request *rq);

 int blk_queue_ordered(request_queue_t *q, unsigned ordered,
-		      prepare_flush_fn *prepare_flush_fn,
-		      unsigned gfp_mask);
-
-int blk_queue_ordered_locked(request_queue_t *q, unsigned ordered,
-			     prepare_flush_fn *prepare_flush_fn,
-			     unsigned gfp_mask);
-
-The only difference between the two functions is whether or not the
-caller is holding q->queue_lock on entry. The latter expects the
-caller is holding the lock.
+		      prepare_flush_fn *prepare_flush_fn);

 @q : the queue in question
 @ordered : the ordered mode the driver/device supports
 @prepare_flush_fn : this function should prepare @rq such that it
 	flushes cache to physical medium when executed
-@gfp_mask : gfp_mask used when allocating data structures
-	for ordered processing

 For example, SCSI disk driver's prepare_flush_fn looks like the
 following.
@@ -106,9 +95,10 @@ following.
 static void sd_prepare_flush(request_queue_t *q, struct request *rq)
 {
 	memset(rq->cmd, 0, sizeof(rq->cmd));
-	rq->flags |= REQ_BLOCK_PC;
+	rq->cmd_type = REQ_TYPE_BLOCK_PC;
 	rq->timeout = SD_TIMEOUT;
 	rq->cmd[0] = SYNCHRONIZE_CACHE;
+	rq->cmd_len = 10;
 }

 The following seven ordered modes are supported. The following table
diff --git a/Documentation/cachetlb.txt b/Documentation/cachetlb.txt
index debf6813934a..866b76139420 100644
--- a/Documentation/cachetlb.txt
+++ b/Documentation/cachetlb.txt
@@ -253,7 +253,7 @@ Here are the routines, one by one:

 	The first of these two routines is invoked after map_vm_area()
 	has installed the page table entries. The second is invoked
-	before unmap_vm_area() deletes the page table entries.
+	before unmap_kernel_range() deletes the page table entries.

 There exists another whole class of cpu cache issues which currently
 require a whole different set of interfaces to handle properly.
diff --git a/Documentation/cdrom/00-INDEX b/Documentation/cdrom/00-INDEX
index 916dafe29d3f..433edf23dc49 100644
--- a/Documentation/cdrom/00-INDEX
+++ b/Documentation/cdrom/00-INDEX
@@ -2,32 +2,10 @@
 	- this file (info on CD-ROMs and Linux)
 Makefile
 	- only used to generate TeX output from the documentation.
-aztcd
-	- info on Aztech/Orchid/Okano/Wearnes/Conrad/CyCDROM driver.
 cdrom-standard.tex
 	- LaTeX document on standardizing the CD-ROM programming interface.
-cdu31a
-	- info on the Sony CDU31A/CDU33A CD-ROM driver.
-cm206
-	- info on the Philips/LMS cm206/cm260 CD-ROM driver.
-gscd
-	- info on the Goldstar R420 CD-ROM driver.
 ide-cd
 	- info on setting up and using ATAPI (aka IDE) CD-ROMs.
-isp16
-	- info on the CD-ROM interface on ISP16, MAD16 or Mozart sound card.
-mcd
-	- info on limitations of standard Mitsumi CD-ROM driver.
-mcdx
-	- info on improved Mitsumi CD-ROM driver.
-optcd
-	- info on the Optics Storage 8000 AT CD-ROM driver
 packet-writing.txt
 	- Info on the CDRW packet writing module
-sbpcd
-	- info on the SoundBlaster/Panasonic CD-ROM interface driver.
-sjcd
-	- info on the SANYO CDR-H94A CD-ROM interface driver.
-sonycd535
-	- info on the Sony CDU-535 (and 531) CD-ROM driver.
diff --git a/Documentation/cdrom/aztcd b/Documentation/cdrom/aztcd
deleted file mode 100644
index 6bf0290ef7ce..000000000000
--- a/Documentation/cdrom/aztcd
+++ /dev/null
@@ -1,822 +0,0 @@
-$Id: README.aztcd,v 2.60 1997/11/29 09:51:25 root Exp root $
-          Readme-File Documentation/cdrom/aztcd
-                           for
-             AZTECH CD-ROM CDA268-01A, ORCHID CD-3110,
-      OKANO/WEARNES CDD110, CONRAD TXC, CyCDROM CR520, CR540
-                       CD-ROM Drives
-                    Version 2.6 and newer
-               (for other drives see 6.-8.)
-
-NOTE: THIS DRIVER WILL WORK WITH THE CD-ROM DRIVES LISTED, WHICH HAVE
-      A PROPRIETARY INTERFACE (implemented on a sound card or on an
-      ISA-AT-bus card).
-      IT WILL DEFINITELY NOT WORK WITH CD-ROM DRIVES WITH *IDE*-INTERFACE,
-      such as the Aztech CDA269-031SE !!! (The only known exceptions are
-      'faked' IDE drives like the CyCDROM CR520ie which work with aztcd
-      under certain conditions, see 7.). IF YOU'RE USING A CD-ROM DRIVE
-      WITH IDE-INTERFACE, SOMETIMES ALSO CALLED ATAPI-COMPATIBLE, PLEASE
-      USE THE ide-cd.c DRIVER, WRITTEN BY MARK LORD AND SCOTT SNYDER !
-      THE STANDARD-KERNEL 1.2.x NOW ALSO SUPPORTS IDE-CDROM-DRIVES, SEE THE
-      HARDDISK (!) SECTION OF make config, WHEN COMPILING A NEW KERNEL!!!
-----------------------------------------------------------------------------
-
-Contents of this file:
-      1. NOTE
-      2. INSTALLATION
-      3. CONFIGURING YOUR KERNEL
-      4. RECOMPILING YOUR KERNEL
-        4.1 AZTCD AS A RUN-TIME LOADABLE MODULE
-        4.2 CDROM CONNECTED TO A SOUNDCARD
-      5. KNOWN PROBLEMS, FUTURE DEVELOPMENTS
-        5.1 MULTISESSION SUPPORT
-        5.2 STATUS RECOGNITION
-        5.3 DOSEMU's CDROM SUPPORT
-      6. BUG REPORTS
-      7. OTHER DRIVES
-      8. IF YOU DON'T SUCCEED ... DEBUGGING
-      9. TECHNICAL HISTORY OF THE DRIVER
-     10. ACKNOWLEDGMENTS
-     11. PROGRAMMING ADD ONS: CDPLAY.C
-     APPENDIX: Source code of cdplay.c
-----------------------------------------------------------------------------
-
-1. NOTE
-This software has been successfully in alpha and beta test and is part of
-the standard kernel since kernel 1.1.8x since December 1994. It works with
-AZTECH CDA268-01A, ORCHID CDS-3110, ORCHID/WEARNES CDD110 and CONRAD TXC
-(Nr.99 31 23 -series 04) and has proven to be stable with kernel
-versions 1.0.9 and newer. But with any software there still may be bugs in it.
-So if you encounter problems, you are invited to help us improve this software.
-Please send me a detailed bug report (see chapter BUG REPORTS). You are also
-invited in helping us to increase the number of drives, which are supported.
-
-Please read the README-files carefully and always keep a backup copy of your
-old kernel, in order to reboot if something goes wrong!
-
-2. INSTALLATION
-The driver consists of a header file 'aztcd.h', which normally should reside
-in /usr/src/linux/drivers/cdrom and the source code 'aztcd.c', which normally
-resides in the same place. It uses /dev/aztcd (/dev/aztcd0 in some distri-
-butions), which must be a valid block device with major number 29 and reside
-in directory /dev. To mount a CD-ROM, your kernel needs to have the ISO9660-
-filesystem support included.
-
-PLEASE NOTE: aztcd.c has been developed in parallel to the linux kernel,
-which had and is having many major and minor changes which are not backward
-compatible. Quite definitely aztcd.c version 1.80 and newer will NOT work
-in kernels older than 1.3.33. So please always use the most recent version
-of aztcd.c with the appropriate linux-kernel.
-
-3. CONFIGURING YOUR KERNEL
-If your kernel is already configured for using the AZTECH driver you will
-see the following message while Linux boots:
-    Aztech CD-ROM Init: DriverVersion=<version number> BaseAddress=<baseaddress>
-    Aztech CD-ROM Init: FirmwareVersion=<firmware version id of your I/O-card>>>
-    Aztech CD-ROM Init: <drive type> detected
-    Aztech CD-ROM Init: End
-If the message looks different and you are sure to have a supported drive,
-it may have a different base address. The Aztech driver does look for the
-CD-ROM drive at the base address specified in aztcd.h at compile time. This
-address can be overwritten by boot parameter aztcd=....You should reboot and
-start Linux with boot parameter aztcd=<base address>, e.g. aztcd=0x320. If
-you do not know the base address, start your PC with DOS and look at the boot
-message of your CD-ROM's DOS driver. If that still does not help, use boot
-parameter aztcd=<base address>,0x79 , this tells aztcd to try a little harder.
-aztcd may be configured to use autoprobing the base address by recompiling
-it (see chapter 4.).
-
-If the message looks correct, as user 'root' you should be able to mount the
-drive by
-    mount -t iso9660 -r /dev/aztcd0 /mnt
-and use it as any other filesystem. (If this does not work, check if
-/dev/aztcd0 and /mnt do exist and create them, if necessary by doing
-    mknod /dev/aztcd0 b 29 0
-    mkdir /mnt
-
-If you still get a different message while Linux boots or when you get the
-message, that the ISO9660-filesystem is not supported by your kernel, when
-you try to mount the CD-ROM drive, you have to recompile your kernel.
-
-If you do *not* have an Aztech/Orchid/Okano/Wearnes/TXC drive and want to
-bypass drive detection during Linux boot up, start with boot parameter aztcd=0.
-
-Most distributions nowadays do contain a boot disk image containing aztcd.
-Please note, that this driver will not work with IDE/ATAPI drives! With these
-you must use ide-cd.c instead.
-
-4. RECOMPILING YOUR KERNEL
-If your kernel is not yet configured for the AZTECH driver and the ISO9660-
-filesystem, you have to recompile your kernel:
-
-- Edit aztcd.h to set the I/O-address to your I/O-Base address (AZT_BASE_ADDR),
-  the driver does not use interrupts or DMA, so if you are using an AZTECH
-  CD268, an ORCHID CD-3110 or ORCHID/WEARNES CDD110 that's the only item you
-  have to set up. If you have a soundcard, read chapter 4.2.
-  Users of other drives should read chapter OTHER DRIVES of this file.
-  You also can configure that address by kernel boot parameter aztcd=...
-- aztcd may be configured to use autoprobing the base address by setting
-  AZT_BASE_ADDR to '-1'. In that case aztcd probes the addresses listed
-  under AZT_BASE_AUTO. But please remember, that autoprobing always may
-  incorrectly influence other hardware components too!
-- There are some other points, which may be configured, e.g. auto-eject the
-  CD when unmounting a drive, tray locking etc., see aztcd.h for details.
-- If you're using a linux kernel version prior to 2.1.0, in aztcd.h
-  uncomment the line '#define AZT_KERNEL_PRIOR_2_1'
-- Build a new kernel, configure it for 'Aztech/Orchid/Okano/Wearnes support'
-  (if you want aztcd to be part of the kernel). Do not configure it for
-  'Aztech... support', if you want to use aztcd as a run time loadable module.
-  But in any case you must have the ISO9660-filesystem included in your
-  kernel.
-- Activate the new kernel, normally this is done by running LILO (don't for-
-  get to configure it before and to keep a copy of your old kernel in case
-  something goes wrong!).
-- Reboot
-- If you've included aztcd in your kernel, you now should see during boot
-  some messages like
-    Aztech CD-ROM Init: DriverVersion=<version number> BaseAddress=<baseaddress>
-    Aztech CD-ROM Init: FirmwareVersion=<firmware version id of your I/O-card>
-    Aztech CD-ROM Init: <drive type> detected
-    Aztech CD-ROM Init: End
-- If you have not included aztcd in your kernel, but want to load aztcd as a
-  run time loadable module see 4.1.
-- If the message looks correct, as user 'root' you should be able to mount
-  the drive by
-    mount -t iso9660 -r /dev/aztcd0 /mnt
-  and use it as any other filesystem. (If this does not work, check if
-  /dev/aztcd0 and /mnt do exist and create them, if necessary by doing
-    mknod /dev/aztcd0 b 29 0
-    mkdir /mnt
-- If this still does not help, see chapters OTHER DRIVES and DEBUGGING.
-
-4.1 AZTCD AS A RUN-TIME LOADABLE MODULE
-If you do not need aztcd permanently, you can also load and remove the driver
-during runtime via insmod and rmmod. To build aztcd as a loadable module you
-must configure your kernel for AZTECH module support (answer 'm' when con-
-figuring the kernel). Anyhow, you may run into problems, if the version of
-your boot kernel is not the same than the source kernel version, from which
-you create the modules. So rebuild your kernel, if necessary.
-
-Now edit the base address of your AZTECH interface card in
-/usr/src/linux/drivers/cdrom/aztcd.h to the appropriate value.
-aztcd may be configured to use autoprobing the base address by setting
-AZT_BASE_ADDR to '-1'. In that case aztcd probes the addresses listed
-under AZT_BASE_AUTO. But please remember, that autoprobing always may
-incorrectly influence other hardware components too!
-There are also some special features which may be configured, e.g.
-auto-eject a CD when unmounting the drive etc; see aztcd.h for details.
-Then change to /usr/src/linux and do a
-    make modules
-    make modules_install
-After that you can run-time load the driver via
-    insmod /lib/modules/X.X.X/misc/aztcd.o
-and remove it via rmmod aztcd.
-If you did not set the correct base address in aztcd.h, you can also supply the
-base address when loading the driver via
-    insmod /lib/modules/X.X.X/misc/aztcd.o aztcd=<base address>
-Again specifying aztcd=-1 will cause autoprobing.
-If you do not have the iso9660-filesystem in your boot kernel, you also have
-to load it before you can mount the CDROM:
-    insmod /lib/modules/X.X.X/fs/isofs.o
-The mount procedure works as described in 4. above.
-(In all commands 'X.X.X' is the current linux kernel version number)
-
-4.2 CDROM CONNECTED TO A SOUNDCARD
-Most soundcards do have a bus interface to the CDROM-drive. In many cases
-this soundcard needs to be configured, before the CDROM can be used. This
-configuration procedure consists of writing some kind of initialization
-data to the soundcard registers. The AZTECH-CDROM driver in the moment does
-only support one type of soundcard (SoundWave32). Users of other soundcards
-should try to boot DOS first and let their DOS drivers initialize the
-soundcard and CDROM, then warm boot (or use loadlin) their PC to start
-Linux.
-Support for the CDROM-interface of SoundWave32-soundcards is directly
-implemented in the AZTECH driver. Please edit linux/drivers/cdrom/aztdc.h,
-uncomment line '#define AZT_SW32' and set the appropriate value for
-AZT_BASE_ADDR and AZT_SW32_BASE_ADDR. This support was tested with an Orchid
-CDS-3110 connected to a SoundWave32.
-If you want your soundcard to be supported, find out, how it needs to be
-configured and mail me (see 6.) the appropriate information.
-
-5. KNOWN PROBLEMS, FUTURE DEVELOPMENTS
-5.1 MULTISESSION SUPPORT
-Multisession support for CD's still is a myth. I implemented and tested a basic
-support for multisession and XA CDs, but I still have not enough CDs and appli-
-cations to test it rigorously. So if you'd like to help me, please contact me
-(Email address see below). As of version 1.4 and newer you can enable the
-multisession support in aztcd.h by setting AZT_MULTISESSION to 1. Doing so
-will cause the ISO9660-filesystem to deal with multisession CDs, ie. redirect
-requests to the Table of Contents (TOC) information from the last session,
-which contains the info of all previous sessions etc.. If you do set
-AZT_MULTISESSION to 0, you can use multisession CDs anyway. In that case the
-drive's firmware will do automatic redirection. For the ISO9660-filesystem any
-multisession CD will then look like a 'normal' single session CD. But never-
-theless the data of all sessions are viewable and accessible. So with practical-
-ly all real world applications you won't notice the difference. But as future
-applications may make use of advanced multisession features, I've started to
-implement the interface for the ISO9660 multisession interface via ioctl
-CDROMMULTISESSION.
-
-5.2 STATUS RECOGNITION
-The drive status recognition does not work correctly in all cases. Changing
-a disk or having the door open, when a drive is already mounted, is detected
-by the Aztech driver itself, but nevertheless causes multiple read attempts
-by the different layers of the ISO9660-filesystem driver, which finally timeout,
-so you have to wait quite a little... But isn't it bad style to change a disk
-in a mounted drive, anyhow ?!
-
-The driver uses busy wait in most cases for the drive handshake (macros
-STEN_LOW and DTEN_LOW). I tested with a 486/DX2 at 66MHz and a Pentium at
-60MHz and 90MHz. Whenever you use a much faster machine you are likely to get
-timeout messages. In that case edit aztcd.h and increase the timeout value
-AZT_TIMEOUT.
-
-For some 'slow' drive commands I implemented waiting with a timer waitqueue
-(macro STEN_LOW_WAIT). If you get this timeout message, you may also edit
-aztcd.h and increase the timeout value AZT_STATUS_DELAY. The waitqueue has
-shown to be a little critical. If you get kernel panic messages, edit aztcd.c
-and substitute STEN_LOW_WAIT by STEN_LOW. Busy waiting with STEN_LOW is more
-stable, but also causes CPU overhead.
-
-5.3 DOSEMU's CD-ROM SUPPORT
-With release 1.20 aztcd was modified to allow access to CD-ROMS when running
-under dosemu-0.60.0 aztcd-versions before 1.20 are most likely to crash
-Linux, when a CD-ROM is accessed under dosemu. This problem has partly been
-fixed, but still when accessing a directory for the first time the system
-might hang for some 30sec. So be patient, when using dosemu's CD-ROM support
-in combination with aztcd :-) !
-This problem has now (July 1995) been fixed by a modification to dosemu's
-CD-ROM driver. The new version came with dosemu-0.60.2, see dosemu's
-README.CDROM.
-
-6. BUG REPORTS
-Please send detailed bug reports and bug fixes via EMail to
-
-        Werner.Zimmermann@fht-esslingen.de
-
-Please include a description of your CD-ROM drive type and interface card,
-the exact firmware message during Linux bootup, the version number of the
-AZTECH-CDROM-driver and the Linux kernel version. Also a description of your
-system's other hardware could be of interest, especially microprocessor type,
-clock frequency, other interface cards such as soundcards, ethernet adapter,
-game cards etc..
-
-I will try to collect the reports and make the necessary modifications from
-time to time. I may also come back to you directly with some bug fixes and
-ask you to do further testing and debugging.
-
-Editors of CD-ROMs are invited to send a 'cooperation' copy of their
-CD-ROMs to the volunteers, who provided the CD-ROM support for Linux. My
-snail mail address for such 'stuff' is
-        Prof. Dr. W. Zimmermann
-        Fachhochschule fuer Technik Esslingen
-        Fachbereich IT
-        Flandernstrasse 101
-        D-73732 Esslingen
-        Germany
-
-
-7. OTHER DRIVES
-The following drives ORCHID CDS3110, OKANO CDD110, WEARNES CDD110 and Conrad
-TXC Nr. 993123-series 04 nearly look the same as AZTECH CDA268-01A, especially
-they seem to use the same command codes. So it was quite simple to make the
-AZTECH driver work with these drives.
-
-Unfortunately I do not have any of these drives available, so I couldn't test
-it myself. In some installations, it seems necessary to initialize the drive
-with the DOS driver before (especially if combined with a sound card) and then
-do a warm boot (CTRL-ALT-RESET) or start Linux from DOS, e.g. with 'loadlin'.
-
-If you do not succeed, read chapter DEBUGGING. Thanks in advance!
-
-Sorry for the inconvenience, but it is difficult to develop for hardware,
-which you don't have available for testing. So if you like, please help us.
-
-If you do have a CyCDROM CR520ie thanks to Hilmar Berger's help your chances
-are good, that it will work with aztcd. The CR520ie is sold as an IDE-drive
-and really is connected to the IDE interface (primary at 0x1F0 or secondary
-at 0x170, configured as slave, not as master). Nevertheless it is not ATAPI
-compatible but still uses Aztech's command codes.
-
-
-8. DEBUGGING : IF YOU DON'T SUCCEED, TRY THE FOLLOWING
--reread the complete README file
--make sure, that your drive is hardware configured for
-    transfer mode: polled
-    IRQ: not used
-    DMA: not used
-    Base Address: something like 300, 320 ...
- You can check this, when you start the DOS driver, which came with your
- drive. By appropriately configuring the drive and the DOS driver you can
- check, whether your drive does operate in this mode correctly under DOS. If
- it does not operate under DOS, it won't under Linux.
- If your drive's base address is something like 0x170 or 0x1F0 (and it is
- not a CyCDROM CR520ie or CR 940ie) you most likely are having an IDE/ATAPI-
- compatible drive, which is not supported by aztcd.c, use ide-cd.c instead.
- Make sure the Base Address is configured correctly in aztcd.h, also make
- sure, that /dev/aztcd0 exists with the correct major number (compare it with
- the entry in file /usr/include/linux/major.h for the Aztech drive).
--insert a CD-ROM and close the tray
--cold boot your PC (i.e. via the power on switch or the reset button)
--if you start Linux via DOS, e.g. using loadlin, make sure, that the DOS
- driver for the CD-ROM drive is not loaded (comment out the calling lines
- in DOS' config.sys!)
--look for the aztcd: init message during Linux init and note them exactly
--log in as root and do a mount -t iso9660 /dev/aztcd0 /mnt
--if you don't succeed in the first time, try several times. Try also to open
- and close the tray, then mount again. Please note carefully all commands
- you typed in and the aztcd-messages, which you get.
--if you get an 'Aztech CD-ROM init: aborted' message, read the remarks about
- the version string below.
-
-If this does not help, do the same with the following differences
--start DOS before; make now sure, that the DOS driver for the CD-ROM is
- loaded under DOS (i.e. uncomment it again in config.sys)
--warm boot your PC (i.e. via CTRL-ALT-DEL)
- if you have it, you can also start via loadlin (try both).
- ...
- Again note all commands and the aztcd-messages.
-
-If you see STEN_LOW or STEN_LOW_WAIT error messages, increase the timeout
-values.
-
-If this still does not help,
--look in aztcd.c for the lines #if 0
-                                #define AZT_TEST1
-                                ...
-                                #endif
- and substitute '#if 0' by '#if 1'.
--recompile your kernel and repeat the above two procedures. You will now get
- a bundle of debugging messages from the driver. Again note your commands
- and the appropriate messages. If you have syslogd running, these messages
- may also be found in syslogd's kernel log file. Nevertheless in some
- installations syslogd does not yet run, when init() is called, thus look for
- the aztcd-messages during init, before the login-prompt appears.
- Then look in aztcd.c, to find out, what happened. The normal calling sequence
- is: aztcd_init() during Linux bootup procedure init()
- after doing a 'mount -t iso9660 /dev/aztcd0 /mnt' the normal calling sequence is
-     aztcd_open() -> Status 2c after cold reboot with CDROM or audio CD inserted
-                  -> Status 8 after warm reboot with CDROM inserted
-                  -> Status 2e after cold reboot with no disk, closed tray
-                  -> Status 6e after cold reboot, mount with door open
-     aztUpdateToc()
-     aztGetDiskInfo()
-     aztGetQChannelInfo() repeated several times
-     aztGetToc()
-     aztGetQChannelInfo() repeated several times
-     a list of track information
-     do_aztcd_request() }
-     azt_transfer()     } repeated several times
-     azt_poll           }
- Check, if there is a difference in the calling sequence or the status flags!
-
- There are a lot of other messages, eg. the ACMD-command code (defined in
- aztcd.h), status info from the getAztStatus-command and the state sequence of
- the finite state machine in azt_poll(). The most important are the status
- messages, look how they are defined and try to understand, if they make
- sense in the context where they appear. With a CD-ROM inserted the status
- should always be 8, except in aztcd_open(). Try to open the tray, insert an
- audio disk, insert no disk or reinsert the CD-ROM and check, if the status
- bits change accordingly. The status bits are the most likely point, where
- the drive manufacturers may implement changes.
-
-If you still don't succeed, a good point to start is to look in aztcd.c in
-function aztcd_init, where the drive should be detected during init. Do the
-following:
--reboot the system with boot parameter 'aztcd=<your base address>,0x79'. With
- parameter 0x79 most of the drive version detection is bypassed. After that
- you should see the complete version string including leading and trailing
- blanks during init.
- Now adapt the statement
-      if ((result[1]=='A')&&(result[2]=='Z' ...)
- in aztcd_init() to exactly match the first 3 or 4 letters you have seen.
--Another point is the 'smart' card detection feature in aztcd_init(). Normally
- the CD-ROM drive is ready, when aztcd_init is trying to read the version
- string and a time consuming ACMD_SOFT_RESET command can be avoided. This is
- detected by looking, if AFL_OP_OK can be read correctly. If the CD-ROM drive
- hangs in some unknown state, e.g. because of an error before a warm start or
- because you first operated under DOS, even the version string may be correct,
- but the following commands will not. Then change the code in such a way,
- that the ACMD_SOFT_RESET is issued in any case, by substituting the
- if-statement 'if ( ...=AFL_OP_OK)' by 'if (1)'.
-
-If you succeed, please mail me the exact version string of your drive and
-the code modifications, you have made together with a short explanation.
-If you don't succeed, you may mail me the output of the debugging messages.
-But remember, they are only useful, if they are exact and complete and you
-describe in detail your hardware setup and what you did (cold/warm reboot,
-with/without DOS, DOS-driver started/not started, which Linux-commands etc.)
-
-
-9. TECHNICAL HISTORY OF THE DRIVER
-The AZTECH-Driver is a rework of the Mitsumi-Driver. Four major items had to
-be reworked:
-
-a) The Mitsumi drive does issue complete status information acknowledging
-each command, the Aztech drive does only signal that the command was
-processed. So whenever the complete status information is needed, an extra
-ACMD_GET_STATUS command is issued. The handshake procedure for the drive
-can be found in the functions aztSendCmd(), sendAztCmd() and getAztStatus().
- -b) The Aztech Drive does not have an ACMD_GET_DISK_INFO command, so the -necessary info about the number of tracks (firstTrack, lastTrack), disk -length etc. has to be read from the TOC in the lead in track (see function -aztGetDiskInfo()). - -c) Whenever data is read from the drive, the Mitsumi drive is started with a -command to read an indefinite (0xffffff) number of sectors. When the appropriate -number of sectors is read, the drive is stopped by an ACMD_STOP command. This -does not work with the Aztech drive. I did not find a way to stop it. The -stop and pause commands do only work in AUDIO mode but not in DATA mode. -Therefore I had to modify the 'finite state machine' in function azt_poll to -only read a certain number of sectors and then start a new read on demand. As I -have not completely understood how the buffer/caching scheme of the Mitsumi -driver was implemented, I am not sure if I have covered all cases correctly. -Whenever you get timeout messages, the bug is most likely to be in that -function azt_poll() around switch(cmd) .... case AZT_S_DATA. - -d) I did not get information about changing drive mode. So I doubt that the -code around function azt_poll() case AZT_S_MODE does work. In my test I have -not been able to switch to reading in raw mode. For reading raw mode, Aztech -uses a different command than for cooked mode, which I only have -implemented in the ioctl-section but not in the section which is used by the ISO9660. - -The driver was developed on an AST PC with Intel 486/DX2, 8MB RAM, 340MB IDE -hard disk and on an AST PC with Intel Pentium 60MHz, 16MB RAM, 520MB IDE -running Linux kernel version 1.0.9 from the LST 1.8 Distribution. The kernel -was compiled with gcc.2.5.8. My CD-ROM drive is an Aztech CDA268-01A. My -drive says that it has Firmware Version AZT26801A1.3. It came with an ISA-bus -interface card and works with polled I/O without DMA and without interrupts. 
-The code for all other drives was 'remote' tested and debugged by a number of -volunteers on the Internet. - -Points, where I feel that possible problems might be and all points where I -did not completely understand the drive's behaviour or trust my own code are -marked with /*???*/ in the source code. There are also some parts in the -Mitsumi driver, where I did not completely understand their code. - - -10. ACKNOWLEDGMENTS -Without the help of P.Bush, Aztech, who delivered technical information -about the Aztech Drive and without the help of E.Moenkeberg, GWDG, who did a -great job in analyzing the command structure of various CD-ROM drives, this -work would not have been possible. E.Moenkeberg was also a great help in -making the software 'kernel ready' and in answering many of the CDROM-related -questions in the newsgroups. He really is *the* Linux CD-ROM guru. Thanks -also to all the guys on the Internet, who collected valuable technical -information about CDROMs. - -Joe Nardone (joe@access.digex.net) was a patient tester even for my first -trial, which was more than slow, and made suggestions for code improvement. -Especially the 'finite state machine' azt_poll() was rewritten by Joe to get -clean C code and avoid the ugly 'gotos', which I copied from mcd.c. - -Robby Schirmer (schirmer@fmi.uni-passau.de) tested the audio stuff (ioctls) -and suggested a lot of patches for them. - -Joseph Piskor and Peter Nugent were the first users with the ORCHID CD3110 -and also were very patient with the problems which occurred. - -Reinhard Max delivered the information for the CDROM-interface of the -SoundWave32 soundcards. - -Jochen Kunz and Olaf Kaluza delivered the information for supporting Conrad's -TXC drive. - -Hilmar Berger delivered the patches for supporting CyCDROM CR520ie. - -Anybody, who is interested in these items should have a look at 'ftp.gwdg.de', -directory 'pub/linux/cdrom' and at 'ftp.cdrom.com', directory 'pub/cdrom'. - -11. 
PROGRAMMING ADD ONs: cdplay.c -You can use the ioctl-functions included in aztcd.c in your own programs. As -an example on how to do this, you will find a tiny CD Player for audio CDs -named 'cdplay.c'. It allows you to play audio CDs. You can play a specified -track, pause and resume or skip tracks forward and backwards. If you quit the -program without stopping the drive, playing is continued. You can also -(mis)use cdplay to read and hexdump data disks. You can find the code in the -APPENDIX of this file, which you should cut out with an editor and store in a -separate file 'cdplay.c'. To compile it and make it executable, do - gcc -s -Wall -O2 -L/usr/lib cdplay.c -o /usr/local/bin/cdplay # compiles it - chmod +755 /usr/local/bin/cdplay # makes it executable - ln -s /dev/aztcd0 /dev/cdrom # creates a link - (for /usr/lib substitute the top level directory, where your include files - reside, and for /usr/local/bin the directory, where you want the executable - binary to reside ) - -You have to set the correct permissions for cdplay *and* for /dev/mcd0 or -/dev/aztcd0 in order to use it. Remember, that you should not have /dev/cdrom -mounted, when you're playing audio CDs. - -This program is just a hack for testing the ioctl-functions in aztcd.c. I will -not maintain it, so if you run into problems, discard it or have a look into -the source code 'cdplay.c'. The program does only contain a minimum of user -protection and input error detection. If you use the commands in the wrong -order or if you try to read a CD at wrong addresses, you may get error messages -or even hang your machine. If you get STEN_LOW, STEN_LOW_WAIT or segment violation -error messages when using cdplay, after that, the system might not be stable -any more, so you'd better reboot. As the ioctl-functions run in kernel mode, -most normal Linux-multitasking protection features do not work. 
By using -uninitialized 'wild' pointers etc., it is easy to write to other users' data -and program areas, destroy kernel tables etc.. So if you experiment with ioctls -as always when you are doing systems programming and kernel hacking, you -should have a backup copy of your system in a safe place (and you also -should try restoring from a backup copy first)! - -A reworked and improved version called 'cdtester.c', which has yet more -features for testing CDROM-drives can be found in -Documentation/cdrom/sbpcd, written by E.Moenkeberg. - -Werner Zimmermann -Fachhochschule fuer Technik Esslingen -(EMail: Werner.Zimmermann@fht-esslingen.de) -October, 1997 - ---------------------------------------------------------------------------- -APPENDIX: Source code of cdplay.c - -/* Tiny Audio CD Player - - Copyright 1994, 1995, 1996 Werner Zimmermann (Werner.Zimmermann@fht-esslingen.de) - -This program originally was written to test the audio functions of the -AZTECH.CDROM-driver, but it should work with every CD-ROM drive. Before -using it, you should set a symlink from /dev/cdrom to your real CDROM -device. - -The GNU General Public License applies to this program. - -History: V0.1 W.Zimmermann: First release. Nov. 8, 1994 - V0.2 W.Zimmermann: Enhanced functionality. Nov. 9, 1994 - V0.3 W.Zimmermann: Additional functions. Nov. 28, 1994 - V0.4 W.Zimmermann: fixed some bugs. Dec. 17, 1994 - V0.5 W.Zimmermann: clean 'scanf' commands without compiler warnings - Jan. 6, 1995 - V0.6 W.Zimmermann: volume control (still experimental). Jan. 24, 1995 - V0.7 W.Zimmermann: read raw modified. 
July 26, 95 -*/ - -#include <stdio.h> -#include <stdlib.h> -#include <ctype.h> -#include <sys/ioctl.h> -#include <sys/types.h> -#include <fcntl.h> -#include <unistd.h> -#include <linux/cdrom.h> -#include <linux/../../drivers/cdrom/aztcd.h> - -void help(void) -{ printf("Available Commands: STOP s EJECT/CLOSE e QUIT q\n"); - printf(" PLAY TRACK t PAUSE p RESUME r\n"); - printf(" NEXT TRACK n REPEAT LAST l HELP h\n"); - printf(" SUB CHANNEL c TRACK INFO i PLAY AT a\n"); - printf(" READ d READ RAW w VOLUME v\n"); -} - -int main(void) -{ int handle; - unsigned char command=' ', ini=0, first=1, last=1; - unsigned int cmd, i,j,k, arg1,arg2,arg3; - struct cdrom_ti ti; - struct cdrom_tochdr tocHdr; - struct cdrom_subchnl subchnl; - struct cdrom_tocentry entry; - struct cdrom_msf msf; - union { struct cdrom_msf msf; - unsigned char buf[CD_FRAMESIZE_RAW]; - } azt; - struct cdrom_volctrl volctrl; - - printf("\nMini-Audio CD-Player V0.72 (C) 1994,1995,1996 W.Zimmermann\n"); - handle=open("/dev/cdrom",O_RDWR); - - if (handle<0) /* open() returns -1 on failure; 0 is a valid descriptor */ - { printf("Drive Error: already playing, no audio disk, door open\n"); - printf(" or no permission (you must be ROOT in order to use this program)\n"); - } - else - { ioctl(handle,CDROMRESUME); /* only touch the drive once it is open */ - help(); - while (1) - { printf("Type command (h = help): "); - scanf(" %c",&command); /* read one character; "%s" would overflow 'command' */ - switch (command) - { case 'e': cmd=CDROMEJECT; - ioctl(handle,cmd); - break; - case 'p': if (!ini) - { printf("Command not allowed - play track first\n"); - } - else - { cmd=CDROMPAUSE; - if (ioctl(handle,cmd)) printf("Drive Error\n"); - } - break; - case 'r': if (!ini) - { printf("Command not allowed - play track first\n"); - } - else - { cmd=CDROMRESUME; - if (ioctl(handle,cmd)) printf("Drive Error\n"); - } - break; - case 's': cmd=CDROMPAUSE; - if (ioctl(handle,cmd)) printf("Drive error or already stopped\n"); - cmd=CDROMSTOP; - if (ioctl(handle,cmd)) printf("Drive error\n"); - break; - case 't': cmd=CDROMREADTOCHDR; - if (ioctl(handle,cmd,&tocHdr)) printf("Drive Error\n"); - 
first=tocHdr.cdth_trk0; - last= tocHdr.cdth_trk1; - if ((first==0)||(first>last)) - { printf ("--could not read TOC\n"); - } - else - { printf("--first track: %d --last track: %d --enter track number: ",first,last); - cmd=CDROMPLAYTRKIND; - scanf("%i",&arg1); - ti.cdti_trk0=arg1; - if (ti.cdti_trk0<first) ti.cdti_trk0=first; - if (ti.cdti_trk0>last) ti.cdti_trk0=last; - ti.cdti_ind0=0; - ti.cdti_trk1=last; - ti.cdti_ind1=0; - if (ioctl(handle,cmd,&ti)) printf("Drive Error\n"); - ini=1; - } - break; - case 'n': if (!ini++) - { if (ioctl(handle,CDROMREADTOCHDR,&tocHdr)) printf("Drive Error\n"); - first=tocHdr.cdth_trk0; - last= tocHdr.cdth_trk1; - ti.cdti_trk0=first-1; - } - if ((first==0)||(first>last)) - { printf ("--could not read TOC\n"); - } - else - { cmd=CDROMPLAYTRKIND; - if (++ti.cdti_trk0 > last) ti.cdti_trk0=last; - ti.cdti_ind0=0; - ti.cdti_trk1=last; - ti.cdti_ind1=0; - if (ioctl(handle,cmd,&ti)) printf("Drive Error\n"); - ini=1; - } - break; - case 'l': if (!ini++) - { if (ioctl(handle,CDROMREADTOCHDR,&tocHdr)) printf("Drive Error\n"); - first=tocHdr.cdth_trk0; - last= tocHdr.cdth_trk1; - ti.cdti_trk0=first+1; - } - if ((first==0)||(first>last)) - { printf ("--could not read TOC\n"); - } - else - { cmd=CDROMPLAYTRKIND; - if (--ti.cdti_trk0 < first) ti.cdti_trk0=first; - ti.cdti_ind0=0; - ti.cdti_trk1=last; - ti.cdti_ind1=0; - if (ioctl(handle,cmd,&ti)) printf("Drive Error\n"); - ini=1; - } - break; - case 'c': subchnl.cdsc_format=CDROM_MSF; - if (ioctl(handle,CDROMSUBCHNL,&subchnl)) - printf("Drive Error\n"); - else - { printf("AudioStatus:%s Track:%d Mode:%d MSF=%d:%d:%d\n", \ - subchnl.cdsc_audiostatus==CDROM_AUDIO_PLAY ? 
"PLAYING":"NOT PLAYING",\ - subchnl.cdsc_trk,subchnl.cdsc_adr, \ - subchnl.cdsc_absaddr.msf.minute, subchnl.cdsc_absaddr.msf.second, \ - subchnl.cdsc_absaddr.msf.frame); - } - break; - case 'i': if (!ini) - { printf("Command not allowed - play track first\n"); - } - else - { cmd=CDROMREADTOCENTRY; - printf("Track No.: "); - scanf("%d",&arg1); - entry.cdte_track=arg1; - if (entry.cdte_track<first) entry.cdte_track=first; - if (entry.cdte_track>last) entry.cdte_track=last; - entry.cdte_format=CDROM_MSF; - if (ioctl(handle,cmd,&entry)) - { printf("Drive error or invalid track no.\n"); - } - else - { printf("Mode %d Track, starts at %d:%d:%d\n", \ - entry.cdte_adr,entry.cdte_addr.msf.minute, \ - entry.cdte_addr.msf.second,entry.cdte_addr.msf.frame); - } - } - break; - case 'a': cmd=CDROMPLAYMSF; - printf("Address (min:sec:frame) "); - scanf("%d:%d:%d",&arg1,&arg2,&arg3); - msf.cdmsf_min0 =arg1; - msf.cdmsf_sec0 =arg2; - msf.cdmsf_frame0=arg3; - if (msf.cdmsf_sec0 > 59) msf.cdmsf_sec0 =59; - if (msf.cdmsf_frame0> 74) msf.cdmsf_frame0=74; - msf.cdmsf_min1=60; - msf.cdmsf_sec1=00; - msf.cdmsf_frame1=00; - if (ioctl(handle,cmd,&msf)) - { printf("Drive error or invalid address\n"); - } - break; -#ifdef AZT_PRIVATE_IOCTLS /*not supported by every CDROM driver*/ - case 'd': cmd=CDROMREADCOOKED; - printf("Address (min:sec:frame) "); - scanf("%d:%d:%d",&arg1,&arg2,&arg3); - azt.msf.cdmsf_min0 =arg1; - azt.msf.cdmsf_sec0 =arg2; - azt.msf.cdmsf_frame0=arg3; - if (azt.msf.cdmsf_sec0 > 59) azt.msf.cdmsf_sec0 =59; - if (azt.msf.cdmsf_frame0> 74) azt.msf.cdmsf_frame0=74; - if (ioctl(handle,cmd,&azt.msf)) - { printf("Drive error, invalid address or unsupported command\n"); - } - k=0; - getchar(); - for (i=0;i<128;i++) - { printf("%4d:",i*16); - for (j=0;j<16;j++) - { printf("%2x ",azt.buf[i*16+j]); - } - for (j=0;j<16;j++) - { if (isalnum(azt.buf[i*16+j])) - printf("%c",azt.buf[i*16+j]); - else - printf("."); - } - printf("\n"); - k++; - if (k>=20) - { printf("press ENTER to 
continue\n"); - getchar(); - k=0; - } - } - break; - case 'w': cmd=CDROMREADRAW; - printf("Address (min:sec:frame) "); - scanf("%d:%d:%d",&arg1,&arg2,&arg3); - azt.msf.cdmsf_min0 =arg1; - azt.msf.cdmsf_sec0 =arg2; - azt.msf.cdmsf_frame0=arg3; - if (azt.msf.cdmsf_sec0 > 59) azt.msf.cdmsf_sec0 =59; - if (azt.msf.cdmsf_frame0> 74) azt.msf.cdmsf_frame0=74; - if (ioctl(handle,cmd,&azt)) - { printf("Drive error, invalid address or unsupported command\n"); - } - k=0; - for (i=0;i<147;i++) - { printf("%4d:",i*16); - for (j=0;j<16;j++) - { printf("%2x ",azt.buf[i*16+j]); - } - for (j=0;j<16;j++) - { if (isalnum(azt.buf[i*16+j])) - printf("%c",azt.buf[i*16+j]); - else - printf("."); - } - printf("\n"); - k++; - if (k>=20) - { getchar(); - k=0; - } - } - break; -#endif - case 'v': cmd=CDROMVOLCTRL; - printf("--Channel 0 Left (0-255): "); - scanf("%d",&arg1); - printf("--Channel 1 Right (0-255): "); - scanf("%d",&arg2); - volctrl.channel0=arg1; - volctrl.channel1=arg2; - volctrl.channel2=0; - volctrl.channel3=0; - if (ioctl(handle,cmd,&volctrl)) - { printf("Drive error or unsupported command\n"); - } - break; - case 'q': if (close(handle)) printf("Drive Error: CLOSE\n"); - exit(0); - case 'h': help(); - break; - default: printf("unknown command\n"); - break; - } - } - } - return 0; -} diff --git a/Documentation/cdrom/cdu31a b/Documentation/cdrom/cdu31a deleted file mode 100644 index c0667da09c00..000000000000 --- a/Documentation/cdrom/cdu31a +++ /dev/null @@ -1,196 +0,0 @@ - - CDU31A/CDU33A Driver Info - ------------------------- - -Information on the Sony CDU31A/CDU33A CDROM driver for the Linux -kernel. 
- - Corey Minyard (minyard@metronet.com) - - Colossians 3:17 - -Crude Table of Contents ------------------------ - - Setting Up the Hardware - Configuring the Kernel - Configuring as a Module - Driver Special Features - - -This device driver handles Sony CDU31A/CDU33A CDROM drives and -provides a complete block-level interface as well as an ioctl() -interface (as specified in include/linux/cdrom.h). With this -interface, CDROMs can be accessed, standard audio CDs can be played -back normally, and CD audio information can be read off the drive. - -Note that this will only work for CDU31A/CDU33A drives. Some vendors -market their drives as CDU31A compatible. They lie. Their drives are -really CDU31A hardware interface compatible (they can plug into the -same card). They are not software compatible. - -Setting Up the Hardware ------------------------ - -The CDU31A driver is unable to safely tell if an interface card is -present that it can use because the interface card does not announce -its presence in any way besides placing 4 I/O locations in memory. It -used to just probe memory and attempt commands, but Linus wisely asked -me to remove that because it could really screw up other hardware in -the system. - -Because of this, you must tell the kernel where the drive interface -is, what interrupts are used, and possibly if you are on a PAS-16 -soundcard. - -If you have the Sony CDU31A/CDU33A drive interface card, the following -diagram will help you set it up. If you have another card, you are on -your own. You need to make sure that the I/O address and interrupt are -not used by another card in the system. You will need to know the I/O -address and interrupt you have set. Note that use of interrupts is -highly recommended if possible; it really cuts down on CPU use. -Unfortunately, most soundcards do not support interrupts for their -CDROM interfaces. By default, the Sony interface card comes with -interrupts disabled. 
- - +----------+-----------------+----------------------+ - | JP1 | 34 Pin Conn | | - | JP2 +-----------------+ | - | JP3 | - | JP4 | - | +--+ - | | +-+ - | | | | External - | | | | Connector - | | | | - | | +-+ - | +--+ - | | - | +--------+ - | | - +------------------------------------------+ - - JP1 sets the Base Address, using the following settings: - - Address Pin 1 Pin 2 - ------- ----- ----- - 0x320 Short Short - 0x330 Short Open - 0x340 Open Short - 0x360 Open Open - - JP2 and JP3 configure the DMA channel; they must be set the same. - - DMA Pin 1 Pin 2 Pin 3 - --- ----- ----- ----- - 1 On Off On - 2 Off On Off - 3 Off Off On - - JP4 Configures the IRQ: - - IRQ Pin 1 Pin 2 Pin 3 Pin 4 - --- ----- ----- ----- ----- - 3 Off Off On Off - 4 Off Off* Off On - 5 On Off Off Off - 6 Off On Off Off - - The documentation states to set this for interrupt - 4, but I think that is a mistake. - -Note that if you have another interface card, you will need to look at -the documentation to find the I/O base address. This is specified to -the SLCD.SYS driver for DOS with the /B: parameter, so you can look at -you DOS driver setup to find the address, if necessary. - -Configuring the Kernel ----------------------- - -You must tell the kernel where the drive is at boot time. This can be -done at the Linux boot prompt, by using LILO, or by using Bootlin. -Note that this is no substitute for HOWTOs and LILO documentation, if -you are confused please read those for info on bootline configuration -and LILO. - -At the linux boot prompt, press the ALT key and add the following line -after the boot name (you can let the kernel boot, it will tell you the -default boot name while booting): - - cdu31a=<base address>,<interrupt>[,PAS] - -The base address needs to have "0x" in front of it, since it is in -hex. 
For instance, to configure a drive at address 320 on interrupt 5, -use the following: - - cdu31a=0x320,5 - -I use the following boot line: - - cdu31a=0x1f88,0,PAS - -because I have a PAS-16 which does not support interrupt for the -CDU31A interface. - -Adding this as an append line at the beginning of the /etc/lilo.conf -file will set it for lilo configurations. I have the following as the -first line in my lilo.conf file: - - append="cdu31a=0x1f88,0" - -I'm not sure how to set up Bootlin (I have never used it), if someone -would like to fill in this section please do. - - -Configuring as a Module ------------------------ - -The driver supports loading as a module. However, you must specify -the boot address and interrupt on the boot line to insmod. You can't -use modprobe to load it, since modprobe doesn't support setting -variables. - -Anyway, I use the following line to load my driver as a module - - /sbin/insmod /lib/modules/`uname -r`/misc/cdu31a.o cdu31a_port=0x1f88 - -You can set the following variables in the driver: - - cdu31a_port=<I/O address> - sets the base I/O. If hex, put 0x in - front of it. This must be specified. - - cdu31a_irq=<interrupt> - Sets the interrupt number. Leaving this - off will turn interrupts off. - - -Driver Special Features ------------------------ - -This section describes features beyond the normal audio and CD-ROM -functions of the drive. - -2048 byte buffer mode - -If a disk is mounted with -o block=2048, data is copied straight from -the drive data port to the buffer. Otherwise, the readahead buffer -must be involved to hold the other 1K of data when a 1K block -operation is done. Note that with 2048 byte blocks you cannot execute -files from the CD. - -XA compatibility - -The driver should support XA disks for both the CDU31A and CDU33A. It -does this transparently, the using program doesn't need to set it. - -Multi-Session - -A multi-session disk looks just like a normal disk to the user. 
Just -mount one normally, and all the data should be there. A special -thanks to Koen for help with this! - -Raw sector I/O - -Using the CDROMREADAUDIO it is possible to read raw audio and data -tracks. Both operations return 2352 bytes per sector. On the data -tracks, the first 12 bytes is not returned by the drive and the value -of that data is indeterminate. diff --git a/Documentation/cdrom/cm206 b/Documentation/cdrom/cm206 deleted file mode 100644 index 810368f4f7c4..000000000000 --- a/Documentation/cdrom/cm206 +++ /dev/null @@ -1,185 +0,0 @@ -This is the readme file for the driver for the Philips/LMS cdrom drive -cm206 in combination with the cm260 host adapter card. - - (c) 1995 David A. van Leeuwen - -Changes since version 0.99 --------------------------- -- Interfacing to the kernel is routed though an extra interface layer, - cdrom.c. This allows runtime-configurable `behavior' of the cdrom-drive, - independent of the driver. - -Features since version 0.33 ---------------------------- -- Full audio support, that is, both workman, workbone and cdp work - now reasonably. Reading TOC still takes some time. xmcd has been - reported to run successfully. -- Made auto-probe code a little better, I hope - -Features since version 0.28 ---------------------------- -- Full speed transfer rate (300 kB/s). -- Minimum kernel memory usage for buffering (less than 3 kB). -- Multisession support. -- Tray locking. -- Statistics of driver accessible to the user. -- Module support. -- Auto-probing of adapter card's base port and irq line, - also configurable at boot time or module load time. - - -Decide how you are going to use the driver. 
There are two -options: - - (a) installing the driver as a resident part of the kernel - (b) compiling the driver as a loadable module - - Further, you must decide if you are going to specify the base port - address and the interrupt request line of the adapter card cm260 as - boot options for (a), module parameters for (b), use automatic - probing of these values, or hard-wire your adaptor card's settings - into the source code. If you don't care, you can choose - autoprobing, which is the default. In that case you can move on to - the next step. - -Compiling the kernel --------------------- -1) move to /usr/src/linux and do a - - make config - - If you have chosen option (a), answer yes to CONFIG_CM206 and - CONFIG_ISO9660_FS. - - If you have chosen option (b), answer yes to CONFIG_MODVERSIONS - and no (!) to CONFIG_CM206 and CONFIG_ISO9660_FS. - -2) then do a - - make clean; make zImage; make modules - -3) do the usual things to install a new image (backup the old one, run - `rdev -R zImage 1', copy the new image in place, run lilo). Might - be `make zlilo'. - -Using the driver as a module ----------------------------- -If you will only occasionally use the cd-rom driver, you can choose -option (b), install as a loadable module. You may have to re-compile -the module when you upgrade the kernel to a new version. - -Since version 0.96, much of the functionality has been transferred to -a generic cdrom interface in the file cdrom.c. The module cm206.o -depends on cdrom.o. If the latter is not compiled into the kernel, -you must explicitly load it before cm206.o: - - insmod /usr/src/linux/modules/cdrom.o - -To install the module, you use the command, as root - - insmod /usr/src/linux/modules/cm206.o - -You can specify the base address on the command line as well as the irq -line to be used, e.g. 
- - insmod /usr/src/linux/modules/cm206.o cm206=0x300,11 - -The order of base port and irq line doesn't matter; if you specify only -one, the other will have the value of the compiled-in default. You -may also have to install the file-system module `iso9660.o', if you -didn't compile that into the kernel. - - -Using the driver as part of the kernel --------------------------------------- -If you have chosen option (a), you can specify the base-port -address and irq on the lilo boot command line, e.g.: - - LILO: linux cm206=0x340,11 - -This assumes that your linux kernel image keyword is `linux'. -If you specify either IRQ (3--11) or base port (0x300--0x370), -auto probing is turned off for both settings, thus setting the -other value to the compiled-in default. - -Note that you can also put these parameters in the lilo configuration file: - -# linux config -image = /vmlinuz - root = /dev/hda1 - label = Linux - append = "cm206=0x340,11" - read-only - - -If module parameters and LILO config options don't work -------------------------------------------------------- -If autoprobing does not work, you can hard-wire the default values -of the base port address (CM206_BASE) and interrupt request line -(CM206_IRQ) into the file /usr/src/linux/drivers/cdrom/cm206.h. Change -the defines of CM206_IRQ and CM206_BASE. - - -Mounting the cdrom ------------------- -1) Make sure that the right device is installed in /dev. - - mknod /dev/cm206cd b 32 0 - -2) Make sure there is a mount point, e.g., /cdrom - - mkdir /cdrom - -3) mount using a command like this (run as root): - - mount -rt iso9660 /dev/cm206cd /cdrom - -4) For user-mounts, add a line in /etc/fstab - - /dev/cm206cd /cdrom iso9660 ro,noauto,user - - This will allow users to give the commands - - mount /cdrom - umount /cdrom - -If things don't work --------------------- - -- Try to do a `dmesg' to find out if the driver said anything about - what is going wrong during the initialization. 
- -- Try to do a `dd if=/dev/cm206cd | od -tc | less' to read from the - CD. - -- Look in the /proc directory to see if `cm206' shows up under one of - `interrupts', `ioports', `devices' or `modules' (if applicable). - - -DISCLAIMER ----------- -I cannot guarantee that this driver works, or that the hardware will -not be harmed, although I consider it most unlikely. - -I hope that you'll find this driver in some way useful. - - David van Leeuwen - david@tm.tno.nl - -Note for Linux CDROM vendors ------------------------------ -You are encouraged to include this driver on your Linux CDROM. If -you do, you might consider sending me a free copy of that cd-rom. -You can contact me through my e-mail address, david@tm.tno.nl. -If this driver is compiled into a kernel to boot off a cdrom, -you should actually send me a free copy of that cd-rom. - -Copyright ---------- -The copyright of the cm206 driver for Linux is - - (c) 1995 David A. van Leeuwen - -The driver is released under the conditions of the GNU general public -license, which can be found in the file COPYING in the root of this -source tree. diff --git a/Documentation/cdrom/gscd b/Documentation/cdrom/gscd deleted file mode 100644 index d01ca36b5c43..000000000000 --- a/Documentation/cdrom/gscd +++ /dev/null @@ -1,60 +0,0 @@ - Goldstar R420 CD-Rom device driver README - -For all kind of other information about the GoldStar R420 CDROM -and this Linux device driver see the WWW page: - - http://linux.rz.fh-hannover.de/~raupach - - - If you are the editor of a Linux CD, you should - enable gscd.c within your boot floppy kernel. Please, - send me one of your CDs for free. - - -This current driver version 0.4a only supports reading data from the disk. -Currently we have no audio and no multisession or XA support. -The polling interface is used, no DMA. - - -Sometimes the GoldStar R420 is sold in a 'Reveal Multimedia Kit'. This kit's -drive interface is compatible, too. 
- - -Installation ------------- - -Change to '/usr/src/linux/drivers/cdrom' and edit the file 'gscd.h'. Insert -the i/o address of your interface card. - -The default base address is 0x340. This will work for most applications. -Address selection is accomplished by jumpers PN801-1 to PN801-4 on the -GoldStar Interface Card. -Appropriate settings are: 0x300, 0x310, 0x320, 0x330, 0x340, 0x350, 0x360 -0x370, 0x380, 0x390, 0x3A0, 0x3B0, 0x3C0, 0x3D0, 0x3E0, 0x3F0 - -Then go back to '/usr/src/linux/' and 'make config' to build the new -configuration for your kernel. If you want to use the GoldStar driver -like a module, don't select 'GoldStar CDROM support'. By the way, you -have to include the iso9660 filesystem. - -Now start compiling the kernel with 'make zImage'. -If you want to use the driver as a module, you have to do 'make modules' -and 'make modules_install', additionally. -Install your new kernel as usual - maybe you do it with 'make zlilo'. - -Before you can use the driver, you have to - mknod /dev/gscd0 b 16 0 -to create the appropriate device file (you only need to do this once). - -If you use modules, you can try to insert the driver. -Say: 'insmod /usr/src/linux/modules/gscd.o' -or: 'insmod /usr/src/linux/modules/gscd.o gscd=<address>' -The driver should report its results. - -That's it! Mount a disk, i.e. 'mount -rt iso9660 /dev/gscd0 /cdrom' - -Feel free to report errors and suggestions to the following address. -Be sure, I'm very happy to receive your comments! - - Oliver Raupach Hannover, Juni 1995 -(raupach@nwfs1.rz.fh-hannover.de) diff --git a/Documentation/cdrom/isp16 b/Documentation/cdrom/isp16 deleted file mode 100644 index cc86533ac9f3..000000000000 --- a/Documentation/cdrom/isp16 +++ /dev/null @@ -1,100 +0,0 @@ - -- Documentation/cdrom/isp16 - -Docs by Eric van der Maarel <H.T.M.v.d.Maarel@marin.nl> - -This is the README for version 0.6 of the cdrom interface on an -ISP16, MAD16 or Mozart sound card. 
- -The detection and configuration of this interface used to be included -in both the sjcd and optcd cdrom drivers. Drives supported by these -drivers came packed with Media Magic's multi media kit, which also -included the ISP16 card. The reasons (thanks Leo Spiekman) -to move it from these drivers into a separate module and moreover, not to -rely on the MAD16 sound driver, are as follows: --duplication of code in the kernel is a waste of resources and should - be avoided; --however, kernels and notably those included with Linux distributions - (cf Slackware 3.0 included version 0.5 of the isp16 configuration - code included in the drivers) don't always come with sound support - included, especially when they already include a bunch of cdrom drivers. - Hence, the cdrom interface should be configurable _independently_ of - sound support. - -The ISP16, MAD16 and Mozart sound cards have an OPTi 82C928 or an -OPTi 82C929 chip. The interface on these cards should work with -any cdrom attached to the card, which is 'electrically' compatible -with Sanyo/Panasonic, Sony or Mitsumi non-ide drives. However, the -command sets for any proprietary drives may differ -(and hence may not be supported in the kernel) from these four types. -For a fact I know the interface works and the way of configuration -as described in this documentation works in combination with the -sjcd (in Sanyo/Panasonic compatibility mode) cdrom drivers -(probably with the optcd (in Sony compatibility mode) as well). -If you have such an OPTi based sound card and you want to use the -cdrom interface with a cdrom drive supported by any of the other cdrom -drivers, it will probably work. Please let me know any experience you -might have. -I understand that cards based on the OPTi 82C929 chips may be configured -(hardware jumpers that is) as an IDE interface. Initialisation of such a -card in this mode is not supported (yet?). - -The suggestion to configure the ISP16 etc. 
sound card by booting DOS and -do a warm reboot to boot Linux somehow doesn't work, at least not -on my machine (IPC P90), with the OPTi 82C928 based card. - -Booting the kernel through the boot manager LILO allows the use -of some command line options on the 'LILO boot:' prompt. At boot time -press Alt or Shift while the LILO prompt is written on the screen and enter -any kernel options. Alternatively these options may be used in -the appropriate section in /etc/lilo.conf. Adding 'append="<cmd_line_options>"' -will do the trick as well. -The syntax of 'cmd_line_options' is - - isp16=[<port>[,<irq>[,<dma>]]][[,]<drive_type>] - -If there is no ISP16 or compatibles detected, there's probably no harm done. -These options indicate the values that your cdrom drive has been (or will be) -configured to use. -Valid values for the base i/o address are: - port=0x340,0x320,0x330,0x360 -for the interrupt request number - irq=0,3,5,7,9,10,11 -for the direct memory access line - dma=0,3,5,6,7 -and for the type of drive - drive_type=noisp16,Sanyo,Panasonic,Sony,Mitsumi. -Note that these options are case sensitive. -The values 0 for irq and dma indicate that they are not used, and -the drive will be used in 'polling' mode. The values 5 and 7 for irq -should be avoided in order to avoid any conflicts with optional -sound card configuration. -The syntax of the command line does not allow the specification of -irq when there's nothing specified for the base address and no -specification of dma when there is no specification of irq. -The value 'noisp16' for drive_type, which may be used as the first -non-integer option value (e.g. 'isp16=noisp16'), makes sure that probing -for and subsequent configuration of an ISP16-compatible card is skipped -all together. This can be useful to overcome possible conflicts which -may arise while the kernel is probing your hardware. -The default values are - port=0x340 - irq=0 - dma=0 - drive_type=Sanyo -reflecting my own configuration. 
The defaults can be changed in
-the file linux/drivers/cdrom/ips16.h.
-
-The cdrom interface can be configured at run time by loading the
-initialisation driver as a module. In that case, the interface
-parameters can be set by giving appropriate values on the command
-line. Configuring the driver can then be done by the following
-command (assuming you have iso16.o installed in a proper place):
-
-  insmod isp16.o isp16_cdrom_base=<port> isp16_cdrom_irq=<irq> \
-         isp16_cdrom_dma=<dma> isp16_cdrom_type=<drive_type>
-
-where port, irq, dma and drive_type can have any of the values mentioned
-above.
-
-
-Have fun!
diff --git a/Documentation/cdrom/mcdx b/Documentation/cdrom/mcdx
deleted file mode 100644
index 2bac4b7ff6da..000000000000
--- a/Documentation/cdrom/mcdx
+++ /dev/null
@@ -1,29 +0,0 @@
-If you are using the driver as a module, you can specify your ports and IRQs
-like
-
- # insmod mcdx.o mcdx=0x300,11,0x304,5
-
-and so on ("address,IRQ" pairs).
-This will override the configuration in mcdx.h.
-
-This driver:
-
-    o handles XA and (hopefully) multi session CDs as well as
-      ordinary CDs;
-    o supports up to 5 drives (of course, you'll need free
-      IRQs, i/o ports and slots);
-    o plays audio
-
-This version doesn't support yet:
-
-    o shared IRQs (but it seems to be possible - I've successfully
-      connected two drives to the same irq. So it's `only' a
-      problem of the driver.)
-
-This driver never will:
-
-    o Read digital audio (i.e. copy directly), due to missing
-      hardware features.
-
-
-heiko@lotte.sax.de
diff --git a/Documentation/cdrom/optcd b/Documentation/cdrom/optcd
deleted file mode 100644
index 6f46c7adb243..000000000000
--- a/Documentation/cdrom/optcd
+++ /dev/null
@@ -1,57 +0,0 @@
-This is the README file for the Optics Storage 8000 AT CDROM device driver.
-
-This is the driver for the so-called 'DOLPHIN' drive, with the 34-pin
-Sony-compatible interface. For the IDE-compatible Optics Storage 8001
-drive, you will want the ATAPI CDROM driver.
The driver also seems to
-work with the Lasermate CR328A. If you have a drive that works with
-this driver, and that doesn't report itself as DOLPHIN, please drop me
-a mail.
-
-The support for multisession CDs is in ALPHA stage. If you use it,
-please mail me your experiences. Multisession support can be disabled
-at compile time.
-
-You can find some older versions of the driver at
-    dutette.et.tudelft.nl:/pub/linux/
-and at Eberhard's mirror
-    ftp.gwdg.de:/pub/linux/cdrom/drivers/optics/
-
-Before you can use the driver, you have to create the device file once:
- # mknod /dev/optcd0 b 17 0
-
-To specify the base address if the driver is "compiled-in" to your kernel,
-you can use the kernel command line item (LILO option)
-    optcd=0x340
-with the right address.
-
-If you have compiled optcd as a module, you can load it with
- # insmod /usr/src/linux/modules/optcd.o
-or
- # insmod /usr/src/linux/modules/optcd.o optcd=0x340
-with the matching address value of your interface card.
-
-The driver employs a number of buffers to do read-ahead and block size
-conversion. The number of buffers is configurable in optcd.h, and has
-influence on the driver performance. For my machine (a P75), 6 buffers
-seems optimal, as can be seen from this table:
-
-#bufs   kb/s    %cpu
-1       97      0.1
-2       191     0.3
-3       188     0.2
-4       246     0.3
-5       189     19
-6       280     0.4
-7       281     7.0
-8       246     2.8
-16      281     3.4
-
-If you get a throughput significantly below 300 kb/s, try tweaking
-N_BUFS, and don't forget to mail me your results!
-
-I'd appreciate success/failure reports. If you find a bug, try
-recompiling the driver with some strategically chosen debug options
-(these can be found in optcd.h) and include the messages generated in
-your bug report. Good luck.
-
-Leo Spiekman (spiekman@dutette.et.tudelft.nl)
diff --git a/Documentation/cdrom/sbpcd b/Documentation/cdrom/sbpcd
deleted file mode 100644
index b3ba63f4ce3e..000000000000
--- a/Documentation/cdrom/sbpcd
+++ /dev/null
@@ -1,1061 +0,0 @@
-This README belongs to release 4.2 or newer of the SoundBlaster Pro
-(Matsushita, Kotobuki, Panasonic, CreativeLabs, Longshine and Teac)
-CD-ROM driver for Linux.
-
-sbpcd really, really is NOT for ANY IDE/ATAPI drive!
-Not even if you have an "original" SoundBlaster card with an IDE interface!
-So, you'd better have a look into README.ide if your port address is 0x1F0,
-0x170, 0x1E8, 0x168 or similar.
-I get tons of mails from IDE/ATAPI drive users - I really can't continue
-any more to answer them all. So, if your drive/interface information sheets
-mention "IDE" (primary, secondary, tertiary, quaternary) and the DOS driver
-invoking line within your CONFIG.SYS is using an address below 0x230:
-DON'T ROB MY LAST NERVE - jumper your interface to address 0x170 and IRQ 15
-(that is the "secondary IDE" configuration), set your drive to "master" and
-use ide-cd as your driver. If you do not have a second IDE hard disk, use the
-LILO commands
-    hdb=noprobe hdc=cdrom
-and get lucky.
-To make it fully clear to you: if you mail me about IDE/ATAPI drive problems,
-my answer is above, and I simply will discard your mail, hoping to stop the
-flood and to find time to lead my 12-year old son towards happy computing.
-
-The driver is able to drive the whole family of "traditional" AT-style (that
-is NOT the new "Enhanced IDE" or "ATAPI" drive standard) Matsushita,
-Kotobuki, Panasonic drives, sometimes labelled as "CreativeLabs". The
-well-known drives are CR-521, CR-522, CR-523, CR-562, CR-563.
-CR-574 is an IDE/ATAPI drive.
-
-The Longshine LCS-7260 is a double-speed drive which uses the "old"
-Matsushita command set. It is supported - with help by Serge Robyns.
-Vertos ("Elitegroup Computer Systems", ECS) has a similar drive - support -has started; get in contact if you have such a "Vertos 100" or "ECS-AT" -drive. - -There exists an "IBM External ISA CD-ROM Drive" which in fact is a CR-563 -with a special controller board. This drive is supported (the interface is -of the "LaserMate" type), and it is possibly the best buy today (cheaper than -an internal drive, and you can use it as an internal, too - e.g. plug it into -a soundcard). - -CreativeLabs has a new drive "CD200" and a similar drive "CD200F". The latter -is made by Funai and sometimes named "E2550UA", newer models may be named -"MK4015". The CD200F drives should fully work. -CD200 drives without "F" are still giving problems: drive detection and -playing audio should work, data access will result in errors. I need qualified -feedback about the bugs within the data functions or a drive (I never saw a -CD200). - -The quad-speed Teac CD-55A drive is supported, but still does not reach "full -speed". The data rate already reaches 500 kB/sec if you set SBP_BUFFER_FRAMES -to 64 (it is not recommended to do that for normal "file access" usage, but it -can speed up things a lot if you use something like "dd" to read from the -drive; I use it for verifying self-written CDs this way). -The drive itself is able to deliver 600 kB/sec, so this needs -work; with the normal setup, the performance currently is not even as good as -double-speed. - -This driver is NOT for Mitsumi or Sony or Aztech or Philips or XXX drives, -and again: this driver is in no way usable for any IDE/ATAPI drive. If you -think your drive should work and it doesn't: send me the DOS driver for your -beast (gzipped + uuencoded) and your CONFIG.SYS if you want to ask me for help, -and include an original log message excerpt, and try to give all information -a complete idiot needs to understand your hassle already with your first -mail. 
And if you want to say "as I have mailed you before", be sure that I -don't remember your "case" by such remarks; at the moment, I have some -hundreds of open correspondences about Linux CDROM questions (hope to reduce if -the IDE/ATAPI user questions disappear). - - -This driver will work with the soundcard interfaces (SB Pro, SB 16, Galaxy, -SoundFX, Mozart, MAD16 ...) and with the "no-sound" cards (Panasonic CI-101P, -LaserMate, WDH-7001C, Longshine LCS-6853, Teac ...). - -It works with the "configurable" interface "Sequoia S-1000", too, which is -used on the Spea Media FX and Ensonic Soundscape sound cards. You have to -specify the type "SBPRO 2" and the true CDROM port address with it, not the -"configuration port" address. - -If you have a sound card which needs a "configuration driver" instead of -jumpers for interface types and addresses (like Mozart cards) - those -drivers get invoked before the DOS CDROM driver in your CONFIG.SYS, typical -names are "cdsetup.sys" and "mztinit.sys" - let the sound driver do the -CDROM port configuration (the leading comments in linux/drivers/sound/mad16.c -are just for you!). Hannu Savolainen's mad16.c code is able to set up my -Mozart card - I simply had to add - #define MAD16_CONF 0x06 - #define MAD16_CDSEL 0x03 -to configure the CDROM interface for type "Panasonic" (LaserMate) and address -0x340. - -The interface type has to get configured in linux/drivers/cdrom/sbpcd.h, -because the register layout is different between the "SoundBlaster" and the -"LaserMate" type. - -I got a report that the Teac interface card "I/F E117098" is of type -"SoundBlaster" (i.e. you have to set SBPRO to 1) even with the addresses -0x300 and above. This is unusual, and it can't get covered by the auto -probing scheme. -The Teac 16-bit interface cards (like P/N E950228-00A, default address 0x2C0) -need the SBPRO 3 setup. - -If auto-probing found the drive, the address is correct. The reported type -may be wrong. 
A "mount" will give success only if the interface type is set -right. Playing audio should work with a wrong set interface type, too. - -With some Teac and some CD200 drives I have seen interface cards which seem -to lack the "drive select" lines; always drive 0 gets addressed. To avoid -"mirror drives" (four drives detected where you only have one) with such -interface cards, set MAX_DRIVES to 1 and jumper your drive to ID 0 (if -possible). - - -Up to 4 drives per interface card, and up to 4 interface cards are supported. -All supported drive families can be mixed, but the CR-521 drives are -hard-wired to drive ID 0. The drives have to use different drive IDs, and each -drive has to get a unique minor number (0...3), corresponding indirectly to -its drive ID. -The drive IDs may be selected freely from 0 to 3 - they do not have to be in -consecutive order. - -As Don Carroll, don@ds9.us.dell.com or FIDO 1:382/14, told me, it is possible -to change old drives to any ID, too. He writes in this sense: - "In order to be able to use more than one single speed drive - (they do not have the ID jumpers) you must add a DIP switch - and two resistors. The pads are already on the board next to - the power connector. You will see the silkscreen for the - switch if you remove the top cover. - 1 2 3 4 - ID 0 = x F F x O = "on" - ID 1 = x O F x F = "off" - ID 2 = x F O x x = "don't care" - ID 3 = x O O x - Next to the switch are the positions for R76 (7k) and R78 - (12k). I had to play around with the resistor values - ID 3 - did not work with other values. If the values are not good, - ID 3 behaves like ID 0." - -To use more than 4 drives, you simply need a second controller card at a -different address and a second cable. - -The driver supports reading of data from the CD and playing of audio tracks. -The audio part should run with WorkMan, xcdplayer, with the "non-X11" products -CDplayer and WorkBone - tell me if it is not compatible with other software. 
-The only accepted measure for correctness with the audio functions is the -"cdtester" utility (appended) - most audio player programmers seem to be -better musicians than programmers. ;-) - -With the CR-56x and the CD200 drives, the reading of audio frames is possible. -This is implemented by an IOCTL function which reads READ_AUDIO frames of -2352 bytes at once (configurable with the "READ_AUDIO" define, default is 0). -Reading the same frame a second time gives different data; the frame data -start at a different position, but all read bytes are valid, and we always -read 98 consecutive chunks (of 24 Bytes) as a frame. Reading more than 1 frame -at once possibly misses some chunks at each frame boundary. This lack has to -get corrected by external, "higher level" software which reads the same frame -again and tries to find and eliminate overlapping chunks (24-byte-pieces). - -The transfer rate with reading audio (1-frame-pieces) currently is very slow. -This can be better reading bigger chunks, but the "missing" chunks possibly -occur at the beginning of each single frame. -The software interface possibly may change a bit the day the SCSI driver -supports it too. - -With all but the CR-52x drives, MultiSession is supported. -Photo CDs work (the "old" drives like CR-521 can access only the first -session of a photoCD). -At ftp.gwdg.de:/pub/linux/hpcdtoppm/ you will find Hadmut Danisch's package to -convert photo CD image files and Gerd Knorr's viewing utility. - -The transfer rate will reach 150 kB/sec with CR-52x drives, 300 kB/sec with -CR-56x drives, and currently not more than 500 kB/sec (usually less than -250 kB/sec) with the Teac quad speed drives. -XA (PhotoCD) disks with "old" drives give only 50 kB/sec. - -This release consists of -- this README file -- the driver file linux/drivers/cdrom/sbpcd.c -- the stub files linux/drivers/cdrom/sbpcd[234].c -- the header file linux/drivers/cdrom/sbpcd.h. - - -To install: ------------ - -1. 
Setup your hardware parameters. Though the driver does "auto-probing" at a - lot of (not all possible!) addresses, this step is recommended for - everyday use. You should let sbpcd auto-probe once and use the reported - address if a drive got found. The reported type may be incorrect; it is - correct if you can mount a data CD. There is no choice for you with the - type; only one is right, the others are deadly wrong. - - a. Go into /usr/src/linux/drivers/cdrom/sbpcd.h and configure it for your - hardware (near the beginning): - a1. Set it up for the appropriate type of interface board. - "Original" CreativeLabs sound cards need "SBPRO 1". - Most "compatible" sound cards (almost all "non-CreativeLabs" cards) - need "SBPRO 0". - The "no-sound" board from OmniCd needs the "SBPRO 1" setup. - The Teac 8-bit "no-sound" boards need the "SBPRO 1" setup. - The Teac 16-bit "no-sound" boards need the "SBPRO 3" setup. - All other "no-sound" boards need the "SBPRO 0" setup. - The Spea Media FX and Ensoniq SoundScape cards need "SBPRO 2". - sbpcd.c holds some examples in its auto-probe list. - If you configure "SBPRO" wrong, the playing of audio CDs will work, - but you will not be able to mount a data CD. - a2. Tell the address of your CDROM_PORT (not of the sound port). - a3. If 4 drives get found, but you have only one, set MAX_DRIVES to 1. - a4. Set DISTRIBUTION to 0. - b. Additionally for 2.a1 and 2.a2, the setup may be done during - boot time (via the "kernel command line" or "LILO option"): - sbpcd=0x320,LaserMate - or - sbpcd=0x230,SoundBlaster - or - sbpcd=0x338,SoundScape - or - sbpcd=0x2C0,Teac16bit - This is especially useful if you install a fresh distribution. - If the second parameter is a number, it gets taken as the type - setting; 0 is "LaserMate", 1 is "SoundBlaster", 2 is "SoundScape", - 3 is "Teac16bit". - So, for example - sbpcd=0x230,1 - is equivalent to - sbpcd=0x230,SoundBlaster - -2. 
"cd /usr/src/linux" and do a "make config" and select "y" for Matsushita - CD-ROM support and for ISO9660 FileSystem support. If you do not have a - second, third, or fourth controller installed, do not say "y" to the - secondary Matsushita CD-ROM questions. - -3. Then make the kernel image ("make zlilo" or similar). - -4. Make the device file(s). This step usually already has been done by the - MAKEDEV script. - The driver uses MAJOR 25, so, if necessary, do - mknod /dev/sbpcd b 25 0 (if you have only one drive) - and/or - mknod /dev/sbpcd0 b 25 0 - mknod /dev/sbpcd1 b 25 1 - mknod /dev/sbpcd2 b 25 2 - mknod /dev/sbpcd3 b 25 3 - to make the node(s). - - The "first found" drive gets MINOR 0 (regardless of its jumpered ID), the - "next found" (at the same cable) gets MINOR 1, ... - - For a second interface board, you have to make nodes like - mknod /dev/sbpcd4 b 26 0 - mknod /dev/sbpcd5 b 26 1 - and so on. Use the MAJORs 26, 27, 28. - - If you further make a link like - ln -s sbpcd /dev/cdrom - you can use the name /dev/cdrom, too. - -5. Reboot with the new kernel. - -You should now be able to do - mkdir /CD -and - mount -rt iso9660 /dev/sbpcd /CD -or - mount -rt iso9660 -o block=2048 /dev/sbpcd /CD -and see the contents of your CD in the /CD directory. -To use audio CDs, a mounting is not recommended (and it would fail if the -first track is not a data track). - - -Using sbpcd as a "loadable module": ------------------------------------ - -If you do NOT select "Matsushita/Panasonic CDROM driver support" during the -"make config" of your kernel, you can build the "loadable module" sbpcd.o. - -If sbpcd gets used as a module, the support of more than one interface -card (i.e. drives 4...15) is disabled. 
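Editorial note, not part of the deleted file: the node numbering given in step 4 of the sbpcd README (MAJOR 25 for the first interface board, MAJORs 26, 27 and 28 for further boards, MINORs 0...3 on each cable) can be sketched as plain arithmetic. This is only an illustration, assuming drives are counted 0...15 in detection order with four per board; the names sbpcd_major, sbpcd_minor, sbpcd_print_mknod and the SBPCD_* constants are hypothetical and do not appear in the driver.

```c
#include <stdio.h>

/* Values taken from the README text; the macro names are hypothetical. */
#define SBPCD_FIRST_MAJOR   25  /* MAJOR of the first interface board */
#define SBPCD_DRIVES_PER_IF 4   /* up to 4 drives per interface board */
#define SBPCD_MAX_DRIVES    16  /* 4 boards, i.e. MAJORs 25, 26, 27, 28 */

/* Hypothetical helper: MAJOR for drive n (0...15), or -1 if out of range. */
int sbpcd_major(int n)
{
    if (n < 0 || n >= SBPCD_MAX_DRIVES)
        return -1;
    return SBPCD_FIRST_MAJOR + n / SBPCD_DRIVES_PER_IF;
}

/* Hypothetical helper: MINOR for drive n (0...15), or -1 if out of range. */
int sbpcd_minor(int n)
{
    if (n < 0 || n >= SBPCD_MAX_DRIVES)
        return -1;
    return n % SBPCD_DRIVES_PER_IF;
}

/* Print the mknod command for drive n; returns -1 if n is out of range. */
int sbpcd_print_mknod(int n)
{
    if (sbpcd_major(n) < 0)
        return -1;
    printf("mknod /dev/sbpcd%d b %d %d\n", n, sbpcd_major(n), sbpcd_minor(n));
    return 0;
}
```

Calling sbpcd_print_mknod(5) prints "mknod /dev/sbpcd5 b 26 1", which agrees with the example given in step 4. With the module build described above, only drives 0...3 (MAJOR 25) would apply.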
- -You can specify interface address and type with the "insmod" command like: - # insmod /usr/src/linux/modules/sbpcd.o sbpcd=0x340,0 -or - # insmod /usr/src/linux/modules/sbpcd.o sbpcd=0x230,1 -or - # insmod /usr/src/linux/modules/sbpcd.o sbpcd=0x338,2 -where the last number represents the SBPRO setting (no strings allowed here). - - -Things of interest: -------------------- - -The driver is configured to try the LaserMate type of interface at I/O port -0x0340 first. If this is not appropriate, sbpcd.h should get changed -(you will find the right place - just at the beginning). - -No DMA and no IRQ is used. - -To reduce or increase the amount of kernel messages, edit sbpcd.c and play -with the "DBG_xxx" switches (initialization of the variable "sbpcd_debug"). -Don't forget to reflect on what you do; enabling all DBG_xxx switches at once -may crash your system, and each message line is accompanied by a delay. - -The driver uses the "variable BLOCK_SIZE" feature. To use it, you have to -specify "block=2048" as a mount option. Doing this will disable the direct -execution of a binary from the CD; you have to copy it to a device with the -standard BLOCK_SIZE (1024) first. So, do not use this if your system is -directly "running from the CDROM" (like some of Yggdrasil's installation -variants). There are CDs on the market (like the German "unifix" Linux -distribution) which MUST get handled with a block_size of 1024. Generally, -one can say all the CDs which hold files of the name YMTRANS.TBL are defective; -do not use block=2048 with those. - -Within sbpcd.h, you will find some "#define"s (e.g. EJECT and JUKEBOX). With -these, you can configure the driver for some special things. -You can use the appended program "cdtester" to set the auto-eject feature -during runtime. Jeff Tranter's "eject" utility can do this, too (and more) -for you. 
- -There is an ioctl CDROMMULTISESSION to obtain with a user program if -the CD is an XA disk and - if it is - where the last session starts. The -"cdtester" program illustrates how to call it. - - -Auto-probing at boot time: --------------------------- - -The driver does auto-probing at many well-known interface card addresses, -but not all: -Some probings can cause a hang if an NE2000 ethernet card gets touched, because -SBPCD's auto-probing happens before the initialization of the net drivers. -Those "hazardous" addresses are excluded from auto-probing; the "kernel -command line" feature has to be used during installation if you have your -drive at those addresses. The "module" version is allowed to probe at those -addresses, too. - -The auto-probing looks first at the configured address resp. the address -submitted by the kernel command line. With this, it is possible to use this -driver within installation boot floppies, and for any non-standard address, -too. - -Auto-probing will make an assumption about the interface type ("SBPRO" or not), -based upon the address. That assumption may be wrong (initialization will be -o.k., but you will get I/O errors during mount). In that case, use the "kernel -command line" feature and specify address & type at boot time to find out the -right setup. - -For everyday use, address and type should get configured within sbpcd.h. That -will stop the auto-probing due to success with the first try. - -The kernel command "sbpcd=0" suppresses each auto-probing and causes -the driver not to find any drive; it is meant for people who love sbpcd -so much that they do not want to miss it, even if they miss the drives. ;-) - -If you configure "#define CDROM_PORT 0" in sbpcd.h, the auto-probing is -initially disabled and needs an explicit kernel command to get activated. -Once activated, it does not stop before success or end-of-list. 
This may be -useful within "universal" CDROM installation boot floppies (but using the -loadable module would be better because it allows an "extended" auto-probing -without fearing NE2000 cards). - -To shorten the auto-probing list to a single entry, set DISTRIBUTION 0 within -sbpcd.h. - - -Setting up address and interface type: --------------------------------------- - -If your I/O port address is not 0x340, you have to look for the #defines near -the beginning of sbpcd.h and configure them: set SBPRO to 0 or 1 or 2, and -change CDROM_PORT to the address of your CDROM I/O port. - -Almost all of the "SoundBlaster compatible" cards behave like the no-sound -interfaces, i.e. need SBPRO 0! - -With "original" SB Pro cards, an initial setting of CD_volume through the -sound card's MIXER register gets done. -If you are using a "compatible" sound card of types "LaserMate" or "SPEA", -you can set SOUND_BASE (in sbpcd.h) to get it done with your card, too... - - -Using audio CDs: ----------------- - -Workman, WorkBone, xcdplayer, cdplayer and the nice little tool "cdplay" (see -README.aztcd from the Aztech driver package) should work. - -The program CDplayer likes to talk to "/dev/mcd" only, xcdplayer wants -"/dev/rsr0", workman loves "/dev/sr0" or "/dev/cdrom" - so, make the -appropriate links to use them without the need to supply parameters. - - -Copying audio tracks: ---------------------- - -The following program will copy track 1 (or a piece of it) from an audio CD -into the file "track01": - -/*=================== begin program ========================================*/ -/* - * read an audio track from a CD - * - * (c) 1994 Eberhard Moenkeberg <emoenke@gwdg.de> - * may be used & enhanced freely - * - * Due to non-existent sync bytes at the beginning of each audio frame (or due - * to a firmware bug within all known drives?), it is currently a kind of - * fortune if two consecutive frames fit together. - * Usually, they overlap, or a little piece is missing. 
This happens in units - * of 24-byte chunks. It has to get fixed by higher-level software (reading - * until an overlap occurs, and then eliminate the overlapping chunks). - * ftp.gwdg.de:/pub/linux/misc/cdda2wav-sbpcd.*.tar.gz holds an example of - * such an algorithm. - * This example program further is missing to obtain the SubChannel data - * which belong to each frame. - * - * This is only an example of the low-level access routine. The read data are - * pure 16-bit CDDA values; they have to get converted to make sound out of - * them. - * It is no fun to listen to it without prior overlap/underlap correction! - */ -#include <stdio.h> -#include <sys/ioctl.h> -#include <sys/types.h> -#include <linux/cdrom.h> - -static struct cdrom_tochdr hdr; -static struct cdrom_tocentry entry[101]; -static struct cdrom_read_audio arg; -static u_char buffer[CD_FRAMESIZE_RAW]; -static int datafile, drive; -static int i, j, limit, track, err; -static char filename[32]; - -int main(int argc, char *argv[]) -{ -/* - * open /dev/cdrom - */ - drive=open("/dev/cdrom", 0); - if (drive<0) - { - fprintf(stderr, "can't open drive.\n"); - exit (-1); - } -/* - * get TocHeader - */ - fprintf(stdout, "getting TocHeader...\n"); - err=ioctl(drive, CDROMREADTOCHDR, &hdr); - if (err!=0) - { - fprintf(stderr, "can't get TocHeader (error %d).\n", err); - exit (-1); - } - else - fprintf(stdout, "TocHeader: %d %d\n", hdr.cdth_trk0, hdr.cdth_trk1); -/* - * get and display all TocEntries - */ - fprintf(stdout, "getting TocEntries...\n"); - for (i=1;i<=hdr.cdth_trk1+1;i++) - { - if (i!=hdr.cdth_trk1+1) entry[i].cdte_track = i; - else entry[i].cdte_track = CDROM_LEADOUT; - entry[i].cdte_format = CDROM_LBA; - err=ioctl(drive, CDROMREADTOCENTRY, &entry[i]); - if (err!=0) - { - fprintf(stderr, "can't get TocEntry #%d (error %d).\n", i, err); - exit (-1); - } - else - { - fprintf(stdout, "TocEntry #%d: %1X %1X %06X %02X\n", - entry[i].cdte_track, - entry[i].cdte_adr, - entry[i].cdte_ctrl, - 
entry[i].cdte_addr.lba, - entry[i].cdte_datamode); - } - } - fprintf(stdout, "got all TocEntries.\n"); -/* - * ask for track number (not implemented here) - */ -track=1; -#if 0 /* just read a little piece (4 seconds) */ -entry[track+1].cdte_addr.lba=entry[track].cdte_addr.lba+300; -#endif -/* - * read track into file - */ - sprintf(filename, "track%02d\0", track); - datafile=creat(filename, 0755); - if (datafile<0) - { - fprintf(stderr, "can't open datafile %s.\n", filename); - exit (-1); - } - arg.addr.lba=entry[track].cdte_addr.lba; - arg.addr_format=CDROM_LBA; /* CDROM_MSF would be possible here, too. */ - arg.nframes=1; - arg.buf=&buffer[0]; - limit=entry[track+1].cdte_addr.lba; - for (;arg.addr.lba<limit;arg.addr.lba++) - { - err=ioctl(drive, CDROMREADAUDIO, &arg); - if (err!=0) - { - fprintf(stderr, "can't read abs. frame #%d (error %d).\n", - arg.addr.lba, err); - } - j=write(datafile, &buffer[0], CD_FRAMESIZE_RAW); - if (j!=CD_FRAMESIZE_RAW) - { - fprintf(stderr,"I/O error (datafile) at rel. frame %d\n", - arg.addr.lba-entry[track].cdte_addr.lba); - } - arg.addr.lba++; - } - return 0; -} -/*===================== end program ========================================*/ - -At ftp.gwdg.de:/pub/linux/misc/cdda2wav-sbpcd.*.tar.gz is an adapted version of -Heiko Eissfeldt's digital-audio to .WAV converter (the original is there, too). -This is preliminary, as Heiko himself will care about it. - - -Known problems: ---------------- - -Currently, the detection of disk change or removal is actively disabled. - -Most attempts to read the UPC/EAN code result in a stream of zeroes. All my -drives are mostly telling there is no UPC/EAN code on disk or there is, but it -is an all-zero number. I guess now almost no CD holds such a number. - -Bug reports, comments, wishes, donations (technical information is a donation, -too :-) etc. to emoenke@gwdg.de. 
- -SnailMail address, preferable for CD editors if they want to submit a free -"cooperation" copy: - Eberhard Moenkeberg - Reinholdstr. 14 - D-37083 Goettingen - Germany ---- - - -Appendix -- the "cdtester" utility: - -/* - * cdtester.c -- test the audio functions of a CD driver - * - * (c) 1995 Eberhard Moenkeberg <emoenke@gwdg.de> - * published under the GPL - * - * made under heavy use of the "Tiny Audio CD Player" - * from Werner Zimmermann <zimmerma@rz.fht-esslingen.de> - * (see linux/drivers/block/README.aztcd) - */ -#undef AZT_PRIVATE_IOCTLS /* not supported by every CDROM driver */ -#define SBP_PRIVATE_IOCTLS /* not supported by every CDROM driver */ - -#include <stdio.h> -#include <stdio.h> -#include <malloc.h> -#include <sys/ioctl.h> -#include <sys/types.h> -#include <linux/cdrom.h> - -#ifdef AZT_PRIVATE_IOCTLS -#include <linux/../../drivers/cdrom/aztcd.h> -#endif /* AZT_PRIVATE_IOCTLS */ -#ifdef SBP_PRIVATE_IOCTLS -#include <linux/../../drivers/cdrom/sbpcd.h> -#include <linux/fs.h> -#endif /* SBP_PRIVATE_IOCTLS */ - -struct cdrom_tochdr hdr; -struct cdrom_tochdr tocHdr; -struct cdrom_tocentry TocEntry[101]; -struct cdrom_tocentry entry; -struct cdrom_multisession ms_info; -struct cdrom_read_audio read_audio; -struct cdrom_ti ti; -struct cdrom_subchnl subchnl; -struct cdrom_msf msf; -struct cdrom_volctrl volctrl; -#ifdef AZT_PRIVATE_IOCTLS -union -{ - struct cdrom_msf msf; - unsigned char buf[CD_FRAMESIZE_RAW]; -} azt; -#endif /* AZT_PRIVATE_IOCTLS */ -int i, i1, i2, i3, j, k; -unsigned char sequence=0; -unsigned char command[80]; -unsigned char first=1, last=1; -char *default_device="/dev/cdrom"; -char dev[20]; -char filename[20]; -int drive; -int datafile; -int rc; - -void help(void) -{ - printf("Available Commands:\n"); - printf("STOP s EJECT e QUIT q\n"); - printf("PLAY TRACK t PAUSE p RESUME r\n"); - printf("NEXT TRACK n REPEAT LAST l HELP h\n"); - printf("SUBCHANNEL_Q c TRACK INFO i PLAY AT a\n"); - printf("READ d READ RAW w READ AUDIO A\n"); - 
printf("MS-INFO M TOC T START S\n"); - printf("SET EJECTSW X DEVICE D DEBUG Y\n"); - printf("AUDIO_BUFSIZ Z RESET R SET VOLUME v\n"); - printf("GET VOLUME V\n"); -} - -/* - * convert MSF number (3 bytes only) to Logical_Block_Address - */ -int msf2lba(u_char *msf) -{ - int i; - - i=(msf[0] * CD_SECS + msf[1]) * CD_FRAMES + msf[2] - CD_BLOCK_OFFSET; - if (i<0) return (0); - return (i); -} -/* - * convert logical_block_address to m-s-f_number (3 bytes only) - */ -void lba2msf(int lba, unsigned char *msf) -{ - lba += CD_BLOCK_OFFSET; - msf[0] = lba / (CD_SECS*CD_FRAMES); - lba %= CD_SECS*CD_FRAMES; - msf[1] = lba / CD_FRAMES; - msf[2] = lba % CD_FRAMES; -} - -int init_drive(char *dev) -{ - unsigned char msf_ent[3]; - - /* - * open the device - */ - drive=open(dev,0); - if (drive<0) return (-1); - /* - * get TocHeader - */ - printf("getting TocHeader...\n"); - rc=ioctl(drive,CDROMREADTOCHDR,&hdr); - if (rc!=0) - { - printf("can't get TocHeader (error %d).\n",rc); - return (-2); - } - else - first=hdr.cdth_trk0; - last=hdr.cdth_trk1; - printf("TocHeader: %d %d\n",hdr.cdth_trk0,hdr.cdth_trk1); - /* - * get and display all TocEntries - */ - printf("getting TocEntries...\n"); - for (i=1;i<=hdr.cdth_trk1+1;i++) - { - if (i!=hdr.cdth_trk1+1) TocEntry[i].cdte_track = i; - else TocEntry[i].cdte_track = CDROM_LEADOUT; - TocEntry[i].cdte_format = CDROM_LBA; - rc=ioctl(drive,CDROMREADTOCENTRY,&TocEntry[i]); - if (rc!=0) - { - printf("can't get TocEntry #%d (error %d).\n",i,rc); - } - else - { - lba2msf(TocEntry[i].cdte_addr.lba,&msf_ent[0]); - if (TocEntry[i].cdte_track==CDROM_LEADOUT) - { - printf("TocEntry #%02X: %1X %1X %02d:%02d:%02d (lba: 0x%06X) %02X\n", - TocEntry[i].cdte_track, - TocEntry[i].cdte_adr, - TocEntry[i].cdte_ctrl, - msf_ent[0], - msf_ent[1], - msf_ent[2], - TocEntry[i].cdte_addr.lba, - TocEntry[i].cdte_datamode); - } - else - { - printf("TocEntry #%02d: %1X %1X %02d:%02d:%02d (lba: 0x%06X) %02X\n", - TocEntry[i].cdte_track, - TocEntry[i].cdte_adr, - 
TocEntry[i].cdte_ctrl, - msf_ent[0], - msf_ent[1], - msf_ent[2], - TocEntry[i].cdte_addr.lba, - TocEntry[i].cdte_datamode); - } - } - } - return (hdr.cdth_trk1); /* number of tracks */ -} - -void display(int size,unsigned char *buffer) -{ - k=0; - getchar(); - for (i=0;i<(size+1)/16;i++) - { - printf("%4d:",i*16); - for (j=0;j<16;j++) - { - printf(" %02X",buffer[i*16+j]); - } - printf(" "); - for (j=0;j<16;j++) - { - if (isalnum(buffer[i*16+j])) - printf("%c",buffer[i*16+j]); - else - printf("."); - } - printf("\n"); - k++; - if (k>=20) - { - printf("press ENTER to continue\n"); - getchar(); - k=0; - } - } -} - -int main(int argc, char *argv[]) -{ - printf("\nTesting tool for a CDROM driver's audio functions V0.1\n"); - printf("(C) 1995 Eberhard Moenkeberg <emoenke@gwdg.de>\n"); - printf("initializing...\n"); - - rc=init_drive(default_device); - if (rc<0) printf("could not open %s (rc=%d).\n",default_device,rc); - help(); - while (1) - { - printf("Give a one-letter command (h = help): "); - scanf("%s",command); - command[1]=0; - switch (command[0]) - { - case 'D': - printf("device name (f.e. /dev/sbpcd3): ? 
"); - scanf("%s",&dev); - close(drive); - rc=init_drive(dev); - if (rc<0) printf("could not open %s (rc %d).\n",dev,rc); - break; - case 'e': - rc=ioctl(drive,CDROMEJECT); - if (rc<0) printf("CDROMEJECT: rc=%d.\n",rc); - break; - case 'p': - rc=ioctl(drive,CDROMPAUSE); - if (rc<0) printf("CDROMPAUSE: rc=%d.\n",rc); - break; - case 'r': - rc=ioctl(drive,CDROMRESUME); - if (rc<0) printf("CDROMRESUME: rc=%d.\n",rc); - break; - case 's': - rc=ioctl(drive,CDROMSTOP); - if (rc<0) printf("CDROMSTOP: rc=%d.\n",rc); - break; - case 'S': - rc=ioctl(drive,CDROMSTART); - if (rc<0) printf("CDROMSTART: rc=%d.\n",rc); - break; - case 't': - rc=ioctl(drive,CDROMREADTOCHDR,&tocHdr); - if (rc<0) - { - printf("CDROMREADTOCHDR: rc=%d.\n",rc); - break; - } - first=tocHdr.cdth_trk0; - last= tocHdr.cdth_trk1; - if ((first==0)||(first>last)) - { - printf ("--got invalid TOC data.\n"); - } - else - { - printf("--enter track number(first=%d, last=%d): ",first,last); - scanf("%d",&i1); - ti.cdti_trk0=i1; - if (ti.cdti_trk0<first) ti.cdti_trk0=first; - if (ti.cdti_trk0>last) ti.cdti_trk0=last; - ti.cdti_ind0=0; - ti.cdti_trk1=last; - ti.cdti_ind1=0; - rc=ioctl(drive,CDROMSTOP); - rc=ioctl(drive,CDROMPLAYTRKIND,&ti); - if (rc<0) printf("CDROMPLAYTRKIND: rc=%d.\n",rc); - } - break; - case 'n': - rc=ioctl(drive,CDROMSTOP); - if (++ti.cdti_trk0>last) ti.cdti_trk0=last; - ti.cdti_ind0=0; - ti.cdti_trk1=last; - ti.cdti_ind1=0; - rc=ioctl(drive,CDROMPLAYTRKIND,&ti); - if (rc<0) printf("CDROMPLAYTRKIND: rc=%d.\n",rc); - break; - case 'l': - rc=ioctl(drive,CDROMSTOP); - if (--ti.cdti_trk0<first) ti.cdti_trk0=first; - ti.cdti_ind0=0; - ti.cdti_trk1=last; - ti.cdti_ind1=0; - rc=ioctl(drive,CDROMPLAYTRKIND,&ti); - if (rc<0) printf("CDROMPLAYTRKIND: rc=%d.\n",rc); - break; - case 'c': - subchnl.cdsc_format=CDROM_MSF; - rc=ioctl(drive,CDROMSUBCHNL,&subchnl); - if (rc<0) printf("CDROMSUBCHNL: rc=%d.\n",rc); - else - { - printf("AudioStatus:%s Track:%d Mode:%d MSF=%02d:%02d:%02d\n", - 
subchnl.cdsc_audiostatus==CDROM_AUDIO_PLAY ? "PLAYING":"NOT PLAYING", - subchnl.cdsc_trk,subchnl.cdsc_adr, - subchnl.cdsc_absaddr.msf.minute, - subchnl.cdsc_absaddr.msf.second, - subchnl.cdsc_absaddr.msf.frame); - } - break; - case 'i': - printf("Track No.: "); - scanf("%d",&i1); - entry.cdte_track=i1; - if (entry.cdte_track<first) entry.cdte_track=first; - if (entry.cdte_track>last) entry.cdte_track=last; - entry.cdte_format=CDROM_MSF; - rc=ioctl(drive,CDROMREADTOCENTRY,&entry); - if (rc<0) printf("CDROMREADTOCENTRY: rc=%d.\n",rc); - else - { - printf("Mode %d Track, starts at %02d:%02d:%02d\n", - entry.cdte_adr, - entry.cdte_addr.msf.minute, - entry.cdte_addr.msf.second, - entry.cdte_addr.msf.frame); - } - break; - case 'a': - printf("Address (min:sec:frm) "); - scanf("%d:%d:%d",&i1,&i2,&i3); - msf.cdmsf_min0=i1; - msf.cdmsf_sec0=i2; - msf.cdmsf_frame0=i3; - if (msf.cdmsf_sec0>59) msf.cdmsf_sec0=59; - if (msf.cdmsf_frame0>74) msf.cdmsf_frame0=74; - lba2msf(TocEntry[last+1].cdte_addr.lba-1,&msf.cdmsf_min1); - rc=ioctl(drive,CDROMSTOP); - rc=ioctl(drive,CDROMPLAYMSF,&msf); - if (rc<0) printf("CDROMPLAYMSF: rc=%d.\n",rc); - break; - case 'V': - rc=ioctl(drive,CDROMVOLREAD,&volctrl); - if (rc<0) printf("CDROMVOLCTRL: rc=%d.\n",rc); - printf("Volume: channel 0 (left) %d, channel 1 (right) %d\n",volctrl.channel0,volctrl.channel1); - break; - case 'R': - rc=ioctl(drive,CDROMRESET); - if (rc<0) printf("CDROMRESET: rc=%d.\n",rc); - break; -#ifdef AZT_PRIVATE_IOCTLS /*not supported by every CDROM driver*/ - case 'd': - printf("Address (min:sec:frm) "); - scanf("%d:%d:%d",&i1,&i2,&i3); - azt.msf.cdmsf_min0=i1; - azt.msf.cdmsf_sec0=i2; - azt.msf.cdmsf_frame0=i3; - if (azt.msf.cdmsf_sec0>59) azt.msf.cdmsf_sec0=59; - if (azt.msf.cdmsf_frame0>74) azt.msf.cdmsf_frame0=74; - rc=ioctl(drive,CDROMREADMODE1,&azt.msf); - if (rc<0) printf("CDROMREADMODE1: rc=%d.\n",rc); - else display(CD_FRAMESIZE,azt.buf); - break; - case 'w': - printf("Address (min:sec:frame) "); - 
scanf("%d:%d:%d",&i1,&i2,&i3); - azt.msf.cdmsf_min0=i1; - azt.msf.cdmsf_sec0=i2; - azt.msf.cdmsf_frame0=i3; - if (azt.msf.cdmsf_sec0>59) azt.msf.cdmsf_sec0=59; - if (azt.msf.cdmsf_frame0>74) azt.msf.cdmsf_frame0=74; - rc=ioctl(drive,CDROMREADMODE2,&azt.msf); - if (rc<0) printf("CDROMREADMODE2: rc=%d.\n",rc); - else display(CD_FRAMESIZE_RAW,azt.buf); /* currently only 2336 */ - break; -#endif - case 'v': - printf("--Channel 0 (Left) (0-255): "); - scanf("%d",&i1); - volctrl.channel0=i1; - printf("--Channel 1 (Right) (0-255): "); - scanf("%d",&i1); - volctrl.channel1=i1; - volctrl.channel2=0; - volctrl.channel3=0; - rc=ioctl(drive,CDROMVOLCTRL,&volctrl); - if (rc<0) printf("CDROMVOLCTRL: rc=%d.\n",rc); - break; - case 'q': - close(drive); - exit(0); - case 'h': - help(); - break; - case 'T': /* display TOC entry - without involving the driver */ - scanf("%d",&i); - if ((i<hdr.cdth_trk0)||(i>hdr.cdth_trk1)) - printf("invalid track number.\n"); - else - printf("TocEntry %02d: adr=%01X ctrl=%01X msf=%02d:%02d:%02d mode=%02X\n", - TocEntry[i].cdte_track, - TocEntry[i].cdte_adr, - TocEntry[i].cdte_ctrl, - TocEntry[i].cdte_addr.msf.minute, - TocEntry[i].cdte_addr.msf.second, - TocEntry[i].cdte_addr.msf.frame, - TocEntry[i].cdte_datamode); - break; - case 'A': /* read audio data into file */ - printf("Address (min:sec:frm) ? "); - scanf("%d:%d:%d",&i1,&i2,&i3); - read_audio.addr.msf.minute=i1; - read_audio.addr.msf.second=i2; - read_audio.addr.msf.frame=i3; - read_audio.addr_format=CDROM_MSF; - printf("# of frames ? 
"); - scanf("%d",&i1); - read_audio.nframes=i1; - k=read_audio.nframes*CD_FRAMESIZE_RAW; - read_audio.buf=malloc(k); - if (read_audio.buf==NULL) - { - printf("can't malloc %d bytes.\n",k); - break; - } - sprintf(filename,"audio_%02d%02d%02d_%02d.%02d\0", - read_audio.addr.msf.minute, - read_audio.addr.msf.second, - read_audio.addr.msf.frame, - read_audio.nframes, - ++sequence); - datafile=creat(filename, 0755); - if (datafile<0) - { - printf("can't open datafile %s.\n",filename); - break; - } - rc=ioctl(drive,CDROMREADAUDIO,&read_audio); - if (rc!=0) - { - printf("CDROMREADAUDIO: rc=%d.\n",rc); - } - else - { - rc=write(datafile,&read_audio.buf,k); - if (rc!=k) printf("datafile I/O error (%d).\n",rc); - } - close(datafile); - break; - case 'X': /* set EJECT_SW (0: disable, 1: enable auto-ejecting) */ - scanf("%d",&i); - rc=ioctl(drive,CDROMEJECT_SW,i); - if (rc!=0) - printf("CDROMEJECT_SW: rc=%d.\n",rc); - else - printf("EJECT_SW set to %d\n",i); - break; - case 'M': /* get the multisession redirection info */ - ms_info.addr_format=CDROM_LBA; - rc=ioctl(drive,CDROMMULTISESSION,&ms_info); - if (rc!=0) - { - printf("CDROMMULTISESSION(lba): rc=%d.\n",rc); - } - else - { - if (ms_info.xa_flag) printf("MultiSession offset (lba): %d (0x%06X)\n",ms_info.addr.lba,ms_info.addr.lba); - else - { - printf("this CD is not an XA disk.\n"); - break; - } - } - ms_info.addr_format=CDROM_MSF; - rc=ioctl(drive,CDROMMULTISESSION,&ms_info); - if (rc!=0) - { - printf("CDROMMULTISESSION(msf): rc=%d.\n",rc); - } - else - { - if (ms_info.xa_flag) - printf("MultiSession offset (msf): %02d:%02d:%02d (0x%02X%02X%02X)\n", - ms_info.addr.msf.minute, - ms_info.addr.msf.second, - ms_info.addr.msf.frame, - ms_info.addr.msf.minute, - ms_info.addr.msf.second, - ms_info.addr.msf.frame); - else printf("this CD is not an XA disk.\n"); - } - break; -#ifdef SBP_PRIVATE_IOCTLS - case 'Y': /* set the driver's message level */ -#if 0 /* not implemented yet */ - printf("enter switch name (f.e. 
DBG_CMD): "); - scanf("%s",&dbg_switch); - j=get_dbg_num(dbg_switch); -#else - printf("enter DDIOCSDBG switch number: "); - scanf("%d",&j); -#endif - printf("enter 0 for \"off\", 1 for \"on\": "); - scanf("%d",&i); - if (i==0) j|=0x80; - printf("calling \"ioctl(drive,DDIOCSDBG,%d)\"\n",j); - rc=ioctl(drive,DDIOCSDBG,j); - printf("DDIOCSDBG: rc=%d.\n",rc); - break; - case 'Z': /* set the audio buffer size */ - printf("# frames wanted: ? "); - scanf("%d",&j); - rc=ioctl(drive,CDROMAUDIOBUFSIZ,j); - printf("%d frames granted.\n",rc); - break; -#endif /* SBP_PRIVATE_IOCTLS */ - default: - printf("unknown command: \"%s\".\n",command); - break; - } - } - return 0; -} -/*==========================================================================*/ - diff --git a/Documentation/cdrom/sjcd b/Documentation/cdrom/sjcd deleted file mode 100644 index 74a14847b93a..000000000000 --- a/Documentation/cdrom/sjcd +++ /dev/null @@ -1,60 +0,0 @@ - -- Documentation/cdrom/sjcd - 80% of the work takes 20% of the time, - 20% of the work takes 80% of the time... - (Murphy's law) - - Once started, training can not be stopped... - (Star Wars) - -This is the README for the sjcd cdrom driver, version 1.6. - -This file is meant as a tips & tricks edge for the usage of the SANYO CDR-H94A -cdrom drive. It will grow as the questions arise. ;-) -For info on configuring the ISP16 sound card look at Documentation/cdrom/isp16. - -The driver should work with any of the Panasonic, Sony or Mitsumi style -CDROM interfaces. -The cdrom interface on Media Magic's soft configurable sound card ISP16, -which used to be included in the driver, is now supported in a separate module. -This initialisation module will probably also work with other interfaces -based on an OPTi 82C928 or 82C929 chip (like MAD16 and Mozart): see the -documentation Documentation/cdrom/isp16. - -The device major for sjcd is 18, and minor is 0. Create a block special -file in your /dev directory (e.g., /dev/sjcd) with these numbers. 
-(For those who don't know, being root and doing the following should do -the trick: - mknod -m 644 /dev/sjcd b 18 0 -and mount the cdrom by /dev/sjcd). - -The default configuration parameters are: - base address 0x340 - no irq - no dma -(Actually the CDR-H94A doesn't know how to use irq and dma.) -As of version 1.2, setting base address at boot time is supported -through the use of command line options: type at the "boot:" prompt: - linux sjcd=<base_address> -(where you would use the kernel labeled "linux" in lilo's configuration -file /etc/lilo.conf). You could also use 'append="sjcd=<configuration_info>"' -in the appropriate section of /etc/lilo.conf -If you're building a kernel yourself you can set your default base -i/o address with SJCD_BASE_ADDR in /usr/src/linux/drivers/cdrom/sjcd.h. - -The sjcd driver supports being loaded as a module. The following -command will set the base i/o address on the fly (assuming you -have installed the module in an appropriate place). - insmod sjcd.o sjcd_base=<base_address> - - -Have fun! - -If something is wrong, please email to vadim@rbrf.ru - or vadim@ipsun.ras.ru - or model@cecmow.enet.dec.com - or H.T.M.v.d.Maarel@marin.nl - -It happens sometimes that Vadim is not reachable by mail. For these -instances, Eric van der Maarel will help too. - - Vadim V. Model, Eric van der Maarel, Eberhard Moenkeberg diff --git a/Documentation/cdrom/sonycd535 b/Documentation/cdrom/sonycd535 deleted file mode 100644 index b81e109970aa..000000000000 --- a/Documentation/cdrom/sonycd535 +++ /dev/null @@ -1,122 +0,0 @@ - README FOR LINUX SONY CDU-535/531 DRIVER - ======================================== - -This is the Sony CDU-535 (and 531) driver version 0.7 for Linux. -I do not think I have the documentation to add features like DMA support -so if anyone else wants to pursue it or help me with it, please do. -(I need to see what was done for the CDU-31A driver -- perhaps I can -steal some of that code.) 
- -This is a Linux device driver for the Sony CDU-535 CDROM drive. This is -one of the older Sony drives with its own interface card (Sony bus). -The DOS driver for this drive is named SONY_CDU.SYS - when you boot DOS -your drive should be identified as a SONY CDU-535. The driver works -with a CDU-531 also. One user reported that the driver worked on drives -OEM'ed by Procomm, drive and interface board were labelled Procomm. - -The Linux driver is based on Corey Minyard's sonycd 0.3 driver for -the CDU-31A. Ron Jeppesen just changed the commands that were sent -to the drive to correspond to the CDU-535 commands and registers. -There were enough changes to let bugs creep in but it seems to be stable. -Ron was able to tar an entire CDROM (should read all blocks) and built -ghostview and xfig off Walnut Creek's X11R5/GNU CDROM. xcdplayer and -workman work with the driver. Others have used the driver without -problems except those dealing with wait loops (fixed in third release). -Like Minyard's original driver this one uses a polled interface (this -is also the default setup for the DOS driver). It has not been tried -with interrupts or DMA enabled on the board. - -REQUIREMENTS -============ - - - Sony CDU-535 drive, preferably without interrupts and DMA - enabled on the card. - - - Drive must be set up as unit 1. Only the first unit will be - recognized - - - You must enter your interface address into - /usr/src/linux/drivers/cdrom/sonycd535.h and build the - appropriate kernel or use the "kernel command line" parameter - sonycd535=0x320 - with the correct interface address. - -NOTES: -====== - -1) The drive MUST be turned on when booting or it will not be recognized! - (but see comments on modularized version below) - -2) when the cdrom device is opened the eject button is disabled to keep the - user from ejecting a mounted disk and replacing it with another. 
- Unfortunately xcdplayer and workman also open the cdrom device so you - have to use the eject button in the software. Keep this in mind if your - cdrom player refuses to give up its disk -- exit workman or xcdplayer, or - umount the drive if it has been mounted. - -THANKS -====== - -Many thanks to Ron Jeppesen (ronj.an@site007.saic.com) for getting -this project off the ground. He wrote the initial release -and the first two patches to this driver (0.1, 0.2, and 0.3). -Thanks also to Eberhard Moenkeberg (emoenke@gwdg.de) for prodding -me to place this code into the mainstream Linux source tree -(as of Linux version 1.1.91), as well as some patches to make -it a better device citizen. Further thanks to Joel Katz -<joelkatz@webchat.org> for his MODULE patches (see details below), -Porfiri Claudio <C.Porfiri@nisms.tei.ericsson.se> for patches -to make the driver work with the older CDU-510/515 series, and -Heiko Eissfeldt <heiko@colossus.escape.de> for pointing out that -the verify_area() checks were ignoring the results of said checks -(note: verify_area() has since been replaced by access_ok()). - -(Acknowledgments from Ron Jeppesen in the 0.3 release:) -Thanks to Corey Minyard who wrote the original CDU-31A driver on which -this driver is based. Thanks to Ken Pizzini and Bob Blair who provided -patches and feedback on the first release of this driver. - -Ken Pizzini -ken@halcyon.com - ------------------------------------------------------------------------------- -(The following is from Joel Katz <joelkatz@webchat.org>.) - - To build a version of sony535.o that can be installed as a module, -use the following command: - -gcc -c -D__KERNEL__ -DMODULE -O2 sonycd535.c -o sonycd535.o - - To install the module, simply type: - -insmod sony535.o - or -insmod sony535.o sonycd535=<address> - - And to remove it: - -rmmod sony535 - - The code checks to see if MODULE is defined and behaves as it used -to if MODULE is not defined. 
That means your patched file should behave -exactly as it used to if compiled into the kernel. - - I have an external drive, and I usually leave it powered off. I used -to have to reboot if I needed to use the CDROM drive. Now I don't. - - Even if you have an internal drive, why waste the 96K of memory -(unswappable) that the driver uses if you use your CD-ROM drive infrequently? - - This driver will not install (whether compiled in or loaded as a -module) if the CDROM drive is not available during its initialization. This -means that you can have the driver compiled into the kernel and still load -the module later (assuming the driver doesn't install itself during -power-on). This only wastes 12K when you boot with the CDROM drive off. - - This is what I usually do; I leave the driver compiled into the -kernel, but load it as a module if I powered the system up with the drive -off and then later decided to use the CDROM drive. - - Since the driver only uses a single page to point to the chunks, -attempting to set the buffer cache to more than 2 Megabytes would be very -bad; don't do that. diff --git a/Documentation/connector/cn_test.c b/Documentation/connector/cn_test.c index 3e73231695b3..be7af146dd30 100644 --- a/Documentation/connector/cn_test.c +++ b/Documentation/connector/cn_test.c @@ -124,9 +124,8 @@ static void cn_test_timer_func(unsigned long __data) struct cn_msg *m; char data[32]; - m = kmalloc(sizeof(*m) + sizeof(data), GFP_ATOMIC); + m = kzalloc(sizeof(*m) + sizeof(data), GFP_ATOMIC); if (m) { - memset(m, 0, sizeof(*m) + sizeof(data)); memcpy(&m->id, &cn_test_id, sizeof(m->id)); m->seq = cn_test_timer_counter; diff --git a/Documentation/console/console.txt b/Documentation/console/console.txt index d3e17447321c..877a1b26cc3d 100644 --- a/Documentation/console/console.txt +++ b/Documentation/console/console.txt @@ -29,7 +29,7 @@ In newer kernels, the following are also available: If sysfs is enabled, the contents of /sys/class/vtconsole can be examined. 
This shows the console backends currently registered by the -system which are named vtcon<n> where <n> is an integer fro 0 to 15. Thus: +system which are named vtcon<n> where <n> is an integer from 0 to 15. Thus: ls /sys/class/vtconsole . .. vtcon0 vtcon1 diff --git a/Documentation/driver-model/devres.txt b/Documentation/driver-model/devres.txt index 6c8d8f27db34..8569072fa387 100644 --- a/Documentation/driver-model/devres.txt +++ b/Documentation/driver-model/devres.txt @@ -207,7 +207,7 @@ responsibility. This is usually non-issue because bus ops and resource allocations already do the job. For an example of single-instance devres type, read pcim_iomap_table() -in lib/iomap.c. +in lib/devres.c. All devres interface functions can be called without context if the right gfp mask is given. diff --git a/Documentation/driver-model/platform.txt b/Documentation/driver-model/platform.txt index 19c4a6e13676..2a97320ee17f 100644 --- a/Documentation/driver-model/platform.txt +++ b/Documentation/driver-model/platform.txt @@ -96,6 +96,46 @@ System setup also associates those clocks with the device, so that that calls to clk_get(&pdev->dev, clock_name) return them as needed. +Legacy Drivers: Device Probing +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Some drivers are not fully converted to the driver model, because they take +on a non-driver role: the driver registers its platform device, rather than +leaving that for system infrastructure. Such drivers can't be hotplugged +or coldplugged, since those mechanisms require device creation to be in a +different system component than the driver. + +The only "good" reason for this is to handle older system designs which, like +original IBM PCs, rely on error-prone "probe-the-hardware" models for hardware +configuration. Newer systems have largely abandoned that model, in favor of +bus-level support for dynamic configuration (PCI, USB), or device tables +provided by the boot firmware (e.g. PNPACPI on x86). 
There are too many +conflicting options about what might be where, and even educated guesses by +an operating system will be wrong often enough to make trouble. + +This style of driver is discouraged. If you're updating such a driver, +please try to move the device enumeration to a more appropriate location, +outside the driver. This will usually be cleanup, since such drivers +tend to already have "normal" modes, such as ones using device nodes that +were created by PNP or by platform device setup. + +None the less, there are some APIs to support such legacy drivers. Avoid +using these calls except with such hotplug-deficient drivers. + + struct platform_device *platform_device_alloc( + char *name, unsigned id); + +You can use platform_device_alloc() to dynamically allocate a device, which +you will then initialize with resources and platform_device_register(). +A better solution is usually: + + struct platform_device *platform_device_register_simple( + char *name, unsigned id, + struct resource *res, unsigned nres); + +You can use platform_device_register_simple() as a one-step call to allocate +and register a device. + + Device Naming and Driver Binding ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The platform_device.dev.bus_id is the canonical name for the devices. diff --git a/Documentation/drivers/edac/edac.txt b/Documentation/drivers/edac/edac.txt index 3c5a9e4297b4..a5c36842ecef 100644 --- a/Documentation/drivers/edac/edac.txt +++ b/Documentation/drivers/edac/edac.txt @@ -2,22 +2,42 @@ EDAC - Error Detection And Correction -Written by Doug Thompson <norsk5@xmission.com> +Written by Doug Thompson <dougthompson@xmission.com> 7 Dec 2005 +17 Jul 2007 Updated -EDAC was written by: - Thayne Harbaugh, - modified by Dave Peterson, Doug Thompson, et al, - from the bluesmoke.sourceforge.net project. 
+EDAC is maintained and written by: + Doug Thompson, Dave Jiang, Dave Peterson et al, + original author: Thayne Harbaugh, + +Contact: + website: bluesmoke.sourceforge.net + mailing list: bluesmoke-devel@lists.sourceforge.net + +"bluesmoke" was the name for this device driver when it was "out-of-tree" +and maintained at sourceforge.net. When it was pushed into 2.6.16 for the +first time, it was renamed to 'EDAC'. + +The bluesmoke project at sourceforge.net is now utilized as a 'staging area' +for EDAC development, before it is sent upstream to kernel.org. + +At the bluesmoke/EDAC project site there is a series of quilt patches against +recent kernels, stored in an SVN repository. For easier downloading, there +is also a tarball snapshot available. ============================================================================ EDAC PURPOSE The 'edac' kernel module goal is to detect and report errors that occur -within the computer system. In the initial release, memory Correctable Errors -(CE) and Uncorrectable Errors (UE) are the primary errors being harvested. +within the computer system running under Linux. + +MEMORY + +In the initial release, memory Correctable Errors (CE) and Uncorrectable +Errors (UE) are the primary errors being harvested. These types of errors +are harvested by the 'edac_mc' class of device. Detecting CE events, then harvesting those events and reporting them, CAN be a predictor of future UE events. With CE events, the system can @@ -25,9 +45,27 @@ continue to operate, but with less safety. Preventive maintenance and proactive part replacement of memory DIMMs exhibiting CEs can reduce the likelihood of the dreaded UE events and system 'panics'. +NON-MEMORY + +A new feature for EDAC, the edac_device class of device, was added in +the 2.6.23 version of the kernel. + +This new device type allows for non-memory types of ECC hardware detectors +to have their states harvested and presented to userspace via the sysfs +interface.
+ +Some architectures have ECC detectors for L1, L2 and L3 caches, along with DMA +engines, fabric switches, main data path switches, interconnections, +and various other hardware data paths. If the hardware reports it, then +an edac_device device probably can be constructed to harvest and present +that to userspace. + + +PCI BUS SCANNING In addition, PCI Bus Parity and SERR Errors are scanned for on PCI devices in order to determine if errors are occurring on data transfers. + The presence of PCI Parity errors must be examined with a grain of salt. There are several add-in adapters that do NOT follow the PCI specification with regards to Parity generation and reporting. The specification says @@ -35,11 +73,17 @@ the vendor should tie the parity status bits to 0 if they do not intend to generate parity. Some vendors do not do this, and thus the parity bit can "float" giving false positives. -[There are patches in the kernel queue which will allow for storage of -quirks of PCI devices reporting false parity positives. The 2.6.18 -kernel should have those patches included. When that becomes available, -then EDAC will be patched to utilize that information to "skip" such -devices.] +In the kernel there is a PCI device attribute located in sysfs that is +checked by the EDAC PCI scanning code. If that attribute is set, +PCI parity/error scanning is skipped for that device. The attribute +is: + + broken_parity_status + +and is located in /sys/devices/pci<XXX>/0000:XX:YY.Z directories for +PCI devices. + +FUTURE HARDWARE SCANNING EDAC will have future error detectors that will be integrated with EDAC or added to it, in the following list: @@ -57,13 +101,14 @@ and the like. ============================================================================ EDAC VERSIONING -EDAC is composed of a "core" module (edac_mc.ko) and several Memory +EDAC is composed of a "core" module (edac_core.ko) and several Memory Controller (MC) driver modules.
On a given system, the CORE is loaded and one MC driver will be loaded. Both the CORE and -the MC driver have individual versions that reflect current release -level of their respective modules. Thus, to "report" on what version -a system is running, one must report both the CORE's and the -MC driver's versions. +the MC driver (or edac_device driver) have individual versions that reflect +current release level of their respective modules. + +Thus, to "report" on what version a system is running, one must report both +the CORE's and the MC driver's versions. LOADING @@ -88,8 +133,9 @@ EDAC sysfs INTERFACE EDAC presents a 'sysfs' interface for control, reporting and attribute reporting purposes. -EDAC lives in the /sys/devices/system/edac directory. Within this directory -there currently reside 2 'edac' components: +EDAC lives in the /sys/devices/system/edac directory. + +Within this directory there currently reside 2 'edac' components: mc memory controller(s) system pci PCI control and status system @@ -188,7 +234,7 @@ In directory 'mc' are EDAC system overall control and attribute files: Panic on UE control file: - 'panic_on_ue' + 'edac_mc_panic_on_ue' An uncorrectable error will cause a machine panic. This is usually desirable. It is a bad idea to continue when an uncorrectable error @@ -199,12 +245,12 @@ Panic on UE control file: LOAD TIME: module/kernel parameter: panic_on_ue=[0|1] - RUN TIME: echo "1" >/sys/devices/system/edac/mc/panic_on_ue + RUN TIME: echo "1" >/sys/devices/system/edac/mc/edac_mc_panic_on_ue Log UE control file: - 'log_ue' + 'edac_mc_log_ue' Generate kernel messages describing uncorrectable errors. These errors are reported through the system message log system. 
UE statistics @@ -212,12 +258,12 @@ Log UE control file: LOAD TIME: module/kernel parameter: log_ue=[0|1] - RUN TIME: echo "1" >/sys/devices/system/edac/mc/log_ue + RUN TIME: echo "1" >/sys/devices/system/edac/mc/edac_mc_log_ue Log CE control file: - 'log_ce' + 'edac_mc_log_ce' Generate kernel messages describing correctable errors. These errors are reported through the system message log system. @@ -225,12 +271,12 @@ Log CE control file: LOAD TIME: module/kernel parameter: log_ce=[0|1] - RUN TIME: echo "1" >/sys/devices/system/edac/mc/log_ce + RUN TIME: echo "1" >/sys/devices/system/edac/mc/edac_mc_log_ce Polling period control file: - 'poll_msec' + 'edac_mc_poll_msec' The time period, in milliseconds, for polling for error information. Too small a value wastes resources. Too large a value might delay @@ -241,7 +287,7 @@ Polling period control file: LOAD TIME: module/kernel parameter: poll_msec=[0|1] - RUN TIME: echo "1000" >/sys/devices/system/edac/mc/poll_msec + RUN TIME: echo "1000" >/sys/devices/system/edac/mc/edac_mc_poll_msec ============================================================================ @@ -587,3 +633,95 @@ Parity Count: ======================================================================= + + +EDAC_DEVICE type of device + +In the header file, edac_core.h, there is a series of edac_device structures +and APIs for the EDAC_DEVICE. + +User space access to an edac_device is through the sysfs interface. + +At the location /sys/devices/system/edac (sysfs) new edac_device devices will +appear. + +There is a three-level tree beneath the above 'edac' directory. For example, +the 'test_device_edac' device (found at the bluesmoke.sourceforge.net website) +installs itself as: + + /sys/devices/system/edac/test-instance + +In this directory are various controls, a symlink and one or more 'instance' +directories.
+ +The standard default controls are: + + log_ce boolean to log CE events + log_ue boolean to log UE events + panic_on_ue boolean to 'panic' the system if a UE is encountered + (default off, can be set true via startup script) + poll_msec time period between POLL cycles for events + +The test_device_edac device adds at least one custom control of its own: + + test_bits which in the current test driver does nothing but + show how it is installed. A ported driver can + add one or more such controls and/or attributes + for specific uses. + One out-of-tree driver uses controls here to allow + for ERROR INJECTION operations to hardware + injection registers + +The symlink points to the 'struct dev' that is registered for this edac_device. + +INSTANCES + +One or more instance directories are present. For the 'test_device_edac' case: + + test-instance0 + + +In this directory there are two default counter attributes, which are totals of +the counters in deeper subdirectories. + + ce_count total of CE events of subdirectories + ue_count total of UE events of subdirectories + +BLOCKS + +At the lowest directory level is the 'block' directory. There can be 0, 1 +or more blocks specified in each instance. + + test-block0 + + +In this directory the default attributes are: + + ce_count which is the counter of CE events for this 'block' + of hardware being monitored + ue_count which is the counter of UE events for this 'block' + of hardware being monitored + + +The 'test_device_edac' device adds 4 attributes and 1 control: + + test-block-bits-0 for every POLL cycle this counter + is incremented + test-block-bits-1 every 10 cycles, this counter is bumped once, + and test-block-bits-0 is set to 0 + test-block-bits-2 every 100 cycles, this counter is bumped once, + and test-block-bits-1 is set to 0 + test-block-bits-3 every 1000 cycles, this counter is bumped once, + and test-block-bits-2 is set to 0 + + + reset-counters writing ANYTHING to this control will + reset all the above counters.
+ +Use of the 'test_device_edac' driver should enable others to create their own +unique drivers for their hardware systems. + +The 'test_device_edac' sample driver is located at the +bluesmoke.sourceforge.net project site for EDAC. + diff --git a/Documentation/dvb/bt8xx.txt b/Documentation/dvb/bt8xx.txt index 4e7614e606c5..ecb47adda063 100644 --- a/Documentation/dvb/bt8xx.txt +++ b/Documentation/dvb/bt8xx.txt @@ -9,19 +9,29 @@ for accessing the i2c bus and the gpio pins of the bt8xx chipset. Please see Documentation/dvb/cards.txt => o Cards based on the Conexant Bt8xx PCI bridge: Compiling kernel please enable: -a.)"Device drivers" => "Multimedia devices" => "Video For Linux" => "BT848 Video For Linux" -b.)"Device drivers" => "Multimedia devices" => "Digital Video Broadcasting Devices" - => "DVB for Linux" "DVB Core Support" "Bt8xx based PCI Cards" +a.)"Device drivers" => "Multimedia devices" => "Video For Linux" => "Enable Video for Linux API 1 (DEPRECATED)" +b.)"Device drivers" => "Multimedia devices" => "Video For Linux" => "Video Capture Adapters" => "BT848 Video For Linux" +c.)"Device drivers" => "Multimedia devices" => "Digital Video Broadcasting Devices" => "DVB for Linux" "DVB Core Support" "Bt8xx based PCI Cards" -2) Loading Modules -================== +Please use the following options with care as deselection of drivers which are in fact necessary +may result in DVB devices that cannot be tuned due to lack of driver support: +You can save RAM by deselecting every frontend module that your DVB card does not need. + +First please remove the static dependency of DVB card drivers on all frontend modules for all possible card variants by enabling: +d.) "Device drivers" => "Multimedia devices" => "Digital Video Broadcasting Devices" + => "DVB for Linux" "DVB Core Support" "Load and attach frontend modules as needed" -In default cases bttv is loaded automatically.
-To load the backend either place dvb-bt8xx in etc/modules, or apply manually: +If you know the frontend driver that your card needs please enable: +e.)"Device drivers" => "Multimedia devices" => "Digital Video Broadcasting Devices" + => "DVB for Linux" "DVB Core Support" "Customise DVB Frontends" => "Customise the frontend modules to build" + Then please select your card-specific frontend module. - $ modprobe dvb-bt8xx +2) Loading Modules +================== -All frontends will be loaded automatically. +Regular case: If the bttv driver detects a bt8xx-based DVB card, all frontend and backend modules will be loaded automatically. +Exceptions are: +- Old TwinHan DST cards or clones with or without CA slot and not containing an Eeprom. People running udev please see Documentation/dvb/udev.txt. In the following cases overriding the PCI type detection for dvb-bt8xx might be necessary: @@ -30,7 +40,6 @@ In the following cases overriding the PCI type detection for dvb-bt8xx might be ------------------------------ $ modprobe bttv card=113 - $ modprobe dvb-bt8xx $ modprobe dst Useful parameters for verbosity level and debugging the dst module: @@ -65,10 +74,9 @@ DViCO FusionHDTV 5 Lite: 135 Notice: The order of the card ID should be uprising: Example: $ modprobe bttv card=113 card=135 - $ modprobe dvb-bt8xx For a full list of card ID's please see Documentation/video4linux/CARDLIST.bttv. -In case of further problems send questions to the mailing list: www.linuxdvb.org. +In case of further problems please subscribe and send questions to the mailing list: linux-dvb@linuxtv.org. 
 Authors: Richard Walker, Jamie Honan,
diff --git a/Documentation/dvb/get_dvb_firmware b/Documentation/dvb/get_dvb_firmware
index 4820366b6ae8..b4d306ae9234 100644
--- a/Documentation/dvb/get_dvb_firmware
+++ b/Documentation/dvb/get_dvb_firmware
@@ -24,7 +24,8 @@ use IO::Handle;
 @components = ( "sp8870", "sp887x", "tda10045", "tda10046",
 		"tda10046lifeview", "av7110", "dec2000t", "dec2540t",
 		"dec3000s", "vp7041", "dibusb", "nxt2002", "nxt2004",
-		"or51211", "or51132_qam", "or51132_vsb", "bluebird");
+		"or51211", "or51132_qam", "or51132_vsb", "bluebird",
+		"opera1");
 
 # Check args
 syntax() if (scalar(@ARGV) != 1);
@@ -56,7 +57,7 @@ syntax();
 
 sub sp8870 {
     my $sourcefile = "tt_Premium_217g.zip";
-    my $url = "http://www.technotrend.de/new/217g/$sourcefile";
+    my $url = "http://www.softwarepatch.pl/9999ccd06a4813cb827dbb0005071c71/$sourcefile";
     my $hash = "53970ec17a538945a6d8cb608a7b3899";
     my $outfile = "dvb-fe-sp8870.fw";
     my $tmpdir = tempdir(DIR => "/tmp", CLEANUP => 1);
@@ -210,6 +211,45 @@ sub dec3000s {
     $outfile;
 }
 
+sub opera1{
+	my $tmpdir = tempdir(DIR => "/tmp", CLEANUP => 0);
+
+	checkstandard();
+	my $fwfile1="dvb-usb-opera1-fpga-01.fw";
+	my $fwfile2="dvb-usb-opera-01.fw";
+	extract("2830SCap2.sys", 0x62e8, 55024, "$tmpdir/opera1-fpga.fw");
+	extract("2830SLoad2.sys",0x3178,0x3685-0x3178,"$tmpdir/fw1part1");
+	extract("2830SLoad2.sys",0x0980,0x3150-0x0980,"$tmpdir/fw1part2");
+	delzero("$tmpdir/fw1part1","$tmpdir/fw1part1-1");
+	delzero("$tmpdir/fw1part2","$tmpdir/fw1part2-1");
+	verify("$tmpdir/fw1part1-1","5e0909858fdf0b5b09ad48b9fe622e70");
+	verify("$tmpdir/fw1part2-1","d6e146f321427e931df2c6fcadac37a1");
+	verify("$tmpdir/opera1-fpga.fw","0f8133f5e9051f5f3c1928f7e5a1b07d");
+
+	my $RES1="\x01\x92\x7f\x00\x01\x00";
+	my $RES0="\x01\x92\x7f\x00\x00\x00";
+	my $DAT1="\x01\x00\xe6\x00\x01\x00";
+	my $DAT0="\x01\x00\xe6\x00\x00\x00";
+	open FW,">$tmpdir/opera.fw";
+	print FW "$RES1";
+	print FW "$DAT1";
+	print FW "$RES1";
+	print FW "$DAT1";
+
	appendfile(FW,"$tmpdir/fw1part1-1");
+	print FW "$RES0";
+	print FW "$DAT0";
+	print FW "$RES1";
+	print FW "$DAT1";
+	appendfile(FW,"$tmpdir/fw1part2-1");
+	print FW "$RES1";
+	print FW "$DAT1";
+	print FW "$RES0";
+	print FW "$DAT0";
+	copy ("$tmpdir/opera1-fpga.fw",$fwfile1);
+	copy ("$tmpdir/opera.fw",$fwfile2);
+
+	$fwfile1.",".$fwfile2;
+}
 
 sub vp7041 {
     my $sourcefile = "2.422.zip";
@@ -440,6 +480,25 @@ sub appendfile {
     close(INFILE);
 }
 
+sub delzero{
+	my ($infile,$outfile) =@_;
+
+	open INFILE,"<$infile";
+	open OUTFILE,">$outfile";
+	while (1){
+		$rcount=sysread(INFILE,$buf,22);
+		$len=ord(substr($buf,0,1));
+		print OUTFILE substr($buf,0,1);
+		print OUTFILE substr($buf,2,$len+3);
+		last if ($rcount<1);
+		printf OUTFILE "%c",0;
+#print $len." ".length($buf)."\n";
+
+	}
+	close(INFILE);
+	close(OUTFILE);
+}
+
 sub syntax() {
     print STDERR "syntax: get_dvb_firmware <component>\n";
     print STDERR "Supported components:\n";
diff --git a/Documentation/dvb/opera-firmware.txt b/Documentation/dvb/opera-firmware.txt
new file mode 100644
index 000000000000..93e784c2607b
--- /dev/null
+++ b/Documentation/dvb/opera-firmware.txt
@@ -0,0 +1,27 @@
+To extract the firmware for the Opera DVB-S1 USB-Box
+you need to copy the files:
+
+2830SCap2.sys
+2830SLoad2.sys
+
+from the windriver disk into this directory.
+
+Then run
+
+./get_dvb_firware opera1
+
+and after that you have 2 files:
+
+dvb-usb-opera-01.fw
+dvb-usb-opera1-fpga-01.fw
+
+in here.
+
+Copy them into /lib/firmware/ .
+
+After that the driver can load the firmware
+(if you have enabled firmware loading
+in kernel config and have hotplug running).
+
+
+Marco Gittler <g.marco@freenet.de>
\ No newline at end of file
diff --git a/Documentation/fault-injection/failcmd.sh b/Documentation/fault-injection/failcmd.sh
deleted file mode 100644
index 63177aba8106..000000000000
--- a/Documentation/fault-injection/failcmd.sh
+++ /dev/null
@@ -1,4 +0,0 @@
-#!/bin/bash
-
-echo 1 > /proc/self/make-it-fail
-exec $*
diff --git a/Documentation/fault-injection/failmodule.sh b/Documentation/fault-injection/failmodule.sh
deleted file mode 100644
index 474a8b971f9c..000000000000
--- a/Documentation/fault-injection/failmodule.sh
+++ /dev/null
@@ -1,31 +0,0 @@
-#!/bin/bash
-#
-# Usage: failmodule <failname> <modulename> [stacktrace-depth]
-#
-#	<failname>: "failslab", "fail_alloc_page", or "fail_make_request"
-#
-#	<modulename>: module name that you want to inject faults.
-#
-#	[stacktrace-depth]: the maximum number of stacktrace walking allowed
-#
-
-STACKTRACE_DEPTH=5
-if [ $# -gt 2 ]; then
-	STACKTRACE_DEPTH=$3
-fi
-
-if [ ! -d /debug/$1 ]; then
-	echo "Fault-injection $1 does not exist" >&2
-	exit 1
-fi
-if [ ! -d /sys/module/$2 ]; then
-	echo "Module $2 does not exist" >&2
-	exit 1
-fi
-
-# Disable any fault injection
-echo 0 > /debug/$1/stacktrace-depth
-
-echo `cat /sys/module/$2/sections/.text` > /debug/$1/require-start
-echo `cat /sys/module/$2/sections/.exit.text` > /debug/$1/require-end
-echo $STACKTRACE_DEPTH > /debug/$1/stacktrace-depth
diff --git a/Documentation/fault-injection/fault-injection.txt b/Documentation/fault-injection/fault-injection.txt
index b7ca560b9340..4bc374a14345 100644
--- a/Documentation/fault-injection/fault-injection.txt
+++ b/Documentation/fault-injection/fault-injection.txt
@@ -103,6 +103,11 @@ configuration of fault-injection capabilities.
 	default is 'N', setting it to 'Y' will inject failures
 	only into non-sleep allocations (GFP_ATOMIC allocations).
 
+- /debug/fail_page_alloc/min-order:
+
+	specifies the minimum page allocation order to be injected
+	failures.
+
 o Boot option
 
 In order to inject faults while debugfs is not available (early boot time),
@@ -156,70 +161,77 @@ o add a hook to insert failures
 Application Examples
 --------------------
 
-o inject slab allocation failures into module init/cleanup code
+o Inject slab allocation failures into module init/exit code
 
 -------------------------------------------------------------------------------
 #!/bin/bash
 
-FAILCMD=Documentation/fault-injection/failcmd.sh
-BLACKLIST="root_plug evbug"
-
-FAILNAME=failslab
-echo Y > /debug/$FAILNAME/task-filter
-echo 10 > /debug/$FAILNAME/probability
-echo 100 > /debug/$FAILNAME/interval
-echo -1 > /debug/$FAILNAME/times
-echo 2 > /debug/$FAILNAME/verbose
-echo 1 > /debug/$FAILNAME/ignore-gfp-wait
+FAILTYPE=failslab
+echo Y > /debug/$FAILTYPE/task-filter
+echo 10 > /debug/$FAILTYPE/probability
+echo 100 > /debug/$FAILTYPE/interval
+echo -1 > /debug/$FAILTYPE/times
+echo 0 > /debug/$FAILTYPE/space
+echo 2 > /debug/$FAILTYPE/verbose
+echo 1 > /debug/$FAILTYPE/ignore-gfp-wait
 
-blacklist()
+faulty_system()
 {
-	echo $BLACKLIST | grep $1 > /dev/null 2>&1
+	bash -c "echo 1 > /proc/self/make-it-fail && exec $*"
 }
 
-oops()
-{
-	dmesg | grep BUG > /dev/null 2>&1
-}
+if [ $# -eq 0 ]
+then
+	echo "Usage: $0 modulename [ modulename ... ]"
+	exit 1
+fi
+
+for m in $*
+do
+	echo inserting $m...
+	faulty_system modprobe $m
 
-find /lib/modules/`uname -r` -name '*.ko' -exec basename {} .ko \; |
-	while read i
-	do
-		oops && exit 1
-
-		if ! blacklist $i
-		then
-			echo inserting $i...
-			bash $FAILCMD modprobe $i
-		fi
-	done
-
-lsmod | awk '{ if ($3 == 0) { print $1 } }' |
-	while read i
-	do
-		oops && exit 1
-
-		if ! blacklist $i
-		then
-			echo removing $i...
-			bash $FAILCMD modprobe -r $i
-		fi
-	done
+	echo removing $m...
+	faulty_system modprobe -r $m
+done
 
 ------------------------------------------------------------------------------
 
-o inject slab allocation failures only for a specific module
+o Inject page allocation failures only for a specific module
 
 -------------------------------------------------------------------------------
 #!/bin/bash
 
-FAILMOD=Documentation/fault-injection/failmodule.sh
+FAILTYPE=fail_page_alloc
+module=$1
 
-echo injecting errors into the module $1...
+if [ -z $module ]
+then
+	echo "Usage: $0 <modulename>"
+	exit 1
+fi
 
-modprobe $1
-bash $FAILMOD failslab $1 10
-echo 25 > /debug/failslab/probability
+modprobe $module
 
-------------------------------------------------------------------------------
+if [ ! -d /sys/module/$module/sections ]
+then
+	echo Module $module is not loaded
+	exit 1
+fi
+
+cat /sys/module/$module/sections/.text > /debug/$FAILTYPE/require-start
+cat /sys/module/$module/sections/.data > /debug/$FAILTYPE/require-end
+
+echo N > /debug/$FAILTYPE/task-filter
+echo 10 > /debug/$FAILTYPE/probability
+echo 100 > /debug/$FAILTYPE/interval
+echo -1 > /debug/$FAILTYPE/times
+echo 0 > /debug/$FAILTYPE/space
+echo 2 > /debug/$FAILTYPE/verbose
+echo 1 > /debug/$FAILTYPE/ignore-gfp-wait
+echo 1 > /debug/$FAILTYPE/ignore-gfp-highmem
+echo 10 > /debug/$FAILTYPE/stacktrace-depth
+
+trap "echo 0 > /debug/$FAILTYPE/probability" SIGINT SIGTERM EXIT
+
+echo "Injecting errors into the module $module... 
(interrupt to stop)"
+sleep 1000000
diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt
index 769862197ac8..c175eedadb5f 100644
--- a/Documentation/feature-removal-schedule.txt
+++ b/Documentation/feature-removal-schedule.txt
@@ -26,9 +26,7 @@ Who: Hans Verkuil <hverkuil@xs4all.nl> and
 
 ---------------------------
 
-What:	/sys/devices/.../power/state
-	dev->power.power_state
-	dpm_runtime_{suspend,resume)()
+What:	dev->power.power_state
 When:	July 2007
 Why:	Broken design for runtime control over driver power states, confusing
 	driver-internal runtime power management with: mechanisms to support
@@ -41,24 +39,6 @@ Who: Pavel Machek <pavel@suse.cz>
 
 ---------------------------
 
-What:	RAW driver (CONFIG_RAW_DRIVER)
-When:	December 2005
-Why:	declared obsolete since kernel 2.6.3
-	O_DIRECT can be used instead
-Who:	Adrian Bunk <bunk@stusta.de>
-
----------------------------
-
-What:	raw1394: requests of type RAW1394_REQ_ISO_SEND, RAW1394_REQ_ISO_LISTEN
-When:	June 2007
-Why:	Deprecated in favour of the more efficient and robust rawiso interface.
-	Affected are applications which use the deprecated part of libraw1394
-	(raw1394_iso_write, raw1394_start_iso_write, raw1394_start_iso_rcv,
-	raw1394_stop_iso_rcv) or bypass libraw1394.
-Who:	Dan Dennedy <dan@dennedy.org>, Stefan Richter <stefanr@s5r6.in-berlin.de>
-
----------------------------
-
 What:	old NCR53C9x driver
 When:	October 2007
 Why:	Replaced by the much better esp_scsi driver. Actual low-level
@@ -71,6 +51,7 @@ Who: David Miller <davem@davemloft.net>
 What:	Video4Linux API 1 ioctls and video_decoder.h from Video devices.
 When:	December 2006
 Files:	include/linux/video_decoder.h
+Check:	include/linux/video_decoder.h
 Why:	V4L1 AP1 was replaced by V4L2 API. during migration from 2.4 to 2.6
 	series. The old API have lots of drawbacks and don't provide enough
 	means to work with all video and audio standards.
The newer API is
@@ -104,6 +85,7 @@ Who: Dominik Brodowski <linux@brodo.de>
 What:	remove EXPORT_SYMBOL(kernel_thread)
 When:	August 2006
 Files:	arch/*/kernel/*_ksyms.c
+Check:	kernel_thread
 Why:	kernel_thread is a low-level implementation detail. Drivers should
 	use the <linux/kthread.h> API instead which shields them from
 	implementation details and provides a higherlevel interface that
@@ -128,13 +110,6 @@ Who: Adrian Bunk <bunk@stusta.de>
 
 ---------------------------
 
-What:	drivers depending on OSS_OBSOLETE_DRIVER
-When:	options in 2.6.20, code in 2.6.22
-Why:	OSS drivers with ALSA replacements
-Who:	Adrian Bunk <bunk@stusta.de>
-
----------------------------
-
 What:	Unused EXPORT_SYMBOL/EXPORT_SYMBOL_GPL exports
 	(temporary transition config option provided until then)
 	The transition config option will also be removed at the same time.
@@ -161,6 +136,15 @@ Who: Greg Kroah-Hartman <gregkh@suse.de>
 
 ---------------------------
 
+What:	vm_ops.nopage
+When:	Soon, provided in-kernel callers have been converted
+Why:	This interface is replaced by vm_ops.fault, but it has been around
+	forever, is used by a lot of drivers, and doesn't cost much to
+	maintain.
+Who:	Nick Piggin <npiggin@suse.de>
+
+---------------------------
+
 What:	Interrupt only SA_* flags
 When:	September 2007
 Why:	The interrupt related SA_* flags are replaced by IRQF_* to move them
@@ -180,15 +164,6 @@ Who: Kay Sievers <kay.sievers@suse.de>
 
 ---------------------------
 
-What:	i2c-isa
-When:	December 2006
-Why:	i2c-isa is a non-sense and doesn't fit in the device driver
-	model. Drivers relying on it are better implemented as platform
-	drivers.
-Who:	Jean Delvare <khali@linux-fr.org>
-
----------------------------
-
 What:	i2c_adapter.list
 When:	July 2007
 Why:	Superfluous, this list duplicates the one maintained by the driver
@@ -205,28 +180,6 @@ Who: Adrian Bunk <bunk@stusta.de>
 
 ---------------------------
 
-What:	ACPI hooks (X86_SPEEDSTEP_CENTRINO_ACPI) in speedstep-centrino driver
-When:	December 2006
-Why:	Speedstep-centrino driver with ACPI hooks and acpi-cpufreq driver are
-	functionally very much similar. They talk to ACPI in same way. Only
-	difference between them is the way they do frequency transitions.
-	One uses MSRs and the other one uses IO ports. Functionaliy of
-	speedstep_centrino with ACPI hooks is now merged into acpi-cpufreq.
-	That means one common driver will support all Intel Enhanced Speedstep
-	capable CPUs. That means less confusion over name of
-	speedstep-centrino driver (with that driver supposed to be used on
-	non-centrino platforms). That means less duplication of code and
-	less maintenance effort and no possibility of these two drivers
-	going out of sync.
-	Current users of speedstep_centrino with ACPI hooks are requested to
-	switch over to acpi-cpufreq driver. speedstep-centrino will continue
-	to work using older non-ACPI static table based scheme even after this
-	date.
-
-Who:	Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
-
----------------------------
-
 What:	ACPI procfs interface
 When:	July 2008
 Why:	ACPI sysfs conversion should be finished by January 2008.
@@ -244,14 +197,6 @@ Who: Len Brown <len.brown@intel.com>
 
 ---------------------------
 
-What:	sk98lin network driver
-When:	July 2007
-Why:	In kernel tree version of driver is unmaintained. Sk98lin driver
-	replaced by the skge driver.
-Who:	Stephen Hemminger <shemminger@osdl.org>
-
----------------------------
-
 What:	Compaq touchscreen device emulation
 When:	Oct 2007
 Files:	drivers/input/tsdev.c
@@ -266,25 +211,6 @@ Who: Richard Purdie <rpurdie@rpsys.net>
 
 ---------------------------
 
-What:	Multipath cached routing support in ipv4
-When:	in 2.6.23
-Why:	Code was merged, then submitter immediately disappeared leaving
-	us with no maintainer and lots of bugs. The code should not have
-	been merged in the first place, and many aspects of it's
-	implementation are blocking more critical core networking
-	development. It's marked EXPERIMENTAL and no distribution
-	enables it because it cause obscure crashes due to unfixable bugs
-	(interfaces don't return errors so memory allocation can't be
-	handled, calling contexts of these interfaces make handling
-	errors impossible too because they get called after we've
-	totally commited to creating a route object, for example).
-	This problem has existed for years and no forward progress
-	has ever been made, and nobody steps up to try and salvage
-	this code, so we're going to finally just get rid of it.
-Who:	David S. Miller <davem@davemloft.net>
-
----------------------------
-
 What:	read_dev_chars(), read_conf_data{,_lpm}() (s390 common I/O layer)
 When:	December 2007
 Why:	These functions are a leftover from 2.4 times. They have several
@@ -309,6 +235,14 @@ Who: Jean Delvare <khali@linux-fr.org>
 
 ---------------------------
 
+What:	'time' kernel boot parameter
+When:	January 2008
+Why:	replaced by 'printk.time=<value>' so that printk timestamps can be
+	enabled or disabled as needed
+Who:	Randy Dunlap <randy.dunlap@oracle.com>
+
+---------------------------
+
 What:	drivers depending on OSS_OBSOLETE
 When:	options in 2.6.23, code in 2.6.25
 Why:	obsolete OSS drivers
@@ -334,3 +268,41 @@ Who: Tejun Heo <htejun@gmail.com>
 
 ---------------------------
 
+What:	Legacy RTC drivers (under drivers/i2c/chips)
+When:	November 2007
+Why:	Obsolete.
We have a RTC subsystem with better drivers.
+Who:	Jean Delvare <khali@linux-fr.org>
+
+---------------------------
+
+What:	iptables SAME target
+When:	1.1. 2008
+Files:	net/ipv4/netfilter/ipt_SAME.c, include/linux/netfilter_ipv4/ipt_SAME.h
+Why:	Obsolete for multiple years now, NAT core provides the same behaviour.
+	Unfixable broken wrt. 32/64 bit cleanness.
+Who:	Patrick McHardy <kaber@trash.net>
+
+---------------------------
+
+What:	The arch/ppc and include/asm-ppc directories
+When:	Jun 2008
+Why:	The arch/powerpc tree is the merged architecture for ppc32 and ppc64
+	platforms. Currently there are efforts underway to port the remaining
+	arch/ppc platforms to the merged tree. New submissions to the arch/ppc
+	tree have been frozen with the 2.6.22 kernel release and that tree will
+	remain in bug-fix only mode until its scheduled removal. Platforms
+	that are not ported by June 2008 will be removed due to the lack of an
+	interested maintainer.
+Who:	linuxppc-dev@ozlabs.org
+
+---------------------------
+
+What:	mthca driver's MSI support
+When:	January 2008
+Files:	drivers/infiniband/hw/mthca/*.[ch]
+Why:	All mthca hardware also supports MSI-X, which provides
+	strictly more functionality than MSI. So there is no point in
+	having both MSI-X and MSI support in the driver.
+Who:	Roland Dreier <rolandd@cisco.com>
+
+---------------------------
diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking
index d866551be037..f0f825808ca4 100644
--- a/Documentation/filesystems/Locking
+++ b/Documentation/filesystems/Locking
@@ -510,13 +510,24 @@ More details about quota locking can be found in fs/dquot.c.
 prototypes:
 	void (*open)(struct vm_area_struct*);
 	void (*close)(struct vm_area_struct*);
+	int (*fault)(struct vm_area_struct*, struct vm_fault *);
 	struct page *(*nopage)(struct vm_area_struct*, unsigned long, int *);
+	int (*page_mkwrite)(struct vm_area_struct *, struct page *);
 
 locking rules:
-		BKL	mmap_sem
+		BKL	mmap_sem	PageLocked(page)
 open:		no	yes
 close:		no	yes
+fault:		no	yes
 nopage:		no	yes
+page_mkwrite:	no	yes		no
+
+	->page_mkwrite() is called when a previously read-only page is
+about to become writeable. The file system is responsible for
+protecting against truncate races. Once appropriate action has been
+taking to lock out truncate, the page range should be verified to be
+within i_size. The page mapping should also be checked that it is not
+NULL.
 
 ================================================================================
 			Dubious stuff
diff --git a/Documentation/filesystems/configfs/configfs.txt b/Documentation/filesystems/configfs/configfs.txt
index b34cdb50eab4..d1b98257d000 100644
--- a/Documentation/filesystems/configfs/configfs.txt
+++ b/Documentation/filesystems/configfs/configfs.txt
@@ -238,6 +238,8 @@ config_item_type.
 		struct config_group *(*make_group)(struct config_group *group,
 						   const char *name);
 		int (*commit_item)(struct config_item *item);
+		void (*disconnect_notify)(struct config_group *group,
+					  struct config_item *item);
 		void (*drop_item)(struct config_group *group,
 				  struct config_item *item);
 	};
@@ -268,6 +270,16 @@ the item in other threads, the memory is safe. It may take some time
 for the item to actually disappear from the subsystem's usage. But it
 is gone from configfs.
 
+When drop_item() is called, the item's linkage has already been torn
+down. It no longer has a reference on its parent and has no place in
+the item hierarchy. If a client needs to do some cleanup before this
+teardown happens, the subsystem can implement the
+ct_group_ops->disconnect_notify() method.
The method is called after
+configfs has removed the item from the filesystem view but before the
+item is removed from its parent group. Like drop_item(),
+disconnect_notify() is void and cannot fail. Client subsystems should
+not drop any references here, as they still must do it in drop_item().
+
 A config_group cannot be removed while it still has child items. This
 is implemented in the configfs rmdir(2) code. ->drop_item() will not
 be called, as the item has not been dropped. rmdir(2) will fail, as the
@@ -280,18 +292,18 @@ tells configfs to make the subsystem appear in the file tree.
 
 	struct configfs_subsystem {
 		struct config_group su_group;
-		struct semaphore su_sem;
+		struct mutex su_mutex;
 	};
 
 	int configfs_register_subsystem(struct configfs_subsystem *subsys);
 	void configfs_unregister_subsystem(struct configfs_subsystem *subsys);
 
-	A subsystem consists of a toplevel config_group and a semaphore.
+	A subsystem consists of a toplevel config_group and a mutex.
 The group is where child config_items are created. For a subsystem,
 this group is usually defined statically. Before calling
 configfs_register_subsystem(), the subsystem must have initialized the
 group via the usual group _init() functions, and it must also have
-initialized the semaphore.
+initialized the mutex.
 When the register call returns, the subsystem is live, and it
 will be visible via configfs. At that point, mkdir(2) can be called and
 the subsystem must be ready for it.
@@ -303,7 +315,7 @@ subsystem/group and the simple_child item in configfs_example.c It
 shows a trivial object displaying and storing an attribute, and a simple
 group creating and destroying these children.
 
-[Hierarchy Navigation and the Subsystem Semaphore]
+[Hierarchy Navigation and the Subsystem Mutex]
 
 There is an extra bonus that configfs provides. The config_groups and
 config_items are arranged in a hierarchy due to the fact that they
@@ -314,19 +326,19 @@ and config_item->ci_parent structure members.
 A subsystem can navigate the cg_children list and the ci_parent pointer
 to see the tree created by the subsystem. This can race with configfs'
-management of the hierarchy, so configfs uses the subsystem semaphore to
+management of the hierarchy, so configfs uses the subsystem mutex to
 protect modifications. Whenever a subsystem wants to navigate the
 hierarchy, it must do so under the protection of the subsystem
-semaphore.
+mutex.
 
-A subsystem will be prevented from acquiring the semaphore while a newly
+A subsystem will be prevented from acquiring the mutex while a newly
 allocated item has not been linked into this hierarchy. Similarly, it
-will not be able to acquire the semaphore while a dropping item has not
+will not be able to acquire the mutex while a dropping item has not
 yet been unlinked. This means that an item's ci_parent pointer will
 never be NULL while the item is in configfs, and that an item will only
 be in its parent's cg_children list for the same duration. This allows
 a subsystem to trust ci_parent and cg_children while they hold the
-semaphore.
+mutex.
 
 [Item Aggregation Via symlink(2)]
 
@@ -386,6 +398,33 @@ As a consequence of this, default_groups cannot be removed directly via
 rmdir(2). They also are not considered when rmdir(2) on the parent
 group is checking for children.
 
+[Dependant Subsystems]
+
+Sometimes other drivers depend on particular configfs items. For
+example, ocfs2 mounts depend on a heartbeat region item. If that
+region item is removed with rmdir(2), the ocfs2 mount must BUG or go
+readonly. Not happy.
+
+configfs provides two additional API calls: configfs_depend_item() and
+configfs_undepend_item(). A client driver can call
+configfs_depend_item() on an existing item to tell configfs that it is
+depended on. configfs will then return -EBUSY from rmdir(2) for that
+item. When the item is no longer depended on, the client driver calls
+configfs_undepend_item() on it.
+
+These API cannot be called underneath any configfs callbacks, as
+they will conflict. They can block and allocate. A client driver
+probably shouldn't calling them of its own gumption. Rather it should
+be providing an API that external subsystems call.
+
+How does this work? Imagine the ocfs2 mount process. When it mounts,
+it asks for a heartbeat region item. This is done via a call into the
+heartbeat code. Inside the heartbeat code, the region item is looked
+up. Here, the heartbeat code calls configfs_depend_item(). If it
+succeeds, then heartbeat knows the region is safe to give to ocfs2.
+If it fails, it was being torn down anyway, and heartbeat can gracefully
+pass up an error.
+
 [Committable Items]
 
 NOTE: Committable items are currently unimplemented.
diff --git a/Documentation/filesystems/configfs/configfs_example.c b/Documentation/filesystems/configfs/configfs_example.c
index 2d6a14a463e0..25151fd5c2c6 100644
--- a/Documentation/filesystems/configfs/configfs_example.c
+++ b/Documentation/filesystems/configfs/configfs_example.c
@@ -277,11 +277,10 @@ static struct config_item *simple_children_make_item(struct config_group *group,
 {
 	struct simple_child *simple_child;
 
-	simple_child = kmalloc(sizeof(struct simple_child), GFP_KERNEL);
+	simple_child = kzalloc(sizeof(struct simple_child), GFP_KERNEL);
 	if (!simple_child)
 		return NULL;
 
-	memset(simple_child, 0, sizeof(struct simple_child));
 
 	config_item_init_type_name(&simple_child->item, name,
 				   &simple_child_type);
@@ -364,12 +363,11 @@ static struct config_group *group_children_make_group(struct config_group *group
 {
 	struct simple_children *simple_children;
 
-	simple_children = kmalloc(sizeof(struct simple_children),
+	simple_children = kzalloc(sizeof(struct simple_children),
 				  GFP_KERNEL);
 	if (!simple_children)
 		return NULL;
 
-	memset(simple_children, 0, sizeof(struct simple_children));
 
 	config_group_init_type_name(&simple_children->group, name,
 				    &simple_children_type);
@@ -453,7 +451,7 @@ static int __init
configfs_example_init(void)
 		subsys = example_subsys[i];
 
 		config_group_init(&subsys->su_group);
-		init_MUTEX(&subsys->su_sem);
+		mutex_init(&subsys->su_mutex);
 		ret = configfs_register_subsystem(subsys);
 		if (ret) {
 			printk(KERN_ERR "Error %d while registering subsystem %s\n",
diff --git a/Documentation/ecryptfs.txt b/Documentation/filesystems/ecryptfs.txt
index 01d8a08351ac..01d8a08351ac 100644
--- a/Documentation/ecryptfs.txt
+++ b/Documentation/filesystems/ecryptfs.txt
diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index 8756a07f4dc3..4a37e25e694c 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -42,6 +42,7 @@ Table of Contents
   2.12 /proc/<pid>/oom_adj - Adjust the oom-killer score
   2.13 /proc/<pid>/oom_score - Display current oom-killer score
   2.14 /proc/<pid>/io - Display the IO accounting fields
+  2.15 /proc/<pid>/coredump_filter - Core dump filtering settings
 
 ------------------------------------------------------------------------------
 Preface
@@ -171,7 +172,9 @@ read the file /proc/PID/status:
 This shows you nearly the same information you would get if you viewed it with
 the ps command. In fact, ps uses the proc file system to obtain its
 information. The statm file contains more detailed information about the
-process memory usage. Its seven fields are explained in Table 1-2.
+process memory usage. Its seven fields are explained in Table 1-2. The stat
+file contains details information about the process itself. Its fields are
+explained in Table 1-3.
 
 
 Table 1-2: Contents of the statm files (as of 2.6.8-rc3)
@@ -188,16 +191,65 @@ Table 1-2: Contents of the statm files (as of 2.6.8-rc3)
  dt number of dirty pages (always 0 on 2.6)
 ..............................................................................
 
+
+Table 1-3: Contents of the stat files (as of 2.6.22-rc3)
+..............................................................................
+ Field          Content
+  pid           process id
+  tcomm         filename of the executable
+  state         state (R is running, S is sleeping, D is sleeping in an
+                uninterruptible wait, Z is zombie, T is traced or stopped)
+  ppid          process id of the parent process
+  pgrp          pgrp of the process
+  sid           session id
+  tty_nr        tty the process uses
+  tty_pgrp      pgrp of the tty
+  flags         task flags
+  min_flt       number of minor faults
+  cmin_flt      number of minor faults with child's
+  maj_flt       number of major faults
+  cmaj_flt      number of major faults with child's
+  utime         user mode jiffies
+  stime         kernel mode jiffies
+  cutime        user mode jiffies with child's
+  cstime        kernel mode jiffies with child's
+  priority      priority level
+  nice          nice level
+  num_threads   number of threads
+  start_time    time the process started after system boot
+  vsize         virtual memory size
+  rss           resident set memory size
+  rsslim        current limit in bytes on the rss
+  start_code    address above which program text can run
+  end_code      address below which program text can run
+  start_stack   address of the start of the stack
+  esp           current value of ESP
+  eip           current value of EIP
+  pending       bitmap of pending signals (obsolete)
+  blocked       bitmap of blocked signals (obsolete)
+  sigign        bitmap of ignored signals (obsolete)
+  sigcatch      bitmap of catched signals (obsolete)
+  wchan         address where process went to sleep
+  0             (place holder)
+  0             (place holder)
+  exit_signal   signal to send to parent thread on exit
+  task_cpu      which CPU the task is scheduled on
+  rt_priority   realtime priority
+  policy        scheduling policy (man sched_setscheduler)
+  blkio_ticks   time spent waiting for block IO
+..............................................................................
+
+
 1.2 Kernel data
 ---------------
 
 Similar to the process entries, the kernel data files give information about
 the running kernel. The files used to obtain this information are contained in
-/proc and are listed in Table 1-3. Not all of these will be present in your
+/proc and are listed in Table 1-4.
Not all of these will be present in your
 system. It depends on the kernel configuration and the loaded modules, which
 files are there, and which are missing.
 
-Table 1-3: Kernel info in /proc
+Table 1-4: Kernel info in /proc
 ..............................................................................
  File        Content
  apm         Advanced power management info
@@ -473,10 +525,10 @@ IDE devices:
 
 More detailed information can be found in the controller specific
 subdirectories. These are named ide0, ide1 and so on. Each of these
-directories contains the files shown in table 1-4.
+directories contains the files shown in table 1-5.
 
 
-Table 1-4: IDE controller info in /proc/ide/ide?
+Table 1-5: IDE controller info in /proc/ide/ide?
 ..............................................................................
  File    Content
  channel IDE channel (0 or 1)
@@ -486,11 +538,11 @@ Table 1-4: IDE controller info in /proc/ide/ide?
 ..............................................................................
 
 Each device connected to a controller has a separate subdirectory in the
-controllers directory. The files listed in table 1-5 are contained in these
+controllers directory. The files listed in table 1-6 are contained in these
 directories.
 
 
-Table 1-5: IDE device information
+Table 1-6: IDE device information
 ..............................................................................
  File             Content
  cache            The cache
@@ -1014,6 +1066,13 @@ check the amount of free space (value is in seconds). Default settings are: 4,
 resume it if we have a value of 3 or more percent; consider information about
 the amount of free space valid for 30 seconds
 
+audit_argv_kb
+-------------
+
+The file contains a single value denoting the limit on the argv array size
+for execve (in KiB). This limit is only applied when system call auditing for
+execve is enabled, otherwise the value is ignored.
+
 ctrl-alt-del
 ------------
 
@@ -1297,6 +1356,21 @@ nr_hugepages configures number of hugetlb page reserved for the system.
 hugetlb_shm_group contains group id that is allowed to create SysV shared
 memory segment using hugetlb page.
 
+hugepages_treat_as_movable
+--------------------------
+
+This parameter is only useful when kernelcore= is specified at boot time to
+create ZONE_MOVABLE for pages that may be reclaimed or migrated. Huge pages
+are not movable so are not normally allocated from ZONE_MOVABLE. A non-zero
+value written to hugepages_treat_as_movable allows huge pages to be allocated
+from ZONE_MOVABLE.
+
+Once enabled, the ZONE_MOVABLE is treated as an area of memory the huge
+pages pool can easily grow or shrink within. Assuming that applications are
+not running that mlock() a lot of memory, it is likely the huge pages pool
+can grow to the size of ZONE_MOVABLE by repeatedly entering the desired value
+into nr_hugepages and triggering page reclaim.
+
 laptop_mode
 -----------
 
@@ -2111,4 +2185,41 @@ those 64-bit counters, process A could see an intermediate result.
 More information about this can be found within the taskstats documentation in
 Documentation/accounting.
 
+2.15 /proc/<pid>/coredump_filter - Core dump filtering settings
+---------------------------------------------------------------
+When a process is dumped, all anonymous memory is written to a core file as
+long as the size of the core file isn't limited. But sometimes we don't want
+to dump some memory segments, for example, huge shared memory. Conversely,
+sometimes we want to save file-backed memory segments into a core file, not
+only the individual files.
+
+/proc/<pid>/coredump_filter allows you to customize which memory segments
+will be dumped when the <pid> process is dumped. coredump_filter is a bitmask
+of memory types. If a bit of the bitmask is set, memory segments of the
+corresponding memory type are dumped, otherwise they are not dumped.
+ +The following 4 memory types are supported: + - (bit 0) anonymous private memory + - (bit 1) anonymous shared memory + - (bit 2) file-backed private memory + - (bit 3) file-backed shared memory + + Note that MMIO pages such as frame buffer are never dumped and vDSO pages + are always dumped regardless of the bitmask status. + +Default value of coredump_filter is 0x3; this means all anonymous memory +segments are dumped. + +If you don't want to dump all shared memory segments attached to pid 1234, +write 1 to the process's proc file. + + $ echo 0x1 > /proc/1234/coredump_filter + +When a new process is created, the process inherits the bitmask status from its +parent. It is useful to set up coredump_filter before the program runs. +For example: + + $ echo 0x7 > /proc/self/coredump_filter + $ ./some_program + ------------------------------------------------------------------------------ diff --git a/Documentation/filesystems/tmpfs.txt b/Documentation/filesystems/tmpfs.txt index 6dd050878a20..145e44086358 100644 --- a/Documentation/filesystems/tmpfs.txt +++ b/Documentation/filesystems/tmpfs.txt @@ -94,10 +94,10 @@ largest node numbers in the range. For example, mpol=bind:0-3,5,7,9-15 Note that trying to mount a tmpfs with an mpol option will fail if the running kernel does not support NUMA; and will fail if its nodelist -specifies a node >= MAX_NUMNODES. If your system relies on that tmpfs -being mounted, but from time to time runs a kernel built without NUMA -capability (perhaps a safe recovery kernel), or configured to support -fewer nodes, then it is advisable to omit the mpol option from automatic +specifies a node which is not online. If your system relies on that +tmpfs being mounted, but from time to time runs a kernel built without +NUMA capability (perhaps a safe recovery kernel), or with fewer nodes +online, then it is advisable to omit the mpol option from automatic mount options. 
It can be added later, when the tmpfs is already mounted on MountPoint, by 'mount -o remount,mpol=Policy:NodeList MountPoint'. @@ -121,4 +121,4 @@ RAM/SWAP in 10240 inodes and it is only accessible by root. Author: Christoph Rohland <cr@sap.com>, 1.12.01 Updated: - Hugh Dickins <hugh@veritas.com>, 19 February 2006 + Hugh Dickins <hugh@veritas.com>, 4 June 2007 diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt index a47cc819f37b..045f3e055a28 100644 --- a/Documentation/filesystems/vfs.txt +++ b/Documentation/filesystems/vfs.txt @@ -3,7 +3,7 @@ Original author: Richard Gooch <rgooch@atnf.csiro.au> - Last updated on October 28, 2005 + Last updated on June 24, 2007. Copyright (C) 1999 Richard Gooch Copyright (C) 2005 Pekka Enberg @@ -107,7 +107,7 @@ file /proc/filesystems. struct file_system_type ----------------------- -This describes the filesystem. As of kernel 2.6.13, the following +This describes the filesystem. As of kernel 2.6.22, the following members are defined: struct file_system_type { @@ -119,6 +119,8 @@ struct file_system_type { struct module *owner; struct file_system_type * next; struct list_head fs_supers; + struct lock_class_key s_lock_key; + struct lock_class_key s_umount_key; }; name: the name of the filesystem type, such as "ext2", "iso9660", @@ -137,11 +139,12 @@ struct file_system_type { next: for internal VFS use: you should initialize this to NULL + s_lock_key, s_umount_key: lockdep-specific + The get_sb() method has the following arguments: - struct super_block *sb: the superblock structure. 
This is partially - initialized by the VFS and the rest must be initialized by the - get_sb() method + struct file_system_type *fs_type: describes the filesystem, partly initialized + by the specific filesystem code int flags: mount flags @@ -150,12 +153,13 @@ The get_sb() method has the following arguments: void *data: arbitrary mount options, usually comes as an ASCII string - int silent: whether or not to be silent on error + struct vfsmount *mnt: a vfs-internal representation of a mount point The get_sb() method must determine if the block device specified -in the superblock contains a filesystem of the type the method -supports. On success the method returns the superblock pointer, on -failure it returns NULL. +in the dev_name and fs_type contains a filesystem of the type the method +supports. If it succeeds in opening the named block device, it initializes a +struct super_block descriptor for the filesystem contained by the block device. +On failure it returns an error. The most interesting member of the superblock structure that the get_sb() method fills in is the "s_op" field. This is a pointer to @@ -193,7 +197,7 @@ struct super_operations ----------------------- This describes how the VFS can manipulate the superblock of your -filesystem. As of kernel 2.6.13, the following members are defined: +filesystem. As of kernel 2.6.22, the following members are defined: struct super_operations { struct inode *(*alloc_inode)(struct super_block *sb); @@ -216,8 +220,6 @@ struct super_operations { void (*clear_inode) (struct inode *); void (*umount_begin) (struct super_block *); - void (*sync_inodes) (struct super_block *sb, - struct writeback_control *wbc); int (*show_options)(struct seq_file *, struct vfsmount *); ssize_t (*quota_read)(struct super_block *, int, char *, size_t, loff_t); @@ -300,9 +302,6 @@ or bottom half). umount_begin: called when the VFS is unmounting a filesystem.
- sync_inodes: called when the VFS is writing out dirty data associated with - a superblock. - show_options: called by the VFS to show mount options for /proc/<pid>/mounts. quota_read: called by the VFS to read from filesystem quota file. @@ -324,7 +323,7 @@ struct inode_operations ----------------------- This describes how the VFS can manipulate an inode in your -filesystem. As of kernel 2.6.13, the following members are defined: +filesystem. As of kernel 2.6.22, the following members are defined: struct inode_operations { int (*create) (struct inode *,struct dentry *,int, struct nameidata *); @@ -348,6 +347,7 @@ struct inode_operations { ssize_t (*getxattr) (struct dentry *, const char *, void *, size_t); ssize_t (*listxattr) (struct dentry *, char *, size_t); int (*removexattr) (struct dentry *, const char *); + void (*truncate_range)(struct inode *, loff_t, loff_t); }; Again, all methods are called without any locks being held, unless @@ -444,6 +444,9 @@ otherwise noted. removexattr: called by the VFS to remove an extended attribute from a file. This method is called by removexattr(2) system call. + truncate_range: a method provided by the underlying filesystem to truncate a + range of blocks, i.e. punch a hole somewhere in a file. + The Address Space Object ======================== @@ -522,7 +525,7 @@ struct address_space_operations ------------------------------- This describes how the VFS can manipulate mapping of a file to page cache in -your filesystem.
As of kernel 2.6.22, the following members are defined: struct address_space_operations { int (*writepage)(struct page *page, struct writeback_control *wbc); @@ -543,6 +546,7 @@ struct address_space_operations { int); /* migrate the contents of a page to the specified target */ int (*migratepage) (struct page *, struct page *); + int (*launder_page) (struct page *); }; writepage: called by the VM to write a dirty page to backing store. @@ -689,6 +693,10 @@ struct address_space_operations { transfer any private data across and update any references that it has to the page. + launder_page: Called before freeing a page - it writes back the dirty page. To + prevent redirtying the page, it is kept locked during the whole + operation. + The File Object =============== @@ -699,9 +707,10 @@ struct file_operations ---------------------- This describes how the VFS can manipulate an open file. As of kernel -2.6.17, the following members are defined: +2.6.22, the following members are defined: struct file_operations { + struct module *owner; loff_t (*llseek) (struct file *, loff_t, int); ssize_t (*read) (struct file *, char __user *, size_t, loff_t *); ssize_t (*write) (struct file *, const char __user *, size_t, loff_t *); @@ -728,10 +737,8 @@ struct file_operations { int (*check_flags)(int); int (*dir_notify)(struct file *filp, unsigned long arg); int (*flock) (struct file *, int, struct file_lock *); - ssize_t (*splice_write)(struct pipe_inode_info *, struct file *, size_t, unsigned -int); - ssize_t (*splice_read)(struct file *, struct pipe_inode_info *, size_t, unsigned -int); + ssize_t (*splice_write)(struct pipe_inode_info *, struct file *, size_t, unsigned int); + ssize_t (*splice_read)(struct file *, struct pipe_inode_info *, size_t, unsigned int); }; Again, all methods are called without any locks being held, unless diff --git a/Documentation/firmware_class/README b/Documentation/firmware_class/README index e9cc8bb26f7d..c3480aa66ba8 100644 --- 
a/Documentation/firmware_class/README +++ b/Documentation/firmware_class/README @@ -1,7 +1,7 @@ request_firmware() hotplug interface: ------------------------------------ - Copyright (C) 2003 Manuel Estrada Sainz <ranty@debian.org> + Copyright (C) 2003 Manuel Estrada Sainz Why: --- diff --git a/Documentation/firmware_class/firmware_sample_driver.c b/Documentation/firmware_class/firmware_sample_driver.c index 87feccdb5c9f..6865cbe075ec 100644 --- a/Documentation/firmware_class/firmware_sample_driver.c +++ b/Documentation/firmware_class/firmware_sample_driver.c @@ -1,7 +1,7 @@ /* * firmware_sample_driver.c - * - * Copyright (c) 2003 Manuel Estrada Sainz <ranty@debian.org> + * Copyright (c) 2003 Manuel Estrada Sainz * * Sample code on how to use request_firmware() from drivers. * diff --git a/Documentation/firmware_class/firmware_sample_firmware_class.c b/Documentation/firmware_class/firmware_sample_firmware_class.c index 9e1b0e4051cd..fba943aacf93 100644 --- a/Documentation/firmware_class/firmware_sample_firmware_class.c +++ b/Documentation/firmware_class/firmware_sample_firmware_class.c @@ -1,7 +1,7 @@ /* * firmware_sample_firmware_class.c - * - * Copyright (c) 2003 Manuel Estrada Sainz <ranty@debian.org> + * Copyright (c) 2003 Manuel Estrada Sainz * * NOTE: This is just a probe of concept, if you think that your driver would * be well served by this mechanism please contact me first. 
@@ -19,7 +19,7 @@ #include <linux/firmware.h> -MODULE_AUTHOR("Manuel Estrada Sainz <ranty@debian.org>"); +MODULE_AUTHOR("Manuel Estrada Sainz"); MODULE_DESCRIPTION("Hackish sample for using firmware class directly"); MODULE_LICENSE("GPL"); @@ -78,6 +78,7 @@ static CLASS_DEVICE_ATTR(loading, 0644, firmware_loading_show, firmware_loading_store); static ssize_t firmware_data_read(struct kobject *kobj, + struct bin_attribute *bin_attr, char *buffer, loff_t offset, size_t count) { struct class_device *class_dev = to_class_dev(kobj); @@ -88,6 +89,7 @@ static ssize_t firmware_data_read(struct kobject *kobj, return count; } static ssize_t firmware_data_write(struct kobject *kobj, + struct bin_attribute *bin_attr, char *buffer, loff_t offset, size_t count) { struct class_device *class_dev = to_class_dev(kobj); diff --git a/Documentation/gpio.txt b/Documentation/gpio.txt index 36af58eba136..218a8650f48d 100644 --- a/Documentation/gpio.txt +++ b/Documentation/gpio.txt @@ -75,6 +75,9 @@ using the include file: If you stick to this convention then it'll be easier for other developers to see what your code is doing, and help maintain it. +Note that these operations include I/O barriers on platforms which need to +use them; drivers don't need to add them explicitly. + Identifying GPIOs ----------------- diff --git a/Documentation/hrtimer/timer_stats.txt b/Documentation/hrtimer/timer_stats.txt index 22b0814d0ad0..20d368c59814 100644 --- a/Documentation/hrtimer/timer_stats.txt +++ b/Documentation/hrtimer/timer_stats.txt @@ -67,3 +67,7 @@ executed on expiry. Thomas, Ingo +Added flag to indicate 'deferrable timer' in /proc/timer_stats. 
A deferrable +timer will appear as follows + 10D, 1 swapper queue_delayed_work_on (delayed_work_timer_fn) + diff --git a/Documentation/hwmon/abituguru b/Documentation/hwmon/abituguru index b2c0d61b39a2..87ffa0f5ec70 100644 --- a/Documentation/hwmon/abituguru +++ b/Documentation/hwmon/abituguru @@ -2,7 +2,7 @@ Kernel driver abituguru ======================= Supported chips: - * Abit uGuru revision 1-3 (Hardware Monitor part only) + * Abit uGuru revision 1 & 2 (Hardware Monitor part only) Prefix: 'abituguru' Addresses scanned: ISA 0x0E0 Datasheet: Not available, this driver is based on reverse engineering. @@ -20,8 +20,8 @@ Supported chips: uGuru 2.1.0.0 ~ 2.1.2.8 (AS8, AV8, AA8, AG8, AA8XE, AX8) uGuru 2.2.0.0 ~ 2.2.0.6 (AA8 Fatal1ty) uGuru 2.3.0.0 ~ 2.3.0.9 (AN8) - uGuru 3.0.0.0 ~ 3.0.1.2 (AW8, AL8, NI8) - uGuru 4.xxxxx? (AT8 32X) (2) + uGuru 3.0.0.0 ~ 3.0.x.x (AW8, AL8, AT8, NI8 SLI, AT8 32X, AN8 32X, + AW9D-MAX) (2) 1) For revisions 2 and 3 uGuru's the driver can autodetect the sensortype (Volt or Temp) for bank1 sensors, for revision 1 uGuru's this does not always work. For these uGuru's the autodetection can @@ -30,8 +30,9 @@ Supported chips: bank1_types=1,1,0,0,0,0,0,2,0,0,0,0,2,0,0,1 You may also need to specify the fan_sensors option for these boards fan_sensors=5 - 2) The current version of the abituguru driver is known to NOT work - on these Motherboards + 2) There is a separate abituguru3 driver for these motherboards, + the abituguru (without the 3 !) driver will not work on these + motherboards (and vice versa)! Authors: Hans de Goede <j.w.r.degoede@hhs.nl>, @@ -43,8 +44,10 @@ Module Parameters ----------------- * force: bool Force detection. Note this parameter only causes the - detection to be skipped, if the uGuru can't be read - the module initialization (insmod) will still fail. + detection to be skipped, and thus the insmod to + succeed. If the uGuru can't be read the actual hwmon + driver will not load and thus no hwmon device will get + registered.
* bank1_types: int[] Bank1 sensortype autodetection override: -1 autodetect (default) 0 volt sensor @@ -69,13 +72,15 @@ dmesg | grep abituguru Description ----------- -This driver supports the hardware monitoring features of the Abit uGuru chip -found on Abit uGuru featuring motherboards (most modern Abit motherboards). +This driver supports the hardware monitoring features of the first and +second revision of the Abit uGuru chip found on Abit uGuru featuring +motherboards (most modern Abit motherboards). -The uGuru chip in reality is a Winbond W83L950D in disguise (despite Abit -claiming it is "a new microprocessor designed by the ABIT Engineers"). -Unfortunatly this doesn't help since the W83L950D is a generic -microcontroller with a custom Abit application running on it. +The first and second revision of the uGuru chip in reality is a Winbond +W83L950D in disguise (despite Abit claiming it is "a new microprocessor +designed by the ABIT Engineers"). Unfortunately this doesn't help since the +W83L950D is a generic microcontroller with a custom Abit application running +on it. Despite Abit not releasing any information regarding the uGuru, Olle Sandberg <ollebull@gmail.com> has managed to reverse engineer the sensor part diff --git a/Documentation/hwmon/abituguru3 b/Documentation/hwmon/abituguru3 new file mode 100644 index 000000000000..fa598aac22fa --- /dev/null +++ b/Documentation/hwmon/abituguru3 @@ -0,0 +1,65 @@ +Kernel driver abituguru3 +======================== + +Supported chips: + * Abit uGuru revision 3 (Hardware Monitor part, reading only) + Prefix: 'abituguru3' + Addresses scanned: ISA 0x0E0 + Datasheet: Not available, this driver is based on reverse engineering. + Note: + The uGuru is a microcontroller with onboard firmware which programs + it to behave as a hwmon IC. There are many different revisions of the + firmware and thus effectively many different revisions of the uGuru.
+ Below is an incomplete list with which revisions are used for which + Motherboards: + uGuru 1.00 ~ 1.24 (AI7, KV8-MAX3, AN7) + uGuru 2.0.0.0 ~ 2.0.4.2 (KV8-PRO) + uGuru 2.1.0.0 ~ 2.1.2.8 (AS8, AV8, AA8, AG8, AA8XE, AX8) + uGuru 2.3.0.0 ~ 2.3.0.9 (AN8) + uGuru 3.0.0.0 ~ 3.0.x.x (AW8, AL8, AT8, NI8 SLI, AT8 32X, AN8 32X, + AW9D-MAX) + The abituguru3 driver is only for revision 3.0.x.x motherboards, + this driver will not work on older motherboards. For older + motherboards use the abituguru (without the 3 !) driver. + +Authors: + Hans de Goede <j.w.r.degoede@hhs.nl>, + (Initial reverse engineering done by Louis Kruger) + + +Module Parameters +----------------- + +* force: bool Force detection. Note this parameter only causes the + detection to be skipped, and thus the insmod to + succeed. If the uGuru can't be read the actual hwmon + driver will not load and thus no hwmon device will get + registered. +* verbose: bool Should the driver be verbose? + 0/off/false normal output + 1/on/true + verbose error reporting (default) + Default: 1 (the driver is still in the testing phase) + +Description +----------- + +This driver supports the hardware monitoring features of the third revision of +the Abit uGuru chip, found on recent Abit uGuru featuring motherboards. + +The 3rd revision of the uGuru chip in reality is a Winbond W83L951G. +Unfortunately this doesn't help since the W83L951G is a generic microcontroller +with a custom Abit application running on it. + +Despite Abit not releasing any information regarding the uGuru revision 3, +Louis Kruger has managed to reverse engineer the sensor part of the uGuru. +Without his work this driver would not have been possible.
+ +Known Issues +------------ + +The voltage and frequency control parts of the Abit uGuru are not supported, +neither is writing any of the sensor settings and writing / reading the +fanspeed control registers (FanEQ) + +If you encounter any problems please mail me <j.w.r.degoede@hhs.nl> and +include the output of: "dmesg | grep abituguru" diff --git a/Documentation/hwmon/dme1737 b/Documentation/hwmon/dme1737 new file mode 100644 index 000000000000..1a0f3d64ab80 --- /dev/null +++ b/Documentation/hwmon/dme1737 @@ -0,0 +1,257 @@ +Kernel driver dme1737 +===================== + +Supported chips: + * SMSC DME1737 and compatibles (like Asus A8000) + Prefix: 'dme1737' + Addresses scanned: I2C 0x2c, 0x2d, 0x2e + Datasheet: Provided by SMSC upon request and under NDA + +Authors: + Juerg Haefliger <juergh@gmail.com> + + +Module Parameters +----------------- + +* force_start: bool Enables the monitoring of voltage, fan and temp inputs + and PWM output control functions. Using this parameter + shouldn't be required since the BIOS usually takes care + of this. + +Note that there is no need to use this parameter if the driver loads without +complaining. The driver will say so if it is necessary. + + +Description +----------- + +This driver implements support for the hardware monitoring capabilities of the +SMSC DME1737 and Asus A8000 (which are the same) Super-I/O chips. This chip +features monitoring of 3 temp sensors temp[1-3] (2 remote diodes and 1 +internal), 7 voltages in[0-6] (6 external and 1 internal) and 6 fan speeds +fan[1-6]. Additionally, the chip implements 5 PWM outputs pwm[1-3,5-6] for +controlling fan speeds both manually and automatically. + +Fan[3-6] and pwm[3,5-6] are optional features and their availability is +dependent on the configuration of the chip. The driver will detect which +features are present during initialization and create the sysfs attributes +accordingly. 
+ + +Voltage Monitoring +------------------ + +The voltage inputs are sampled with 12-bit resolution and have internal +scaling resistors. The values returned by the driver therefore reflect true +millivolts and don't need scaling. The voltage inputs are mapped as follows +(the last column indicates the input ranges): + + in0: +5VTR (+5V standby) 0V - 6.64V + in1: Vccp (processor core) 0V - 3V + in2: VCC (internal +3.3V) 0V - 4.38V + in3: +5V 0V - 6.64V + in4: +12V 0V - 16V + in5: VTR (+3.3V standby) 0V - 4.38V + in6: Vbat (+3.0V) 0V - 4.38V + +Each voltage input has associated min and max limits which trigger an alarm +when crossed. + + +Temperature Monitoring +---------------------- + +Temperatures are measured with 12-bit resolution and reported in millidegree +Celsius. The chip also features offsets for all 3 temperature inputs which - +when programmed - get added to the input readings. The chip does all the +scaling by itself and the driver therefore reports true temperatures that don't +need any user-space adjustments. The temperature inputs are mapped as follows +(the last column indicates the input ranges): + + temp1: Remote diode 1 (3904 type) temperature -127C - +127C + temp2: DME1737 internal temperature -127C - +127C + temp3: Remote diode 2 (3904 type) temperature -127C - +127C + +Each temperature input has associated min and max limits which trigger an alarm +when crossed. Additionally, each temperature input has a fault attribute that +returns 1 when a faulty diode or an unconnected input is detected and 0 +otherwise. + + +Fan Monitoring +-------------- + +Fan RPMs are measured with 16-bit resolution. The chip provides inputs for 6 +fan tachometers. All 6 inputs have an associated min limit which triggers an +alarm when crossed. Fan inputs 1-4 provide type attributes that need to be set +to the number of pulses per fan revolution that the connected tachometer +generates. Supported values are 1, 2, and 4. 
Fan inputs 5-6 only support fans +that generate 2 pulses per revolution. Fan inputs 5-6 also provide a max +attribute that needs to be set to the maximum attainable RPM (fan at 100% duty- +cycle) of the input. The chip adjusts the sampling rate based on this value. + + +PWM Output Control +------------------ + +This chip features 5 PWM outputs. PWM outputs 1-3 are associated with fan +inputs 1-3 and PWM outputs 5-6 are associated with fan inputs 5-6. PWM outputs +1-3 can be configured to operate either in manual or automatic mode by setting +the appropriate enable attribute accordingly. PWM outputs 5-6 can only operate +in manual mode, their enable attributes are therefore read-only. When set to +manual mode, the fan speed is set by writing the duty-cycle value to the +appropriate PWM attribute. In automatic mode, the PWM attribute returns the +current duty-cycle as set by the fan controller in the chip. All PWM outputs +support the setting of the output frequency via the freq attribute. + +In automatic mode, the chip supports the setting of the PWM ramp rate which +defines how fast the PWM output is adjusting to changes of the associated +temperature input. Associating PWM outputs to temperature inputs is done via +temperature zones. The chip features 3 zones whose assignments to temperature +inputs is static and determined during initialization. These assignments can +be retrieved via the zone[1-3]_auto_channels_temp attributes. Each PWM output +is assigned to one (or hottest of multiple) temperature zone(s) through the +pwm[1-3]_auto_channels_zone attributes. Each PWM output has 3 distinct output +duty-cycles: full, low, and min. Full is internally hard-wired to 255 (100%) +and low and min can be programmed via pwm[1-3]_auto_point1_pwm and +pwm[1-3]_auto_pwm_min, respectively. 
The thermal thresholds of the zones are +programmed via zone[1-3]_auto_point[1-3]_temp and +zone[1-3]_auto_point1_temp_hyst: + + pwm[1-3]_auto_point2_pwm full-speed duty-cycle (255, i.e., 100%) + pwm[1-3]_auto_point1_pwm low-speed duty-cycle + pwm[1-3]_auto_pwm_min min-speed duty-cycle + + zone[1-3]_auto_point3_temp full-speed temp (all outputs) + zone[1-3]_auto_point2_temp full-speed temp + zone[1-3]_auto_point1_temp low-speed temp + zone[1-3]_auto_point1_temp_hyst min-speed temp + +The chip adjusts the output duty-cycle linearly in the range of auto_point1_pwm +to auto_point2_pwm if the temperature of the associated zone is between +auto_point1_temp and auto_point2_temp. If the temperature drops below the +auto_point1_temp_hyst value, the output duty-cycle is set to the auto_pwm_min +value which only supports two values: 0 or auto_point1_pwm. That means that the +fan either turns completely off or keeps spinning with the low-speed +duty-cycle. If any of the temperatures rise above the auto_point3_temp value, +all PWM outputs are set to 100% duty-cycle. + +Following is another representation of how the chip sets the output duty-cycle +based on the temperature of the associated thermal zone: + + Duty-Cycle Duty-Cycle + Temperature Rising Temp Falling Temp + ----------- ----------- ------------ + full-speed full-speed full-speed + + < linearly adjusted duty-cycle > + + low-speed low-speed low-speed + min-speed low-speed + min-speed min-speed min-speed + min-speed min-speed + + +Sysfs Attributes +---------------- + +Following is a list of all sysfs attributes that the driver provides, their +permissions and a short description: + +Name Perm Description +---- ---- ----------- +cpu0_vid RO CPU core reference voltage in + millivolts. +vrm RW Voltage regulator module version + number. + +in[0-6]_input RO Measured voltage in millivolts. +in[0-6]_min RW Low limit for voltage input. +in[0-6]_max RW High limit for voltage input. +in[0-6]_alarm RO Voltage input alarm. 
Returns 1 if + voltage input is or went outside the + associated min-max range, 0 otherwise. + +temp[1-3]_input RO Measured temperature in millidegree + Celsius. +temp[1-3]_min RW Low limit for temp input. +temp[1-3]_max RW High limit for temp input. +temp[1-3]_offset RW Offset for temp input. This value will + be added by the chip to the measured + temperature. +temp[1-3]_alarm RO Alarm for temp input. Returns 1 if temp + input is or went outside the associated + min-max range, 0 otherwise. +temp[1-3]_fault RO Temp input fault. Returns 1 if the chip + detects a faulty thermal diode or an + unconnected temp input, 0 otherwise. + +zone[1-3]_auto_channels_temp RO Temperature zone to temperature input + mapping. This attribute is a bitfield + and supports the following values: + 1: temp1 + 2: temp2 + 4: temp3 +zone[1-3]_auto_point1_temp_hyst RW Auto PWM temp point1 hysteresis. The + output of the corresponding PWM is set + to the auto_pwm_min value if the temp + falls below the auto_point1_temp_hyst + value. +zone[1-3]_auto_point[1-3]_temp RW Auto PWM temp points. Auto_point1 is + the low-speed temp, auto_point2 is the + full-speed temp, and auto_point3 is the + temp at which all PWM outputs are set + to full-speed (100% duty-cycle). + +fan[1-6]_input RO Measured fan speed in RPM. +fan[1-6]_min RW Low limit for fan input. +fan[1-6]_alarm RO Alarm for fan input. Returns 1 if fan + input is or went below the associated + min value, 0 otherwise. +fan[1-4]_type RW Type of attached fan. Expressed in + number of pulses per revolution that + the fan generates. Supported values are + 1, 2, and 4. +fan[5-6]_max RW Max attainable RPM at 100% duty-cycle. + Required for chip to adjust the + sampling rate accordingly. + +pwm[1-3,5-6] RO/RW Duty-cycle of PWM output. Supported + values are 0-255 (0%-100%). Only + writeable if the associated PWM is in + manual mode. +pwm[1-3]_enable RW Enable of PWM outputs 1-3.
Supported + values are: + 0: turned off (output @ 100%) + 1: manual mode + 2: automatic mode +pwm[5-6]_enable RO Enable of PWM outputs 5-6. Always + returns 1 since these 2 outputs are + hard-wired to manual mode. +pwm[1-3,5-6]_freq RW Frequency of PWM output. Supported + values are in the range 11Hz-30000Hz + (default is 25000Hz). +pwm[1-3]_ramp_rate RW Ramp rate of PWM output. Determines how + fast the PWM duty-cycle will change + when the PWM is in automatic mode. + Expressed in ms per PWM step. Supported + values are in the range 0ms-206ms + (default is 0, which means the duty- + cycle changes instantly). +pwm[1-3]_auto_channels_zone RW PWM output to temperature zone mapping. + This attribute is a bitfield and + supports the following values: + 1: zone1 + 2: zone2 + 4: zone3 + 6: highest of zone[2-3] + 7: highest of zone[1-3] +pwm[1-3]_auto_pwm_min RW Auto PWM min pwm. Minimum PWM duty- + cycle. Supported values are 0 or + auto_point1_pwm. +pwm[1-3]_auto_point1_pwm RW Auto PWM pwm point. Auto_point1 is the + low-speed duty-cycle. +pwm[1-3]_auto_point2_pwm RO Auto PWM pwm point. Auto_point2 is the + full-speed duty-cycle which is hard- + wired to 255 (100% duty-cycle). diff --git a/Documentation/hwmon/f71805f b/Documentation/hwmon/f71805f index bfd0f154959c..94e0d2cbd3d2 100644 --- a/Documentation/hwmon/f71805f +++ b/Documentation/hwmon/f71805f @@ -5,11 +5,11 @@ Supported chips: * Fintek F71805F/FG Prefix: 'f71805f' Addresses scanned: none, address read from Super I/O config space - Datasheet: Provided by Fintek on request + Datasheet: Available from the Fintek website * Fintek F71872F/FG Prefix: 'f71872f' Addresses scanned: none, address read from Super I/O config space - Datasheet: Provided by Fintek on request + Datasheet: Available from the Fintek website Author: Jean Delvare <khali@linux-fr.org> @@ -128,7 +128,9 @@ it. When the PWM method is used, you can select the operating frequency, from 187.5 kHz (default) to 31 Hz.
The best frequency depends on the fan model. As a rule of thumb, lower frequencies seem to give better -control, but may generate annoying high-pitch noise. Fintek recommends +control, but may generate annoying high-pitch noise. So a frequency just +above the audible range, such as 25 kHz, may be a good choice; if this +doesn't give you good linear control, try reducing it. Fintek recommends not going below 1 kHz, as the fan tachometers get confused by lower frequencies as well. @@ -136,16 +138,23 @@ When the DC method is used, Fintek recommends not going below 5 V, which corresponds to a pwm value of 106 for the driver. The driver doesn't enforce this limit though. -Three different fan control modes are supported: +Three different fan control modes are supported; the mode number is written +to the pwm<n>_enable file. -* Manual mode - You ask for a specific PWM duty cycle or DC voltage. +* 1: Manual mode + You ask for a specific PWM duty cycle or DC voltage by writing to the + pwm<n> file. -* Fan speed mode - You ask for a specific fan speed. This mode assumes that pwm1 - corresponds to fan1, pwm2 to fan2 and pwm3 to fan3. +* 2: Temperature mode + You define 3 temperature/fan speed trip points using the + pwm<n>_auto_point<m>_temp and _fan files. These define a staircase + relationship between temperature and fan speed with two additional points + interpolated between the values that you define. When the temperature + is below auto_point1_temp the fan is switched off. -* Temperature mode - You define 3 temperature/fan speed trip points, and the fan speed is - adjusted depending on the measured temperature, using interpolation. - This mode is not yet supported by the driver. +* 3: Fan speed mode + You ask for a specific fan speed by writing to the fan<n>_target file. + +Both of the automatic modes require that pwm1 corresponds to fan1, pwm2 to +fan2 and pwm3 to fan3. Temperature mode also requires that temp1 corresponds +to pwm1 and fan1, etc. 
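The f71805f fan control modes described above (1 for manual, 2 for temperature, 3 for fan speed, written to pwm<n>_enable) could be selected from userspace roughly as follows. This is only a minimal sketch: the helper names set_pwm_mode and set_manual_duty, the PWM_MODES table, and the hwmon_dir argument are hypothetical, and the directory where the driver's files actually appear in sysfs depends on the system. The channel check assumes the three outputs pwm1 to pwm3 named in the text.

```python
import os

# Mode numbers as listed in the f71805f text above.
PWM_MODES = {"manual": 1, "temperature": 2, "fan_speed": 3}


def _check_channel(channel):
    # Assumption: only pwm1..pwm3 exist, as named in the text above.
    if channel not in (1, 2, 3):
        raise ValueError("channel must be 1, 2 or 3")


def set_pwm_mode(hwmon_dir, channel, mode):
    """Write the mode number to pwm<channel>_enable inside hwmon_dir.

    hwmon_dir is a hypothetical directory holding the driver's sysfs files.
    Returns the path that was written.
    """
    _check_channel(channel)
    if mode not in PWM_MODES:
        raise ValueError("unknown mode: %r" % (mode,))
    path = os.path.join(hwmon_dir, "pwm%d_enable" % channel)
    with open(path, "w") as f:
        f.write("%d\n" % PWM_MODES[mode])
    return path


def set_manual_duty(hwmon_dir, channel, value):
    """In manual mode, write a duty-cycle or DC value (0-255) to pwm<channel>."""
    _check_channel(channel)
    if not 0 <= value <= 255:
        raise ValueError("value must be in the range 0-255")
    path = os.path.join(hwmon_dir, "pwm%d" % channel)
    with open(path, "w") as f:
        f.write("%d\n" % value)
    return path
```

Writing to the real sysfs files normally requires root. Note that the text recommends not going below a pwm value of 106 with the DC method, a limit that neither the driver nor this sketch enforces.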
diff --git a/Documentation/hwmon/it87 b/Documentation/hwmon/it87 index c0528d6f9ace..81ecc7e41c50 100644 --- a/Documentation/hwmon/it87 +++ b/Documentation/hwmon/it87 @@ -12,11 +12,12 @@ Supported chips: Addresses scanned: from Super I/O config space (8 I/O ports) Datasheet: Publicly available at the ITE website http://www.ite.com.tw/ - * IT8716F + * IT8716F/IT8726F Prefix: 'it8716' Addresses scanned: from Super I/O config space (8 I/O ports) Datasheet: Publicly available at the ITE website http://www.ite.com.tw/product_info/file/pc/IT8716F_V0.3.ZIP + http://www.ite.com.tw/product_info/file/pc/IT8726F_V0.3.pdf * IT8718F Prefix: 'it8718' Addresses scanned: from Super I/O config space (8 I/O ports) @@ -68,7 +69,7 @@ Description ----------- This driver implements support for the IT8705F, IT8712F, IT8716F, -IT8718F and SiS950 chips. +IT8718F, IT8726F and SiS950 chips. These chips are 'Super I/O chips', supporting floppy disks, infrared ports, joysticks and other miscellaneous stuff. For hardware monitoring, they @@ -97,6 +98,10 @@ clock divider mess) but not compatible with the older chips and revisions. For now, the driver only uses the 16-bit mode on the IT8716F and IT8718F. +The IT8726F is just a slightly enhanced IT8716F with additional hardware +for AMD power sequencing. Therefore the chip will appear as IT8716F +to userspace applications. + Temperatures are measured in degrees Celsius. An alarm is triggered once when the Overtemperature Shutdown limit is crossed.
diff --git a/Documentation/hwmon/lm90 b/Documentation/hwmon/lm90 index 438cb24cee5b..aa4a0ec20081 100644 --- a/Documentation/hwmon/lm90 +++ b/Documentation/hwmon/lm90 @@ -48,6 +48,18 @@ Supported chips: Addresses scanned: I2C 0x4c, 0x4d (unsupported 0x4e) Datasheet: Publicly available at the Maxim website http://www.maxim-ic.com/quick_view2.cfm/qv_pk/2578 + * Maxim MAX6680 + Prefix: 'max6680' + Addresses scanned: I2C 0x18, 0x19, 0x1a, 0x29, 0x2a, 0x2b, + 0x4c, 0x4d and 0x4e + Datasheet: Publicly available at the Maxim website + http://www.maxim-ic.com/quick_view2.cfm/qv_pk/3370 + * Maxim MAX6681 + Prefix: 'max6680' + Addresses scanned: I2C 0x18, 0x19, 0x1a, 0x29, 0x2a, 0x2b, + 0x4c, 0x4d and 0x4e + Datasheet: Publicly available at the Maxim website + http://www.maxim-ic.com/quick_view2.cfm/qv_pk/3370 Author: Jean Delvare <khali@linux-fr.org> @@ -59,11 +71,15 @@ Description The LM90 is a digital temperature sensor. It senses its own temperature as well as the temperature of up to one external diode. It is compatible with many other devices such as the LM86, the LM89, the LM99, the ADM1032, -the MAX6657, MAX6658 and the MAX6659 all of which are supported by this driver. -Note that there is no easy way to differentiate between the last three -variants. The extra address and features of the MAX6659 are not supported by -this driver. Additionally, the ADT7461 is supported if found in ADM1032 -compatibility mode. +the MAX6657, MAX6658, MAX6659, MAX6680 and the MAX6681 all of which are +supported by this driver. + +Note that there is no easy way to differentiate between the MAX6657, +MAX6658 and MAX6659 variants. The extra address and features of the +MAX6659 are not supported by this driver. The MAX6680 and MAX6681 only +differ in their pinout, therefore they obviously can't (and don't need to) +be distinguished. Additionally, the ADT7461 is supported if found in +ADM1032 compatibility mode. 
The specificity of this family of chipsets over the ADM1021/LM84 family is that it features critical limits with hysteresis, and an @@ -93,18 +109,22 @@ ADM1032: * ALERT is triggered by open remote sensor. * SMBus PEC support for Write Byte and Receive Byte transactions. -ADT7461 +ADT7461: * Extended temperature range (breaks compatibility) * Lower resolution for remote temperature MAX6657 and MAX6658: * Remote sensor type selection -MAX6659 +MAX6659: * Selectable address * Second critical temperature limit * Remote sensor type selection +MAX6680 and MAX6681: + * Selectable address + * Remote sensor type selection + All temperature values are given in degrees Celsius. Resolution is 1.0 degree for the local temperature, 0.125 degree for the remote temperature. @@ -141,7 +161,7 @@ SMBus Read Byte, and PEC will work properly. Additionally, the ADM1032 doesn't support SMBus Send Byte with PEC. Instead, it will try to write the PEC value to the register (because the SMBus Send Byte transaction with PEC is similar to a Write Byte transaction -without PEC), which is not what we want. Thus, PEC is explicitely disabled +without PEC), which is not what we want. Thus, PEC is explicitly disabled on SMBus Send Byte transactions in the lm90 driver. PEC on byte data transactions represents a significant increase in bandwidth diff --git a/Documentation/hwmon/lm93 b/Documentation/hwmon/lm93 new file mode 100644 index 000000000000..4e4a1dc1d2da --- /dev/null +++ b/Documentation/hwmon/lm93 @@ -0,0 +1,412 @@ +Kernel driver lm93 +================== + +Supported chips: + * National Semiconductor LM93 + Prefix 'lm93' + Addresses scanned: I2C 0x2c-0x2e + Datasheet: http://www.national.com/ds.cgi/LM/LM93.pdf + +Author: + Mark M. Hoffman <mhoffman@lightlink.com> + Ported to 2.6 by Eric J. Bowersox <ericb@aspsys.com> + Adapted to 2.6.20 by Carsten Emde <ce@osadl.org> + Modified for mainline integration by Hans J. 
Koch <hjk@linutronix.de> + +Module Parameters +----------------- + +(specific to LM93) +* init: integer + Set to non-zero to force some initializations (default is 0). +* disable_block: integer + A "0" allows SMBus block data transactions if the host supports them. A "1" + disables SMBus block data transactions. The default is 0. +* vccp_limit_type: integer array (2) + Configures in7 and in8 limit type, where 0 means absolute and non-zero + means relative. "Relative" here refers to "Dynamic Vccp Monitoring using + VID" from the datasheet. It greatly simplifies the interface to allow + only one set of limits (absolute or relative) to be in operation at a + time (even though the hardware is capable of enabling both). There's + not a compelling use case for enabling both at once, anyway. The default + is "0,0". +* vid_agtl: integer + A "0" configures the VID pins for V(ih) = 2.1V min, V(il) = 0.8V max. + A "1" configures the VID pins for V(ih) = 0.8V min, V(il) = 0.4V max. + (The latter setting is referred to as AGTL+ Compatible in the datasheet.) + I.e. this parameter controls the VID pin input thresholds; if your VID + inputs are not working, try changing this. The default value is "0". + +(common among sensor drivers) +* force: short array (min = 1, max = 48) + List of adapter,address pairs to assume to be present. Autodetection + of the target device will still be attempted. Use one of the more + specific force directives below if this doesn't detect the device. 
+* force_lm93: short array (min = 1, max = 48) + List of adapter,address pairs which are unquestionably assumed to contain + a 'lm93' chip +* ignore: short array (min = 1, max = 48) + List of adapter,address pairs not to scan +* ignore_range: short array (min = 1, max = 48) + List of adapter,start-addr,end-addr triples not to scan +* probe: short array (min = 1, max = 48) + List of adapter,address pairs to scan additionally +* probe_range: short array (min = 1, max = 48) + List of adapter,start-addr,end-addr triples to scan additionally + + +Hardware Description +-------------------- + +(from the datasheet) + +The LM93, hardware monitor, has a two wire digital interface compatible with +SMBus 2.0. Using an 8-bit ADC, the LM93 measures the temperature of two remote +diode connected transistors as well as its own die and 16 power supply +voltages. To set fan speed, the LM93 has two PWM outputs that are each +controlled by up to four temperature zones. The fancontrol algorithm is lookup +table based. The LM93 includes a digital filter that can be invoked to smooth +temperature readings for better control of fan speed. The LM93 has four +tachometer inputs to measure fan speed. Limit and status registers for all +measured values are included. The LM93 builds upon the functionality of +previous motherboard management ASICs and uses some of the LM85 s features +(i.e. smart tachometer mode). It also adds measurement and control support +for dynamic Vccp monitoring and PROCHOT. It is designed to monitor a dual +processor Xeon class motherboard with a minimum of external components. + + +Driver Description +------------------ + +This driver implements support for the National Semiconductor LM93. + + +User Interface +-------------- + +#PROCHOT: + +The LM93 can monitor two #PROCHOT signals. The results are found in the +sysfs files prochot1, prochot2, prochot1_avg, prochot2_avg, prochot1_max, +and prochot2_max. 
prochot1_max and prochot2_max contain the user limits
+for #PROCHOT1 and #PROCHOT2, respectively. prochot1 and prochot2 contain
+the current readings for the most recent complete time interval. The
+value of prochot1_avg and prochot2_avg is something like a 2 period
+exponential moving average (but not quite - check the datasheet). Note
+that this third value is calculated by the chip itself. All values range
+from 0-255 where 0 indicates no throttling, and 255 indicates > 99.6%.
+
+The monitoring intervals for the two #PROCHOT signals are also configurable.
+These intervals can be found in the sysfs files prochot1_interval and
+prochot2_interval. The values in these files specify the intervals for
+#P1_PROCHOT and #P2_PROCHOT, respectively. Selecting a value not in this
+list will cause the driver to use the next largest interval. The available
+intervals are:
+
+#PROCHOT intervals: 0.73, 1.46, 2.9, 5.8, 11.7, 23.3, 46.6, 93.2, 186, 372
+
+It is possible to configure the LM93 to logically short the two #PROCHOT
+signals. I.e. when #P1_PROCHOT is asserted, the LM93 will automatically
+assert #P2_PROCHOT, and vice-versa. This mode is enabled by writing a
+non-zero integer to the sysfs file prochot_short.
+
+The LM93 can also override the #PROCHOT pins by driving a PWM signal onto
+one or both of them. When overridden, the signal has a period of 3.56 ms,
+a minimum pulse width of 5 clocks (at 22.5kHz => 6.25% duty cycle), and
+a maximum pulse width of 80 clocks (at 22.5kHz => 99.88% duty cycle).
+
+The sysfs files prochot1_override and prochot2_override contain boolean
+integers which enable or disable the override function for #P1_PROCHOT and
+#P2_PROCHOT, respectively. The sysfs file prochot_override_duty_cycle
+contains a value controlling the duty cycle for the PWM signal used when
+the override function is enabled. This value ranges from 0 to 15, with 0
+indicating minimum duty cycle and 15 indicating maximum.
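The interval rule described above (a value not in the list is replaced by the next largest interval) can be sketched in Python as follows. This only illustrates the selection logic and is not the driver's code: the function name `pick_prochot_interval` is hypothetical, "next largest" is read here as the smallest listed interval that is not below the request, and clamping oversized requests to the last entry is an assumption.

```python
import bisect

# #PROCHOT intervals exactly as listed in the LM93 text above.
PROCHOT_INTERVALS = [0.73, 1.46, 2.9, 5.8, 11.7, 23.3, 46.6, 93.2, 186, 372]


def pick_prochot_interval(requested):
    """Return the listed interval that would stand in for `requested`.

    Assumption: a request between two entries maps to the larger one,
    and a request above the last entry is clamped to the last entry.
    """
    # bisect_left finds the first entry that is >= requested.
    idx = bisect.bisect_left(PROCHOT_INTERVALS, requested)
    if idx == len(PROCHOT_INTERVALS):
        idx -= 1  # assumed clamp for requests beyond the table
    return PROCHOT_INTERVALS[idx]
```

Under these assumptions a request of 3 maps to 5.8, while a request of exactly 2.9 is kept unchanged.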
+
+#VRD_HOT:
+
+The LM93 can monitor two #VRD_HOT signals. The results are found in the
+sysfs files vrdhot1 and vrdhot2. There is one value per file: a boolean for
+which 1 indicates #VRD_HOT is asserted and 0 indicates it is negated. These
+files are read-only.
+
+Smart Tach Mode:
+
+(from the datasheet)
+
+    If a fan is driven using a low-side drive PWM, the tachometer
+    output of the fan is corrupted. The LM93 includes smart tachometer
+    circuitry that allows an accurate tachometer reading to be
+    achieved despite the signal corruption. In smart tach mode all
+    four signals are measured within 4 seconds.
+
+Smart tach mode is enabled by the driver by writing 1 or 2 (associating
+the fan tachometer with a pwm) to the sysfs file fan<n>_smart_tach. A zero
+will disable the function for that fan. Note that Smart tach mode cannot be
+enabled if the PWM output frequency is 22500 Hz (see below).
+
+Manual PWM:
+
+The LM93 has a fixed or override mode for the two PWM outputs (although, there
+are still some conditions that will override even this mode - see section
+15.10.6 of the datasheet for details.) The sysfs files pwm1_override
+and pwm2_override are used to enable this mode; each is a boolean integer
+where 0 disables and 1 enables the manual control mode. The sysfs files pwm1
+and pwm2 are used to set the manual duty cycle; each is an integer (0-255)
+where 0 is 0% duty cycle, and 255 is 100%. Note that the duty cycle values
+are constrained by the hardware. Selecting a value which is not available
+will cause the driver to use the next largest value. Also note: when manual
+PWM mode is disabled, the value of pwm1 and pwm2 indicates the current duty
+cycle chosen by the h/w.
+
+PWM Output Frequency:
+
+The LM93 supports several different frequencies for the PWM output channels.
+The sysfs files pwm1_freq and pwm2_freq are used to select the frequency. The
+frequency values are constrained by the hardware.
Selecting a value which is +not available will cause the driver to use the next largest value. Also note +that this parameter has implications for the Smart Tach Mode (see above). + +PWM Output Frequencies: 12, 36, 48, 60, 72, 84, 96, 22500 (h/w default) + +Automatic PWM: + +The LM93 is capable of complex automatic fan control, with many different +points of configuration. To start, each PWM output can be bound to any +combination of eight control sources. The final PWM is the largest of all +individual control sources to which the PWM output is bound. + +The eight control sources are: temp1-temp4 (aka "zones" in the datasheet), +#PROCHOT 1 & 2, and #VRDHOT 1 & 2. The bindings are expressed as a bitmask +in the sysfs files pwm<n>_auto_channels, where a "1" enables the binding, and + a "0" disables it. The h/w default is 0x0f (all temperatures bound). + + 0x01 - Temp 1 + 0x02 - Temp 2 + 0x04 - Temp 3 + 0x08 - Temp 4 + 0x10 - #PROCHOT 1 + 0x20 - #PROCHOT 2 + 0x40 - #VRDHOT 1 + 0x80 - #VRDHOT 2 + +The function y = f(x) takes a source temperature x to a PWM output y. This +function of the LM93 is derived from a base temperature and a table of 12 +temperature offsets. The base temperature is expressed in degrees C in the +sysfs files temp<n>_auto_base. The offsets are expressed in cumulative +degrees C, with the value of offset <i> for temperature value <n> being +contained in the file temp<n>_auto_offset<i>. E.g. if the base temperature +is 40C: + + offset # temp<n>_auto_offset<i> range pwm + 1 0 - 25.00% + 2 0 - 28.57% + 3 1 40C - 41C 32.14% + 4 1 41C - 42C 35.71% + 5 2 42C - 44C 39.29% + 6 2 44C - 46C 42.86% + 7 2 48C - 50C 46.43% + 8 2 50C - 52C 50.00% + 9 2 52C - 54C 53.57% + 10 2 54C - 56C 57.14% + 11 2 56C - 58C 71.43% + 12 2 58C - 60C 85.71% + > 60C 100.00% + +Valid offsets are in the range 0C <= x <= 7.5C in 0.5C increments. + +There is an independent base temperature for each temperature channel. 
Note, +however, there are only two tables of offsets: one each for temp[12] and +temp[34]. Therefore, any change to e.g. temp1_auto_offset<i> will also +affect temp2_auto_offset<i>. + +The LM93 can also apply hysteresis to the offset table, to prevent unwanted +oscillation between two steps in the offsets table. These values are found in +the sysfs files temp<n>_auto_offset_hyst. The value in this file has the +same representation as in temp<n>_auto_offset<i>. + +If a temperature reading falls below the base value for that channel, the LM93 +will use the minimum PWM value. These values are found in the sysfs files +temp<n>_auto_pwm_min. Note, there are only two minimums: one each for temp[12] +and temp[34]. Therefore, any change to e.g. temp1_auto_pwm_min will also +affect temp2_auto_pwm_min. + +PWM Spin-Up Cycle: + +A spin-up cycle occurs when a PWM output is commanded from 0% duty cycle to +some value > 0%. The LM93 supports a minimum duty cycle during spin-up. These +values are found in the sysfs files pwm<n>_auto_spinup_min. The value in this +file has the same representation as other PWM duty cycle values. The +duration of the spin-up cycle is also configurable. These values are found in +the sysfs files pwm<n>_auto_spinup_time. The value in this file is +the spin-up time in seconds. The available spin-up times are constrained by +the hardware. Selecting a value which is not available will cause the driver +to use the next largest value. + +Spin-up Durations: 0 (disabled, h/w default), 0.1, 0.25, 0.4, 0.7, 1.0, + 2.0, 4.0 + +#PROCHOT and #VRDHOT PWM Ramping: + +If the #PROCHOT or #VRDHOT signals are asserted while bound to a PWM output +channel, the LM93 will ramp the PWM output up to 100% duty cycle in discrete +steps. The duration of each step is configurable. There are two files, with +one value each in seconds: pwm_auto_prochot_ramp and pwm_auto_vrdhot_ramp. +The available ramp times are constrained by the hardware. 
Selecting a value +which is not available will cause the driver to use the next largest value. + +Ramp Times: 0 (disabled, h/w default) to 0.75 in 0.05 second intervals + +Fan Boost: + +For each temperature channel, there is a boost temperature: if the channel +exceeds this limit, the LM93 will immediately drive both PWM outputs to 100%. +This limit is expressed in degrees C in the sysfs files temp<n>_auto_boost. +There is also a hysteresis temperature for this function: after the boost +limit is reached, the temperature channel must drop below this value before +the boost function is disabled. This temperature is also expressed in degrees +C in the sysfs files temp<n>_auto_boost_hyst. + +GPIO Pins: + +The LM93 can monitor the logic level of four dedicated GPIO pins as well as the +four tach input pins. GPIO0-GPIO3 correspond to (fan) tach 1-4, respectively. +All eight GPIOs are read by reading the bitmask in the sysfs file gpio. The +LSB is GPIO0, and the MSB is GPIO7. + + +LM93 Unique sysfs Files +----------------------- + + file description + ------------------------------------------------------------- + + prochot<n> current #PROCHOT % + + prochot<n>_avg moving average #PROCHOT % + + prochot<n>_max limit #PROCHOT % + + prochot_short enable or disable logical #PROCHOT pin short + + prochot<n>_override force #PROCHOT assertion as PWM + + prochot_override_duty_cycle + duty cycle for the PWM signal used when + #PROCHOT is overridden + + prochot<n>_interval #PROCHOT PWM sampling interval + + vrdhot<n> 0 means negated, 1 means asserted + + fan<n>_smart_tach enable or disable smart tach mode + + pwm<n>_auto_channels select control sources for PWM outputs + + pwm<n>_auto_spinup_min minimum duty cycle during spin-up + + pwm<n>_auto_spinup_time duration of spin-up + + pwm_auto_prochot_ramp ramp time per step when #PROCHOT asserted + + pwm_auto_vrdhot_ramp ramp time per step when #VRDHOT asserted + + temp<n>_auto_base temperature channel base + + temp<n>_auto_offset[1-12] 
+ temperature channel offsets + + temp<n>_auto_offset_hyst + temperature channel offset hysteresis + + temp<n>_auto_boost temperature channel boost (PWMs to 100%) limit + + temp<n>_auto_boost_hyst temperature channel boost hysteresis + + gpio input state of 8 GPIO pins; read-only + + +Sample Configuration File +------------------------- + +Here is a sample LM93 chip config for sensors.conf: + +---------- cut here ---------- +chip "lm93-*" + +# VOLTAGE INPUTS + + # labels and scaling based on datasheet recommendations + label in1 "+12V1" + compute in1 @ * 12.945, @ / 12.945 + set in1_min 12 * 0.90 + set in1_max 12 * 1.10 + + label in2 "+12V2" + compute in2 @ * 12.945, @ / 12.945 + set in2_min 12 * 0.90 + set in2_max 12 * 1.10 + + label in3 "+12V3" + compute in3 @ * 12.945, @ / 12.945 + set in3_min 12 * 0.90 + set in3_max 12 * 1.10 + + label in4 "FSB_Vtt" + + label in5 "3GIO" + + label in6 "ICH_Core" + + label in7 "Vccp1" + + label in8 "Vccp2" + + label in9 "+3.3V" + set in9_min 3.3 * 0.90 + set in9_max 3.3 * 1.10 + + label in10 "+5V" + set in10_min 5.0 * 0.90 + set in10_max 5.0 * 1.10 + + label in11 "SCSI_Core" + + label in12 "Mem_Core" + + label in13 "Mem_Vtt" + + label in14 "Gbit_Core" + + # Assuming R1/R2 = 4.1143, and 3.3V reference + # -12V = (4.1143 + 1) * (@ - 3.3) + 3.3 + label in15 "-12V" + compute in15 @ * 5.1143 - 13.57719, (@ + 13.57719) / 5.1143 + set in15_min -12 * 0.90 + set in15_max -12 * 1.10 + + label in16 "+3.3VSB" + set in16_min 3.3 * 0.90 + set in16_max 3.3 * 1.10 + +# TEMPERATURE INPUTS + + label temp1 "CPU1" + label temp2 "CPU2" + label temp3 "LM93" + +# TACHOMETER INPUTS + + label fan1 "Fan1" + set fan1_min 3000 + label fan2 "Fan2" + set fan2_min 3000 + label fan3 "Fan3" + set fan3_min 3000 + label fan4 "Fan4" + set fan4_min 3000 + +# PWM OUTPUTS + + label pwm1 "CPU1" + label pwm2 "CPU2" + diff --git a/Documentation/hwmon/smsc47b397 b/Documentation/hwmon/smsc47b397 index 20682f15ae41..3a43b6948924 100644 --- a/Documentation/hwmon/smsc47b397 
+++ b/Documentation/hwmon/smsc47b397 @@ -4,6 +4,7 @@ Kernel driver smsc47b397 Supported chips: * SMSC LPC47B397-NC * SMSC SCH5307-NS + * SMSC SCH5317 Prefix: 'smsc47b397' Addresses scanned: none, address read from Super I/O config space Datasheet: In this file @@ -18,8 +19,8 @@ The following specification describes the SMSC LPC47B397-NC[1] sensor chip provided by Craig Kelly (In-Store Broadcast Network) and edited/corrected by Mark M. Hoffman <mhoffman@lightlink.com>. -[1] And SMSC SCH5307-NS, which has a different device ID but is otherwise -compatible. +[1] And SMSC SCH5307-NS and SCH5317, which have different device IDs but are +otherwise compatible. * * * * * @@ -131,7 +132,7 @@ OUT DX,AL The registers of interest for identifying the SIO on the dc7100 are Device ID (0x20) and Device Rev (0x21). -The Device ID will read 0x6F (for SCH5307-NS, 0x81) +The Device ID will read 0x6F (0x81 for SCH5307-NS, and 0x85 for SCH5317) The Device Rev currently reads 0x01 Obtaining the HWM Base Address. diff --git a/Documentation/hwmon/sysfs-interface b/Documentation/hwmon/sysfs-interface index a9a18ad0d17a..b3a9e1b9dbda 100644 --- a/Documentation/hwmon/sysfs-interface +++ b/Documentation/hwmon/sysfs-interface @@ -172,11 +172,10 @@ pwm[1-*] Pulse width modulation fan control. 255 is max or 100%. pwm[1-*]_enable - Switch PWM on and off. - Not always present even if pwmN is. - 0: turn off - 1: turn on in manual mode - 2+: turn on in automatic mode + Fan speed control method: + 0: no fan speed control (i.e. fan at full speed) + 1: manual fan speed control enabled (using pwm[1-*]) + 2+: automatic fan speed control enabled Check individual chip documentation files for automatic mode details. RW @@ -343,9 +342,9 @@ to notify open diodes, unconnected fans etc. where the hardware supports it. When this boolean has value 1, the measurement for that channel should not be trusted. 
-in[0-*]_input_fault
-fan[1-*]_input_fault
-temp[1-*]_input_fault
+in[0-*]_fault
+fan[1-*]_fault
+temp[1-*]_fault
                 Input fault condition
                 0: no fault occurred
                 1: fault condition
diff --git a/Documentation/hwmon/w83627ehf b/Documentation/hwmon/w83627ehf
index 030fac6cec7a..ccc2bcb61068 100644
--- a/Documentation/hwmon/w83627ehf
+++ b/Documentation/hwmon/w83627ehf
@@ -22,9 +22,9 @@ This driver implements support for the Winbond W83627EHF, W83627EHG, and
 W83627DHG super I/O chips. We will refer to them collectively as Winbond chips.
 
 The chips implement three temperature sensors, five fan rotation
-speed sensors, ten analog voltage sensors (only nine for the 627DHG), alarms
-with beep warnings (control unimplemented), and some automatic fan regulation
-strategies (plus manual fan control mode).
+speed sensors, ten analog voltage sensors (only nine for the 627DHG), one
+VID (6 pins), alarms with beep warnings (control unimplemented), and
+some automatic fan regulation strategies (plus manual fan control mode).
 
 Temperatures are measured in degrees Celsius and measurement resolution is 1
 degC for temp1 and 0.5 degC for temp2 and temp3.
An alarm is triggered when diff --git a/Documentation/i2c/busses/i2c-i801 b/Documentation/i2c/busses/i2c-i801 index c34f0db78a30..fe6406f2f9a6 100644 --- a/Documentation/i2c/busses/i2c-i801 +++ b/Documentation/i2c/busses/i2c-i801 @@ -5,8 +5,8 @@ Supported adapters: '810' and '810E' chipsets) * Intel 82801BA (ICH2 - part of the '815E' chipset) * Intel 82801CA/CAM (ICH3) - * Intel 82801DB (ICH4) (HW PEC supported, 32 byte buffer not supported) - * Intel 82801EB/ER (ICH5) (HW PEC supported, 32 byte buffer not supported) + * Intel 82801DB (ICH4) (HW PEC supported) + * Intel 82801EB/ER (ICH5) (HW PEC supported) * Intel 6300ESB * Intel 82801FB/FR/FW/FRW (ICH6) * Intel 82801G (ICH7) diff --git a/Documentation/i2c/busses/i2c-piix4 b/Documentation/i2c/busses/i2c-piix4 index 7cbe43fa2701..fa0c786a8bf5 100644 --- a/Documentation/i2c/busses/i2c-piix4 +++ b/Documentation/i2c/busses/i2c-piix4 @@ -6,7 +6,7 @@ Supported adapters: Datasheet: Publicly available at the Intel website * ServerWorks OSB4, CSB5, CSB6 and HT-1000 southbridges Datasheet: Only available via NDA from ServerWorks - * ATI IXP200, IXP300, IXP400 and SB600 southbridges + * ATI IXP200, IXP300, IXP400, SB600 and SB700 southbridges Datasheet: Not publicly available * Standard Microsystems (SMSC) SLC90E66 (Victory66) southbridge Datasheet: Publicly available at the SMSC website http://www.smsc.com diff --git a/Documentation/i2c/busses/i2c-taos-evm b/Documentation/i2c/busses/i2c-taos-evm new file mode 100644 index 000000000000..9146e33be6dd --- /dev/null +++ b/Documentation/i2c/busses/i2c-taos-evm @@ -0,0 +1,46 @@ +Kernel driver i2c-taos-evm + +Author: Jean Delvare <khali@linux-fr.org> + +This is a driver for the evaluation modules for TAOS I2C/SMBus chips. +The modules include an SMBus master with limited capabilities, which can +be controlled over the serial port. 
Virtually all evaluation modules
+are supported, but a few lines of code need to be added for each new
+module to instantiate the right I2C chip on the bus. Obviously, a driver
+for the chip in question is also needed.
+
+Currently supported devices are:
+
+* TAOS TSL2550 EVM
+
+For additional information on TAOS products, please see
+  http://www.taosinc.com/
+
+
+Using this driver
+-----------------
+
+In order to use this driver, you'll need the serport driver, and the
+inputattach tool, which is part of the input-utils package. The following
+commands will tell the kernel that you have a TAOS EVM on the first
+serial port:
+
+# modprobe serport
+# inputattach --taos-evm /dev/ttyS0
+
+
+Technical details
+-----------------
+
+Only 4 SMBus transaction types are supported by the TAOS evaluation
+modules:
+* Receive Byte
+* Send Byte
+* Read Byte
+* Write Byte
+
+The communication protocol is text-based and pretty simple. It is
+described in a PDF document on the CD which comes with the evaluation
+module. The communication is rather slow, because the serial port has
+to operate at 1200 bps. However, I don't think this is a big concern in
+practice, as these modules are meant for evaluation and testing only.
diff --git a/Documentation/i2c/chips/max6875 b/Documentation/i2c/chips/max6875
index 96fec562a8e9..a0cd8af2f408 100644
--- a/Documentation/i2c/chips/max6875
+++ b/Documentation/i2c/chips/max6875
@@ -99,7 +99,7 @@ And then read the data
 
 or
 
-  count = i2c_smbus_read_i2c_block_data(fd, 0x84, buffer);
+  count = i2c_smbus_read_i2c_block_data(fd, 0x84, 16, buffer);
 
 The block read should read 16 bytes.
 0x84 is the block read command.
diff --git a/Documentation/i2c/chips/x1205 b/Documentation/i2c/chips/x1205 deleted file mode 100644 index 09407c991fe5..000000000000 --- a/Documentation/i2c/chips/x1205 +++ /dev/null @@ -1,38 +0,0 @@ -Kernel driver x1205 -=================== - -Supported chips: - * Xicor X1205 RTC - Prefix: 'x1205' - Addresses scanned: none - Datasheet: http://www.intersil.com/cda/deviceinfo/0,1477,X1205,00.html - -Authors: - Karen Spearel <kas11@tampabay.rr.com>, - Alessandro Zummo <a.zummo@towertech.it> - -Description ------------ - -This module aims to provide complete access to the Xicor X1205 RTC. -Recently Xicor has merged with Intersil, but the chip is -still sold under the Xicor brand. - -This chip is located at address 0x6f and uses a 2-byte register addressing. -Two bytes need to be written to read a single register, while most -other chips just require one and take the second one as the data -to be written. To prevent corrupting unknown chips, the user must -explicitely set the probe parameter. - -example: - -modprobe x1205 probe=0,0x6f - -The module supports one more option, hctosys, which is used to set the -software clock from the x1205. On systems where the x1205 is the -only hardware rtc, this parameter could be used to achieve a correct -date/time earlier in the system boot sequence. 
- -example: - -modprobe x1205 probe=0,0x6f hctosys=1 diff --git a/Documentation/i2c/summary b/Documentation/i2c/summary index aea60bf7e8f0..003c7319b8c7 100644 --- a/Documentation/i2c/summary +++ b/Documentation/i2c/summary @@ -67,7 +67,6 @@ i2c-proc: The /proc/sys/dev/sensors interface for device (client) drivers Algorithm drivers ----------------- -i2c-algo-8xx: An algorithm for CPM's I2C device in Motorola 8xx processors (NOT BUILT BY DEFAULT) i2c-algo-bit: A bit-banging algorithm i2c-algo-pcf: A PCF 8584 style algorithm i2c-algo-ibm_ocp: An algorithm for the I2C device in IBM 4xx processors (NOT BUILT BY DEFAULT) @@ -81,6 +80,5 @@ i2c-pcf-epp: PCF8584 on a EPP parallel port (uses i2c-algo-pcf) (NOT mkpatch i2c-philips-par: Philips style parallel port adapter (uses i2c-algo-bit) i2c-adap-ibm_ocp: IBM 4xx processor I2C device (uses i2c-algo-ibm_ocp) (NOT BUILT BY DEFAULT) i2c-pport: Primitive parallel port adapter (uses i2c-algo-bit) -i2c-rpx: RPX board Motorola 8xx I2C device (uses i2c-algo-8xx) (NOT BUILT BY DEFAULT) i2c-velleman: Velleman K8000 parallel port adapter (uses i2c-algo-bit) diff --git a/Documentation/i2c/writing-clients b/Documentation/i2c/writing-clients index 3d8d36b0ad12..2c170032bf37 100644 --- a/Documentation/i2c/writing-clients +++ b/Documentation/i2c/writing-clients @@ -571,7 +571,7 @@ SMBus communication u8 command, u8 length, u8 *values); extern s32 i2c_smbus_read_i2c_block_data(struct i2c_client * client, - u8 command, u8 *values); + u8 command, u8 length, u8 *values); These ones were removed in Linux 2.6.10 because they had no users, but could be added back later if needed: diff --git a/Documentation/i386/zero-page.txt b/Documentation/i386/zero-page.txt index c04a421f4a7c..75b3680c41eb 100644 --- a/Documentation/i386/zero-page.txt +++ b/Documentation/i386/zero-page.txt @@ -37,6 +37,7 @@ Offset Type Description 0x1d0 unsigned long EFI memory descriptor map pointer 0x1d4 unsigned long EFI memory descriptor map size 0x1e0 unsigned long 
ALT_MEM_K, alternative mem check, in Kb +0x1e4 unsigned long Scratch field for the kernel setup code 0x1e8 char number of entries in E820MAP (below) 0x1e9 unsigned char number of entries in EDDBUF (below) 0x1ea unsigned char number of entries in EDD_MBR_SIG_BUFFER (below) diff --git a/Documentation/ia64/aliasing-test.c b/Documentation/ia64/aliasing-test.c index d485256ee1ce..773a814d4093 100644 --- a/Documentation/ia64/aliasing-test.c +++ b/Documentation/ia64/aliasing-test.c @@ -19,6 +19,7 @@ #include <sys/mman.h> #include <sys/stat.h> #include <unistd.h> +#include <linux/pci.h> int sum; @@ -34,13 +35,19 @@ int map_mem(char *path, off_t offset, size_t length, int touch) return -1; } + if (fnmatch("/proc/bus/pci/*", path, 0) == 0) { + rc = ioctl(fd, PCIIOC_MMAP_IS_MEM); + if (rc == -1) + perror("PCIIOC_MMAP_IS_MEM ioctl"); + } + addr = mmap(NULL, length, PROT_READ|PROT_WRITE, MAP_SHARED, fd, offset); if (addr == MAP_FAILED) return 1; if (touch) { c = (int *) addr; - while (c < (int *) (offset + length)) + while (c < (int *) (addr + length)) sum += *c++; } @@ -54,7 +61,7 @@ int map_mem(char *path, off_t offset, size_t length, int touch) return 0; } -int scan_sysfs(char *path, char *file, off_t offset, size_t length, int touch) +int scan_tree(char *path, char *file, off_t offset, size_t length, int touch) { struct dirent **namelist; char *name, *path2; @@ -93,7 +100,7 @@ int scan_sysfs(char *path, char *file, off_t offset, size_t length, int touch) } else { r = lstat(path2, &buf); if (r == 0 && S_ISDIR(buf.st_mode)) { - rc = scan_sysfs(path2, file, offset, length, touch); + rc = scan_tree(path2, file, offset, length, touch); if (rc < 0) return rc; } @@ -238,10 +245,15 @@ int main() else fprintf(stderr, "FAIL: /dev/mem 0x0-0x100000 not accessible\n"); - scan_sysfs("/sys/class/pci_bus", "legacy_mem", 0, 0xA0000, 1); - scan_sysfs("/sys/class/pci_bus", "legacy_mem", 0xA0000, 0x20000, 0); - scan_sysfs("/sys/class/pci_bus", "legacy_mem", 0xC0000, 0x40000, 1); - 
scan_sysfs("/sys/class/pci_bus", "legacy_mem", 0, 1024*1024, 0); + scan_tree("/sys/class/pci_bus", "legacy_mem", 0, 0xA0000, 1); + scan_tree("/sys/class/pci_bus", "legacy_mem", 0xA0000, 0x20000, 0); + scan_tree("/sys/class/pci_bus", "legacy_mem", 0xC0000, 0x40000, 1); + scan_tree("/sys/class/pci_bus", "legacy_mem", 0, 1024*1024, 0); scan_rom("/sys/devices", "rom"); + + scan_tree("/proc/bus/pci", "??.?", 0, 0xA0000, 1); + scan_tree("/proc/bus/pci", "??.?", 0xA0000, 0x20000, 0); + scan_tree("/proc/bus/pci", "??.?", 0xC0000, 0x40000, 1); + scan_tree("/proc/bus/pci", "??.?", 0, 1024*1024, 0); } diff --git a/Documentation/ia64/aliasing.txt b/Documentation/ia64/aliasing.txt index 9a431a7d0f5d..aa3e953f0f7b 100644 --- a/Documentation/ia64/aliasing.txt +++ b/Documentation/ia64/aliasing.txt @@ -112,6 +112,18 @@ POTENTIAL ATTRIBUTE ALIASING CASES The /dev/mem mmap constraints apply. + mmap of /proc/bus/pci/.../??.? + + This is an MMIO mmap of PCI functions, which additionally may or + may not be requested as using the WC attribute. + + If WC is requested, and the region in kern_memmap is either WC + or UC, and the EFI memory map designates the region as WC, then + the WC mapping is allowed. + + Otherwise, the user mapping must use the same attribute as the + kernel mapping. + read/write of /dev/mem This uses copy_from_user(), which implicitly uses a kernel diff --git a/Documentation/ioctl-number.txt b/Documentation/ioctl-number.txt index 3de7d379cf07..5c7fbf9d96b4 100644 --- a/Documentation/ioctl-number.txt +++ b/Documentation/ioctl-number.txt @@ -67,7 +67,7 @@ Code Seq# Include File Comments 0x00 00-1F linux/wavefront.h conflict! 0x02 all linux/fd.h 0x03 all linux/hdreg.h -0x04 all linux/umsdos_fs.h +0x04 D2-DC linux/umsdos_fs.h Dead since 2.6.11, but don't reuse these. 
0x06 all linux/lp.h 0x09 all linux/md.h 0x12 all linux/fs.h diff --git a/Documentation/ja_JP/HOWTO b/Documentation/ja_JP/HOWTO new file mode 100644 index 000000000000..b2446a090870 --- /dev/null +++ b/Documentation/ja_JP/HOWTO @@ -0,0 +1,650 @@ +NOTE: +This is Japanese translated version of "Documentation/HOWTO". +This one is maintained by Tsugikazu Shibata <tshibata@ab.jp.nec.com> +and JF Project team <www.linux.or.jp/JF>. +If you find difference with original file or problem in translation, +please contact maintainer of this file or JF project. + +Please also note that purpose of this file is easier to read for non +English natives and not to be intended to fork. So, if you have any +comments or updates of this file, please try to update Original(English) +file at first. + +Last Updated: 2007/06/04 +================================== +これは、 +linux-2.6.21/Documentation/HOWTO +の和訳です。 + +翻訳団体: JF プロジェクト < http://www.linux.or.jp/JF/ > +翻訳日: 2007/06/04 +翻訳者: Tsugikazu Shibata <tshibata at ab dot jp dot nec dot com> +校正者: 松倉さん <nbh--mats at nifty dot com> + 小林 雅典さん (Masanori Kobayasi) <zap03216 at nifty dot ne dot jp> + 武井伸光さん、<takei at webmasters dot gr dot jp> + かねこさん (Seiji Kaneko) <skaneko at a2 dot mbn dot or dot jp> + 野口さん (Kenji Noguchi) <tokyo246 at gmail dot com> + 河内さん (Takayoshi Kochi) <t-kochi at bq dot jp dot nec dot com> + 岩本さん (iwamoto) <iwamoto.kn at ncos dot nec dot co dot jp> +================================== + +Linux カーネル開発のやり方 +------------------------------- + +これは上のトピック( Linux カーネル開発のやり方)の重要な事柄を網羅した +ドキュメントです。ここには Linux カーネル開発者になるための方法と +Linux カーネル開発コミュニティと共に活動するやり方を学ぶ方法が含まれて +います。カーネルプログラミングに関する技術的な項目に関することは何も含 +めないようにしていますが、カーネル開発者となるための正しい方向に向かう +手助けになります。 + +もし、このドキュメントのどこかが古くなっていた場合には、このドキュメン +トの最後にリストしたメンテナーにパッチを送ってください。 + +はじめに +--------- + +あなたは Linux カーネルの開発者になる方法を学びたいのでしょうか? そ +れともあなたは上司から「このデバイスの Linux ドライバを書くように」と +言われているのでしょうか? 
+この文書の目的は、あなたが踏むべき手順と、コミュニティと一緒にうまく働 +くヒントを書き下すことで、あなたが知るべき全てのことを教えることです。 +また、このコミュニティがなぜ今うまくまわっているのかという理由の一部も +説明しようと試みています。 + +カーネルは 少量のアーキテクチャ依存部分がアセンブリ言語で書かれている +以外は大部分は C 言語で書かれています。C言語をよく理解していることはカー +ネル開発者には必要です。アーキテクチャ向けの低レベル部分の開発をするの +でなければ、(どんなアーキテクチャでも)アセンブリ(訳注: 言語)は必要あり +ません。以下の本は、C 言語の十分な知識や何年もの経験に取って代わるもの +ではありませんが、少なくともリファレンスとしてはいい本です。 + - "The C Programming Language" by Kernighan and Ritchie [Prentice Hall] + -『プログラミング言語C第2版』(B.W. カーニハン/D.M. リッチー著 石田晴久訳) [共立出版] + - "Practical C Programming" by Steve Oualline [O'Reilly] + - 『C実践プログラミング第3版』(Steve Oualline著 望月康司監訳 谷口功訳) [オライリージャパン] + - "C: A Reference Manual" by Harbison and Steele [Prentice Hall] + - 『新・詳説 C 言語 H&S リファレンス』 + (サミュエル P ハービソン/ガイ L スティール共著 斉藤 信男監訳)[ソフトバンク] + +カーネルは GNU C と GNU ツールチェインを使って書かれています。カーネル +は ISO C89 仕様に準拠して書く一方で、標準には無い言語拡張を多く使って +います。カーネルは標準 C ライブラリとは関係がないといった、C 言語フリー +スタンディング環境です。そのため、C の標準で使えないものもあります。任 +意の long long の除算や浮動小数点は使えません。 +ときどき、カーネルがツールチェインや C 言語拡張に置いている前提がどう +なっているのかわかりにくいことがあり、また、残念なことに決定的なリファ +レンスは存在しません。情報を得るには、gcc の info ページ( info gcc )を +みてください。 + +あなたは既存の開発コミュニティと一緒に作業する方法を学ぼうとしているこ +とに留意してください。そのコミュニティは、コーディング、スタイル、 +開発手順について高度な標準を持つ、多様な人の集まりです。 +地理的に分散した大規模なチームに対してもっともうまくいくとわかったこと +をベースにしながら、これらの標準は長い時間をかけて築かれてきました。 +これらはきちんと文書化されていますから、事前にこれらの標準についてでき +るだけたくさん学んでください。また皆があなたやあなたの会社のやり方に合わ +せてくれると思わないでください。 + +法的問題 +------------ + +Linux カーネルのソースコードは GPL ライセンスの下でリリースされていま +す。ライセンスの詳細については、ソースツリーのメインディレクトリに存在 +する、COPYING のファイルをみてください。もしライセンスについてさらに質 +問があれば、Linux Kernel メーリングリストに質問するのではなく、どうぞ +法律家に相談してください。メーリングリストの人達は法律家ではなく、法的 +問題については彼らの声明はあてにするべきではありません。 + +GPL に関する共通の質問や回答については、以下を参照してください。 + http://www.gnu.org/licenses/gpl-faq.html + +ドキュメント +------------ + +Linux カーネルソースツリーは幅広い範囲のドキュメントを含んでおり、それ +らはカーネルコミュニティと会話する方法を学ぶのに非常に貴重なものです。 +新しい機能がカーネルに追加される場合、その機能の使い方について説明した +新しいドキュメントファイルも追加することを勧めます。 +カーネルの変更が、カーネルがユーザ空間に公開しているインターフェイスの +変更を引き起こす場合、その変更を説明するマニュアルページのパッチや情報 +をマニュアルページのメンテナ mtk-manpages@gmx.net に送ることを勧めます。 + 
+以下はカーネルソースツリーに含まれている読んでおくべきファイルの一覧で +す- + + README + このファイルは Linuxカーネルの簡単な背景とカーネルを設定(訳注 + configure )し、生成(訳注 build )するために必要なことは何かが書かれ + ています。カーネルに関して初めての人はここからスタートするとよいで + しょう。 + + Documentation/Changes + このファイルはカーネルをうまく生成(訳注 build )し、走らせるのに最 + 小限のレベルで必要な数々のソフトウェアパッケージの一覧を示してい + ます。 + + Documentation/CodingStyle + これは Linux カーネルのコーディングスタイルと背景にある理由を記述 + しています。全ての新しいコードはこのドキュメントにあるガイドライン + に従っていることを期待されています。大部分のメンテナーはこれらのルー + ルに従っているものだけを受け付け、多くの人は正しいスタイルのコード + だけをレビューします。 + + Documentation/SubmittingPatches + Documentation/SubmittingDrivers + これらのファイルには、どうやってうまくパッチを作って投稿するかに + ついて非常に詳しく書かれており、以下を含みます(これだけに限らない + けれども) + - Email に含むこと + - Email の形式 + - だれに送るか + これらのルールに従えばうまくいくことを保証することではありません + が (すべてのパッチは内容とスタイルについて精査を受けるので)、 + ルールに従わなければ間違いなくうまくいかないでしょう。 + この他にパッチを作る方法についてのよくできた記述は- + + "The Perfect Patch" + http://www.zip.com.au/~akpm/linux/patches/stuff/tpp.txt + "Linux kernel patch submission format" + http://linux.yyz.us/patch-format.html + + Documentation/stable_api_nonsense.txt + このファイルはカーネルの中に不変のAPIを持たないことにした意識的な + 決断の背景にある理由について書かれています。以下のようなことを含 + んでいます- + - サブシステムとの間に層を作ること(コンパチビリティのため?) 
+ - オペレーティングシステム間のドライバの移植性 + - カーネルソースツリーの素早い変更を遅らせる(もしくは素早い変更 + を妨げる) + このドキュメントは Linux 開発の思想を理解するのに非常に重要です。 + そして、他のOSでの開発者が Linux に移る時にとても重要です。 + + Documentation/SecurityBugs + もし Linux カーネルでセキュリティ問題を発見したように思ったら、こ + のドキュメントのステップに従ってカーネル開発者に連絡し、問題解決を + 支援してください。 + + Documentation/ManagementStyle + このドキュメントは Linux カーネルのメンテナー達がどう行動するか、 + 彼らの手法の背景にある共有されている精神について記述しています。こ + れはカーネル開発の初心者なら(もしくは、単に興味があるだけの人でも) + 重要です。なぜならこのドキュメントは、カーネルメンテナー達の独特な + 行動についての多くの誤解や混乱を解消するからです。 + + Documentation/stable_kernel_rules.txt + このファイルはどのように stable カーネルのリリースが行われるかのルー + ルが記述されています。そしてこれらのリリースの中のどこかで変更を取 + り入れてもらいたい場合に何をすればいいかが示されています。 + + Documentation/kernel-docs.txt + カーネル開発に付随する外部ドキュメントのリストです。もしあなたが + 探しているものがカーネル内のドキュメントでみつからなかった場合、 + このリストをあたってみてください。 + + Documentation/applying-patches.txt + パッチとはなにか、パッチをどうやって様々なカーネルの開発ブランチに + 適用するのかについて正確に記述した良い入門書です。 + +カーネルはソースコードから自動的に生成可能な多数のドキュメントを自分自 +身でもっています。これにはカーネル内 API のすべての記述や、どう正しく +ロックをかけるかの規則が含まれます。このドキュメントは +Documentation/DocBook/ ディレクトリに作られ、以下のように + make pdfdocs + make psdocs + make htmldocs + make mandocs +コマンドを実行するとメインカーネルのソースディレクトリから +それぞれ、PDF, Postscript, HTML, man page の形式で生成されます。 + +カーネル開発者になるには +--------------------------- + +もしあなたが、Linux カーネル開発について何も知らないならば、 +KernelNewbies プロジェクトを見るべきです + http://kernelnewbies.org + +このサイトには役に立つメーリングリストがあり、基本的なカーネル開発に関 +するほとんどどんな種類の質問もできます (既に回答されているようなことを +聞く前にまずはアーカイブを調べてください)。 +またここには、リアルタイムで質問を聞くことができる IRC チャネルや、Linux +カーネルの開発に関して学ぶのに便利なたくさんの役に立つドキュメントがあ +ります。 + +web サイトには、コードの構成、サブシステム、現在存在するプロジェクト(ツ +リーにあるもの無いものの両方)の基本的な管理情報があります。 +ここには、また、カーネルのコンパイルのやり方やパッチの当て方などの間接 +的な基本情報も記述されています。 + +あなたがどこからスタートしてよいかわからないが、Linux カーネル開発コミュ +ニティに参加して何かすることをさがしている場合には、Linux kernel +Janitor's プロジェクトにいけばよいでしょう - + http://janitor.kernelnewbies.org/ +ここはそのようなスタートをするのにうってつけの場所です。ここには、 +Linux カーネルソースツリーの中に含まれる、きれいにし、修正しなければな +らない、単純な問題のリストが記述されています。このプロジェクトに関わる +開発者と一緒に作業することで、あなたのパッチを Linuxカーネルツリーに入 +れるための基礎を学ぶことができ、そしてもしあなたがまだアイディアを持っ +ていない場合には、次にやる仕事の方向性が見えてくるかもしれません。 + 
+もしあなたが、すでにひとまとまりコードを書いていて、カーネルツリーに入 +れたいと思っていたり、それに関する適切な支援を求めたい場合、カーネル +メンターズプロジェクトはそのような皆さんを助けるためにできました。 +ここにはメーリングリストがあり、以下から参照できます + http://selenic.com/mailman/listinfo/kernel-mentors + +実際に Linux カーネルのコードについて修正を加える前に、どうやってその +コードが動作するのかを理解することが必要です。そのためには、特別なツー +ルの助けを借りてでも、それを直接よく読むことが最良の方法です(ほとんど +のトリッキーな部分は十分にコメントしてありますから)。そういうツールで +特におすすめなのは、Linux クロスリファレンスプロジェクトです。これは、 +自己参照方式で、索引がついた web 形式で、ソースコードを参照することが +できます。この最新の素晴しいカーネルコードのリポジトリは以下で見つかり +ます- + http://sosdg.org/~coywolf/lxr/ + +開発プロセス +----------------------- + +Linux カーネルの開発プロセスは現在幾つかの異なるメインカーネル「ブラン +チ」と多数のサブシステム毎のカーネルブランチから構成されます。 +これらのブランチとは- + - メインの 2.6.x カーネルツリー + - 2.6.x.y -stable カーネルツリー + - 2.6.x -git カーネルパッチ + - 2.6.x -mm カーネルパッチ + - サブシステム毎のカーネルツリーとパッチ + +2.6.x カーネルツリー +----------------- + +2.6.x カーネルは Linus Torvalds によってメンテナンスされ、kernel.org +の pub/linux/kernel/v2.6/ ディレクトリに存在します。この開発プロセスは +以下のとおり- + + - 新しいカーネルがリリースされた直後に、2週間の特別期間が設けられ、 + この期間中に、メンテナー達は Linus に大きな差分を送ることができま + す。このような差分は通常 -mm カーネルに数週間含まれてきたパッチで + す。 大きな変更は git(カーネルのソース管理ツール、詳細は + http://git.or.cz/ 参照) を使って送るのが好ましいやり方ですが、パッ + チファイルの形式のまま送るのでも十分です。 + + - 2週間後、-rc1 カーネルがリリースされ、この後にはカーネル全体の安定 + 性に影響をあたえるような新機能は含まない類のパッチしか取り込むこと + はできません。新しいドライバ(もしくはファイルシステム)のパッチは + -rc1 の後で受け付けられることもあることを覚えておいてください。な + ぜなら、変更が独立していて、追加されたコードの外の領域に影響を与え + ない限り、退行のリスクは無いからです。-rc1 がリリースされた後、 + Linus へパッチを送付するのに git を使うこともできますが、パッチは + レビューのために、パブリックなメーリングリストへも同時に送る必要が + あります。 + + - 新しい -rc は Linus が、最新の git ツリーがテスト目的であれば十分 + に安定した状態にあると判断したときにリリースされます。目標は毎週新 + しい -rc カーネルをリリースすることです。 + + - このプロセスはカーネルが 「準備ができた」と考えられるまで継続しま + す。このプロセスはだいたい 6週間継続します。 + +Andrew Morton が Linux-kernel メーリングリストにカーネルリリースについ +て書いたことをここで言っておくことは価値があります- + 「カーネルがいつリリースされるかは誰も知りません。なぜなら、これは現 + 実に認識されたバグの状況によりリリースされるのであり、前もって決めら + れた計画によってリリースされるものではないからです。」 + +2.6.x.y -stable カーネルツリー +--------------------------- + +バージョンに4つ目の数字がついたカーネルは -stable カーネルです。これに +は、2.6.x カーネルで見つかったセキュリティ問題や重大な後戻りに対する比 +較的小さい重要な修正が含まれます。 + +これは、開発/実験的バージョンのテストに協力することに興味が無く、 
+最新の安定したカーネルを使いたいユーザに推奨するブランチです。 + +もし、2.6.x.y カーネルが存在しない場合には、番号が一番大きい 2.6.x +が最新の安定版カーネルです。 + +2.6.x.y は "stable" チーム <stable@kernel.org> でメンテされており、だ +いたい隔週でリリースされています。 + +カーネルツリーに入っている、Documentation/stable_kernel_rules.txt ファ +イルにはどのような種類の変更が -stable ツリーに受け入れ可能か、またリ +リースプロセスがどう動くかが記述されています。 + +2.6.x -git パッチ +------------------ + +git リポジトリで管理されているLinus のカーネルツリーの毎日のスナップ +ショットがあります。(だから -git という名前がついています)。これらのパッ +チはおおむね毎日リリースされており、Linus のツリーの現状を表します。こ +れは -rc カーネルと比べて、パッチが大丈夫かどうかも確認しないで自動的 +に生成されるので、より実験的です。 + +2.6.x -mm カーネルパッチ +------------------------ + +Andrew Morton によってリリースされる実験的なカーネルパッチ群です。 +Andrew は個別のサブシステムカーネルツリーとパッチを全て集めてきて +linux-kernel メーリングリストで収集された多数のパッチと同時に一つにま +とめます。 +このツリーは新機能とパッチが検証される場となります。ある期間の間パッチ +が -mm に入って価値を証明されたら、Andrew やサブシステムメンテナが、メ +インラインへ入れるように Linus にプッシュします。 + +メインカーネルツリーに含めるために Linus に送る前に、すべての新しいパッ +チが -mm ツリーでテストされることが強く推奨されます。 + +これらのカーネルは安定して動作すべきシステムとして使うのには適切ではあ +りませんし、カーネルブランチの中でももっとも動作にリスクが高いものです。 + +もしあなたが、カーネル開発プロセスの支援をしたいと思っているのであれば、 +どうぞこれらのカーネルリリースをテストに使ってみて、そしてもし問題があ +れば、またもし全てが正しく動作したとしても、linux-kernel メーリングリ +ストにフィードバックを提供してください。 + +すべての他の実験的パッチに加えて、これらのカーネルは通常リリース時点で +メインラインの -git カーネルに含まれる全ての変更も含んでいます。 + +-mm カーネルは決まったスケジュールではリリースされません、しかし通常幾 +つかの -mm カーネル (1 から 3 が普通)が各-rc カーネルの間にリリースさ +れます。 + +サブシステム毎のカーネルツリーとパッチ +------------------------------------------- + +カーネルの様々な領域で何が起きているかを見られるようにするため、多くの +カーネルサブシステム開発者は彼らの開発ツリーを公開しています。これらの +ツリーは説明したように -mm カーネルリリースに入れ込まれます。 + +以下はさまざまなカーネルツリーの中のいくつかのリスト- + + git ツリー- + - Kbuild の開発ツリー、Sam Ravnborg <sam@ravnborg.org> + kernel.org:/pub/scm/linux/kernel/git/sam/kbuild.git + + - ACPI の開発ツリー、 Len Brown <len.brown@intel.com> + kernel.org:/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6.git + + - Block の開発ツリー、Jens Axboe <axboe@suse.de> + kernel.org:/pub/scm/linux/kernel/git/axboe/linux-2.6-block.git + + - DRM の開発ツリー、Dave Airlie <airlied@linux.ie> + kernel.org:/pub/scm/linux/kernel/git/airlied/drm-2.6.git + + - ia64 の開発ツリー、Tony Luck <tony.luck@intel.com> + 
kernel.org:/pub/scm/linux/kernel/git/aegl/linux-2.6.git + + - ieee1394 の開発ツリー、Jody McIntyre <scjody@modernduck.com> + kernel.org:/pub/scm/linux/kernel/git/scjody/ieee1394.git + + - infiniband, Roland Dreier <rolandd@cisco.com> + kernel.org:/pub/scm/linux/kernel/git/roland/infiniband.git + + - libata, Jeff Garzik <jgarzik@pobox.com> + kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev.git + + - ネットワークドライバ, Jeff Garzik <jgarzik@pobox.com> + kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6.git + + - pcmcia, Dominik Brodowski <linux@dominikbrodowski.net> + kernel.org:/pub/scm/linux/kernel/git/brodo/pcmcia-2.6.git + + - SCSI, James Bottomley <James.Bottomley@SteelEye.com> + kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6.git + + その他の git カーネルツリーは http://kernel.org/git に一覧表がありま + す。 + + quilt ツリー- + - USB, PCI ドライバコアと I2C, Greg Kroah-Hartman <gregkh@suse.de> + kernel.org/pub/linux/kernel/people/gregkh/gregkh-2.6/ + +バグレポート +------------- + +bugzilla.kernel.org は Linux カーネル開発者がカーネルのバグを追跡する +場所です。ユーザは見つけたバグの全てをこのツールで報告すべきです。 +どう kernel bugzilla を使うかの詳細は、以下を参照してください- + http://test.kernel.org/bugzilla/faq.html + +メインカーネルソースディレクトリにあるファイル REPORTING-BUGS はカーネ +ルバグらしいものについてどうレポートするかの良いテンプレートであり、問 +題の追跡を助けるためにカーネル開発者にとってどんな情報が必要なのかの詳 +細が書かれています。 + +メーリングリスト +------------- + +上のいくつかのドキュメントで述べていますが、コアカーネル開発者の大部分 +は Linux kernel メーリングリストに参加しています。このリストの登録/脱 +退の方法については以下を参照してください- + http://vger.kernel.org/vger-lists.html#linux-kernel + +このメーリングリストのアーカイブは web 上の多数の場所に存在します。こ +れらのアーカイブを探すにはサーチエンジンを使いましょう。例えば- + http://dir.gmane.org/gmane.linux.kernel + +リストに投稿する前にすでにその話題がアーカイブに存在するかどうかを検索 +することを是非やってください。多数の事がすでに詳細に渡って議論されて +おり、アーカイブにのみ記録されています。 + +大部分のカーネルサブシステムも自分の個別の開発を実施するメーリングリス +トを持っています。個々のグループがどんなリストを持っているかは、 +MAINTAINERS ファイルにリストがありますので参照してください。 + +多くのリストは kernel.org でホストされています。これらの情報は以下にあ +ります- + http://vger.kernel.org/vger-lists.html + +メーリングリストを使う場合、良い行動習慣に従うようにしましょう。 +少し安っぽいが、以下の URL は上のリスト(や他のリスト)で会話する場合の +シンプルなガイドラインを示しています- + 
http://www.albion.com/netiquette/ + +もし複数の人があなたのメールに返事をした場合、CC: で受ける人のリストは +だいぶ多くなるでしょう。良い理由がない場合、CC: リストから誰かを削除を +しないように、また、メーリングリストのアドレスだけにリプライすることの +ないようにしましょう。1つは送信者から、もう1つはリストからのように、メー +ルを2回受けることになってもそれに慣れ、しゃれたメールヘッダーを追加し +てこの状態を変えようとしないように。人々はそのようなことは好みません。 + +今までのメールでのやりとりとその間のあなたの発言はそのまま残し、 +"John Kernlehacker wrote ...:" の行をあなたのリプライの先頭行にして、 +メールの先頭でなく、各引用行の間にあなたの言いたいことを追加するべきで +す。 + +もしパッチをメールに付ける場合は、Documentaion/SubmittingPatches に提 +示されているように、それは プレーンな可読テキストにすることを忘れない +ようにしましょう。カーネル開発者は 添付や圧縮したパッチを扱いたがりま +せん- +彼らはあなたのパッチの行毎にコメントを入れたいので、そのためにはそうす +るしかありません。あなたのメールプログラムが空白やタブを圧縮しないよう +に確認した方がいいです。最初の良いテストとしては、自分にメールを送って +みて、そのパッチを自分で当ててみることです。もしそれがうまく行かないな +ら、あなたのメールプログラムを直してもらうか、正しく動くように変えるべ +きです。 + +とりわけ、他の登録者に対する尊敬を表すようにすることを覚えておいてくだ +さい。 + +コミュニティと共に働くこと +-------------------------- + +カーネルコミュニティのゴールは可能なかぎり最高のカーネルを提供すること +です。あなたがパッチを受け入れてもらうために投稿した場合、それは、技術 +的メリットだけがレビューされます。その際、あなたは何を予想すべきでしょ +うか? + - 批判 + - コメント + - 変更の要求 + - パッチの正当性の証明要求 + - 沈黙 + +思い出してください、ここはあなたのパッチをカーネルに入れる話です。あ +なたは、あなたのパッチに対する批判とコメントを受け入れるべきで、それら +を技術的レベルで評価して、パッチを再作成するか、なぜそれらの変更をすべ +きでないかを明確で簡潔な理由の説明を提供してください。 +もし、あなたのパッチに何も反応がない場合、たまにはメールの山に埋もれて +見逃され、あなたの投稿が忘れられてしまうこともあるので、数日待って再度 +投稿してください。 + +あなたがやるべきでないものは? 
+ - 質問なしにあなたのパッチが受け入れられると想像すること + - 守りに入ること + - コメントを無視すること + - 要求された変更を何もしないでパッチを出し直すこと + +可能な限り最高の技術的解決を求めているコミュニティでは、パッチがどのく +らい有益なのかについては常に異なる意見があります。あなたは協調的である +べきですし、また、あなたのアイディアをカーネルに対してうまく合わせるよ +うにすることが望まれています。もしくは、最低限あなたのアイディアがそれ +だけの価値があるとすすんで証明するようにしなければなりません。 +正しい解決に向かって進もうという意志がある限り、間違うことがあっても許 +容されることを忘れないでください。 + +あなたの最初のパッチに単に 1ダースもの修正を求めるリストの返答になるこ +とも普通のことです。これはあなたのパッチが受け入れられないということで +は *ありません*、そしてあなた自身に反対することを意味するのでも *ありま +せん*。単に自分のパッチに対して指摘された問題を全て修正して再送すれば +いいのです。 + +カーネルコミュニティと企業組織のちがい +----------------------------------------------------------------- + +カーネルコミュニティは大部分の伝統的な会社の開発環境とは異ったやり方で +動いています。以下は問題を避けるためにできるとよいことののリストです- + + あなたの提案する変更について言うときのうまい言い方: + + - "これは複数の問題を解決します" + - "これは2000行のコードを削除します" + - "以下のパッチは、私が言おうとしていることを説明するものです" + - "私はこれを5つの異なるアーキテクチャでテストしたのですが..." + - "以下は一連の小さなパッチ群ですが..." + - "これは典型的なマシンでの性能を向上させます.." + + やめた方がいい悪い言い方: + + - このやり方で AIX/ptx/Solaris ではできたので、できるはずだ + - 私はこれを20年もの間やってきた、だから + - これは、私の会社が金儲けをするために必要だ + - これは我々のエンタープライズ向け商品ラインのためである + - これは 私が自分のアイディアを記述した、1000ページの設計資料である + - 私はこれについて、6ケ月作業している。 + - 以下は ... に関する5000行のパッチです + - 私は現在のぐちゃぐちゃを全部書き直した、それが以下です... 
+ - 私は〆切がある、そのためこのパッチは今すぐ適用される必要がある + +カーネルコミュニティが大部分の伝統的なソフトウェアエンジニアリングの労 +働環境と異なるもう一つの点は、やりとりに顔を合わせないということです。 +email と irc を第一のコミュニケーションの形とする一つの利点は、性別や +民族の差別がないことです。Linux カーネルの職場環境は女性や少数民族を受 +容します。なぜなら、email アドレスによってのみあなたが認識されるからで +す。 +国際的な側面からも活動領域を均等にするようにします。なぜならば、あなた +は人の名前で性別を想像できないからです。ある男性が アンドレアという名 +前で、女性の名前は パット かもしれません (訳注 Andrea は米国では女性、 +それ以外(欧州など)では男性名として使われることが多い。同様に、Pat は +Patricia (主に女性名)や Patrick (主に男性名)の略称)。 +Linux カーネルの活動をして、意見を表明したことがある大部分の女性は、前 +向きな経験をもっています。 + +言葉の壁は英語が得意でない一部の人には問題になります。 +メーリングリストの中できちんとアイディアを交換するには、相当うまく英語 +を操れる必要があることもあります。そのため、あなたは自分のメール +を送る前に英語で意味が通じているかをチェックすることをお薦めします。 + +変更を分割する +--------------------- + +Linux カーネルコミュニティは、一度に大量のコードの塊を喜んで受容するこ +とはありません。変更は正確に説明される必要があり、議論され、小さい、個 +別の部分に分割する必要があります。これはこれまで多くの会社がやり慣れて +きたことと全く正反対のことです。あなたのプロポーザルは、開発プロセスのと +ても早い段階から紹介されるべきです。そうすれば あなたは自分のやってい +ることにフィードバックを得られます。これは、コミュニティからみれば、あ +なたが彼らと一緒にやっているように感じられ、単にあなたの提案する機能の +ゴミ捨て場として使っているのではない、と感じられるでしょう。 +しかし、一度に 50 もの email をメーリングリストに送りつけるようなことは +やってはいけません、あなたのパッチ群はいつもどんな時でもそれよりは小さ +くなければなりません。 + +パッチを分割する理由は以下です- + +1) 小さいパッチはあなたのパッチが適用される見込みを大きくします、カー + ネルの人達はパッチが正しいかどうかを確認する時間や労力をかけないか + らです。5行のパッチはメンテナがたった1秒見るだけで適用できます。し + かし、500行のパッチは、正しいことをレビューするのに数時間かかるかも + しれません(時間はパッチのサイズなどにより指数関数に比例してかかりま + す) + 小さいパッチは何かあったときにデバッグもとても簡単になります。パッ + チを1個1個取り除くのは、とても大きなパッチを当てた後に(かつ、何かお + かしくなった後で)解剖するのに比べればとても簡単です。 + +2) 小さいパッチを送るだけでなく、送るまえに、書き直して、シンプルにす + る(もしくは、単に順番を変えるだけでも)ことも、とても重要です。 + +以下はカーネル開発者の Al Viro のたとえ話しです: + + "生徒の数学の宿題を採点する先生のことを考えてみてください、先 + 生は生徒が解に到達するまでの試行錯誤をみたいとは思わないでしょ + う。先生は簡潔な最高の解をみたいのです。良い生徒はこれを知って + おり、そして最終解の前の中間作業を提出することは決してないので + す" + カーネル開発でもこれは同じです。メンテナー達とレビューア達は、 + 問題を解決する解の背後になる思考プロセスをみたいとは思いません。 + 彼らは単純であざやかな解決方法をみたいのです。 + +あざやかな解を説明するのと、コミュニティと共に仕事をし、未解決の仕事を +議論することのバランスをキープするのは難しいかもしれません。 +ですから、開発プロセスの早期段階で改善のためのフィードバックをもらうよ +うにするのもいいですが、変更点を小さい部分に分割して全体ではまだ完成し +ていない仕事を(部分的に)取り込んでもらえるようにすることもいいことです。 + +また、でき上がっていないものや、"将来直す" ようなパッチを、本流に含め +てもらうように送っても、それは受け付けられないことを理解してください。 + +あなたの変更を正当化する 
+------------------- + +あなたのパッチを分割するのと同時に、なぜその変更を追加しなければならな +いかを Linux コミュニティに知らせることはとても重要です。新機能は必要 +性と有用性で正当化されなければなりません。 + +あなたの変更の説明 +-------------------- + +あなたのパッチを送付する場合には、メールの中のテキストで何を言うかにつ +いて、特別に注意を払ってください。この情報はパッチの ChangeLog に使わ +れ、いつも皆がみられるように保管されます。これは次のような項目を含め、 +パッチを完全に記述するべきです- + + - なぜ変更が必要か + - パッチ全体の設計アプローチ + - 実装の詳細 + - テスト結果 + +これについて全てがどのようにあるべきかについての詳細は、以下のドキュメ +ントの ChangeLog セクションをみてください- + "The Perfect Patch" + http://www.zip.com.au/~akpm/linux/patches/stuff/tpp.txt + +これらのどれもが、時にはとても困難です。これらの慣例を完璧に実施するに +は数年かかるかもしれません。これは継続的な改善のプロセスであり、そのた +めには多数の忍耐と決意を必要とするものです。でも、諦めないで、これは可 +能なことです。多数の人がすでにできていますし、彼らも皆最初はあなたと同 +じところからスタートしたのですから。 + +Paolo Ciarrocchi に感謝、彼は彼の書いた "Development Process" +(http://linux.tar.bz/articles/2.6-development_process)セクショ +ンをこのテキストの原型にすることを許可してくれました。 +Rundy Dunlap と Gerrit Huizenga はメーリングリストでやるべきこととやっ +てはいけないことのリストを提供してくれました。 +以下の人々のレビュー、コメント、貢献に感謝。 +Pat Mochel, Hanna Linder, Randy Dunlap, Kay Sievers, +Vojtech Pavlik, Jan Kara, Josh Boyer, Kees Cook, Andrew Morton, Andi +Kleen, Vadim Lobanov, Jesper Juhl, Adrian Bunk, Keri Harris, Frans Pop, +David A. Wheeler, Junio Hamano, Michael Kerrisk, と Alex Shepard +彼らの支援なしでは、このドキュメントはできなかったでしょう。 + +Maintainer: Greg Kroah-Hartman <greg@kroah.com> diff --git a/Documentation/ja_JP/stable_api_nonsense.txt b/Documentation/ja_JP/stable_api_nonsense.txt new file mode 100644 index 000000000000..b3f2b27f0881 --- /dev/null +++ b/Documentation/ja_JP/stable_api_nonsense.txt @@ -0,0 +1,263 @@ +NOTE: +This is a Japanese translated version of +"Documentation/stable_api_nonsense.txt". +This one is maintained by +IKEDA, Munehiro <m-ikeda@ds.jp.nec.com> +and JF Project team <http://www.linux.or.jp/JF/>. +If you find difference with original file or problem in translation, +please contact the maintainer of this file or JF project. + +Please also note that purpose of this file is easier to read for non +English natives and not to be intended to fork. 
So, if you have any +comments or updates of this file, please try to update +Original(English) file at first. + +================================== +これは、 +linux-2.6.22-rc4/Documentation/stable_api_nonsense.txt の和訳 +です。 +翻訳団体: JF プロジェクト < http://www.linux.or.jp/JF/ > +翻訳日 : 2007/06/11 +原著作者: Greg Kroah-Hartman < greg at kroah dot com > +翻訳者 : 池田 宗広 < m-ikeda at ds dot jp dot nec dot com > +校正者 : Masanori Kobayashi さん < zap03216 at nifty dot ne dot jp > + Seiji Kaneko さん < skaneko at a2 dot mbn dot or dot jp > +================================== + + + +Linux カーネルのドライバインターフェース +(あなたの質問すべてに対する回答とその他諸々) + +Greg Kroah-Hartman <greg at kroah dot com> + + +この文書は、なぜ Linux ではバイナリカーネルインターフェースが定義 +されていないのか、またはなぜ不変のカーネルインターフェースを持たな +いのか、ということを説明するために書かれた。ここでの話題は「カーネ +ル内部の」インターフェースについてであり、ユーザー空間とのインター +フェースではないことを理解してほしい。カーネルとユーザー空間とのイ +ンターフェースとはアプリケーションプログラムが使用するものであり、 +つまりシステムコールのインターフェースがこれに当たる。これは今まで +長きに渡り、かつ今後も「まさしく」不変である。私は確か 0.9 か何か +より前のカーネルを使ってビルドした古いプログラムを持っているが、そ +れは最新の 2.6 カーネルでもきちんと動作する。ユーザー空間とのイン +ターフェースは、ユーザーとアプリケーションプログラマが不変性を信頼 +してよいものの一つである。 + + +要旨 +---- + +あなたは不変のカーネルインターフェースが必要だと考えているかもしれ +ないが、実際のところはそうではない。あなたは必要としているものが分 +かっていない。あなたが必要としているものは安定して動作するドライバ +であり、それはドライバがメインのカーネルツリーに含まれる場合のみ得 +ることができる。ドライバがメインのカーネルツリーに含まれていると、 +他にも多くの良いことがある。それは、Linux をより強固で、安定な、成 +熟したオペレーティングシステムにすることができるということだ。これ +こそ、そもそもあなたが Linux を使う理由のはずだ。 + + +はじめに +-------- + +カーネル内部のインターフェース変更を心配しなければならないドライバ +を書きたいなどというのは、変わり者だけだ。この世界のほとんどの人は、 +そのようなドライバがどんなインターフェースを使っているかなど知らな +いし、そんなドライバのことなど全く気にもかけていない。 + + +まず初めに、クローズソースとか、ソースコードの隠蔽とか、バイナリの +みが配布される使い物にならない代物[訳注(1)]とか、実体はバイナリ +コードでそれを読み込むためのラッパー部分のみソースコードが公開され +ているとか、その他用語は何であれ GPL の下にソースコードがリリース +されていないカーネルドライバに関する法的な問題について、私は「いか +なる議論も」行うつもりがない。法的な疑問があるのならば、プログラマ +である私ではなく、弁護士に相談して欲しい。ここでは単に、技術的な問 +題について述べることにする。(法的な問題を軽視しているわけではない。 +それらは実際に存在するし、あなたはそれをいつも気にかけておく必要が +ある) + +訳注(1) +「使い物にならない代物」の原文は "blob" + + +さてここでは、バイナリカーネルインターフェースについてと、ソースレ +ベルでのインターフェースの不変性について、という二つの話題を取り上 +げる。この二つは互いに依存する関係にあるが、まずはバイナリインター 
+フェースについて議論を行いやっつけてしまおう。 + + +バイナリカーネルインターフェース +-------------------------------- + +もしソースレベルでのインターフェースが不変ならば、バイナリインター +フェースも当然のように不変である、というのは正しいだろうか?正しく +ない。Linux カーネルに関する以下の事実を考えてみてほしい。 + - あなたが使用するCコンパイラのバージョンによって、カーネル内部 + の構造体の配置構造は異なったものになる。また、関数は異なった方 + 法でカーネルに含まれることになるかもしれない(例えばインライン + 関数として扱われたり、扱われなかったりする)。個々の関数がどの + ようにコンパイルされるかはそれほど重要ではないが、構造体のパデ + ィングが異なるというのは非常に重要である。 + - あなたがカーネルのビルドオプションをどのように設定するかによっ + て、カーネルには広い範囲で異なった事態が起こり得る。 + - データ構造は異なるデータフィールドを持つかもしれない + - いくつかの関数は全く実装されていない状態になり得る + (例:SMP向けではないビルドでは、いくつかのロックは中身が + カラにコンパイルされる) + - カーネル内のメモリは、異なった方法で配置され得る。これはビ + ルドオプションに依存している。 + - Linux は様々な異なるプロセッサアーキテクチャ上で動作する。 + あるアーキテクチャ用のバイナリドライバを、他のアーキテクチャで + 正常に動作させる方法はない。 + + +ある特定のカーネル設定を使用し、カーネルをビルドしたのと正確に同じ +Cコンパイラを使用して単にカーネルモジュールをコンパイルするだけで +も、あなたはこれらいくつもの問題に直面することになる。ある特定の +Linux ディストリビューションの、ある特定のリリースバージョン用にモ +ジュールを提供しようと思っただけでも、これらの問題を引き起こすには +十分である。にも関わらず Linux ディストリビューションの数と、サ +ポートするディストリビューションのリリース数を掛け算し、それら一つ +一つについてビルドを行ったとしたら、今度はリリースごとのビルドオプ +ションの違いという悪夢にすぐさま悩まされることになる。また、ディス +トリビューションの各リリースバージョンには、異なるハードウェア(プ +ロセッサタイプや種々のオプション)に対応するため、何種類かのカーネ +ルが含まれているということも理解して欲しい。従って、ある一つのリ +リースバージョンだけのためにモジュールを作成する場合でも、あなたは +何バージョンものモジュールを用意しなければならない。 + + +信じて欲しい。このような方法でサポートを続けようとするなら、あなた +はいずれ正気を失うだろう。遠い昔、私はそれがいかに困難なことか、身 +をもって学んだのだ・・・ + + +不変のカーネルソースレベルインターフェース +------------------------------------------ + +メインカーネルツリーに含まれていない Linux カーネルドライバを継続 +してサポートしていこうとしている人たちとの議論においては、これは極 +めて「引火性の高い」話題である。[訳注(2)] + +訳注(2) +「引火性の高い」の原文は "volatile"。 +volatile には「揮発性の」「爆発しやすい」という意味の他、「変わり +やすい」「移り気な」という意味がある。 +「(この話題は)爆発的に激しい論争を巻き起こしかねない」ということ +を、「(カーネルのソースレベルインターフェースは)移ろい行くもので +ある」ということを連想させる "volatile" という単語で表現している。 + + +Linux カーネルの開発は継続的に速いペースで行われ、決して歩みを緩め +ることがない。その中でカーネル開発者達は、現状のインターフェースに +あるバグを見つけ、より良い方法を考え出す。彼らはやがて、現状のイン +ターフェースがより正しく動作するように修正を行う。その過程で関数の +名前は変更されるかもしれず、構造体は大きく、または小さくなるかもし +れず、関数の引数は検討しなおされるかもしれない。そのような場合、引 +き続き全てが正常に動作するよう、カーネル内でこれらのインターフェー +スを使用している個所も全て同時に修正される。 + + +具体的な例として、カーネル内の USB インターフェースを挙げる。USB 
+サブシステムはこれまでに少なくとも3回の書き直しが行われ、その結果 +インターフェースが変更された。これらの書き直しはいくつかの異なった +問題を修正するために行われた。 + - 同期的データストリームが非同期に変更された。これにより多数のド + ライバを単純化でき、全てのドライバのスループットが向上した。今 + やほとんど全ての USB デバイスは、考えられる最高の速度で動作し + ている。 + - USB ドライバが USB サブシステムのコアから行う、データパケット + 用のメモリ確保方法が変更された。これに伴い、いくつもの文書化さ + れたデッドロック条件を回避するため、全ての USB ドライバはより + 多くの情報を USB コアに提供しなければならないようになっている。 + + +このできごとは、数多く存在するクローズソースのオペレーティングシス +テムとは全く対照的だ。それらは長期に渡り古い USB インターフェース +をメンテナンスしなければならない。古いインターフェースが残ることで、 +新たな開発者が偶然古いインターフェースを使い、正しくない方法で開発 +を行ってしまう可能性が生じる。これによりシステムの安定性は危険にさ +らされることになる。 + + +上に挙げたどちらの例においても、開発者達はその変更が重要かつ必要で +あることに合意し、比較的楽にそれを実行した。もし Linux がソースレ +ベルでインターフェースの不変性を保証しなければならないとしたら、新 +しいインターフェースを作ると同時に、古い、問題のある方を今後ともメ +ンテナンスするという余計な仕事を USB の開発者にさせなければならな +い。Linux の USB 開発者は、自分の時間を使って仕事をしている。よっ +て、価値のない余計な仕事を報酬もなしに実行しろと言うことはできない。 + + +セキュリティ問題も、Linux にとっては非常に重要である。ひとたびセキ +ュリティに関する問題が発見されれば、それは極めて短期間のうちに修正 +される。セキュリティ問題の発生を防ぐための修正は、カーネルの内部イ +ンターフェースの変更を何度も引き起こしてきた。その際同時に、変更さ +れたインターフェースを使用する全てのドライバもまた変更された。これ +により問題が解消し、将来偶然に問題が再発してしまわないことが保証さ +れる。もし内部インターフェースの変更が許されないとしたら、このよう +にセキュリティ問題を修正し、将来再発しないことを保証することなど不 +可能なのだ。 + + +カーネルのインターフェースは時が経つにつれクリーンナップを受ける。 +誰も使っていないインターフェースは削除される。これにより、可能な限 +りカーネルが小さく保たれ、現役の全てのインターフェースが可能な限り +テストされることを保証しているのだ。(使われていないインターフェー +スの妥当性をテストすることは不可能と言っていいだろう) + + + +これから何をすべきか +----------------------- + +では、もしメインのカーネルツリーに含まれない Linux カーネルドライ +バがあったとして、あなたは、つまり開発者は何をするべきだろうか?全 +てのディストリビューションの全てのカーネルバージョン向けにバイナリ +のドライバを供給することは悪夢であり、カーネルインターフェースの変 +更を追いかけ続けることもまた過酷な仕事だ。 + + +答えは簡単。そのドライバをメインのカーネルツリーに入れてしまえばよ +い。(ここで言及しているのは、GPL に従って公開されるドライバのこと +だということに注意してほしい。あなたのコードがそれに該当しないなら +ば、さよなら。幸運を祈ります。ご自分で何とかしてください。Andrew +と Linus からのコメント<Andrew と Linus のコメントへのリンクをこ +こに置く>をどうぞ)ドライバがメインツリーに入れば、カーネルのイン +ターフェースが変更された場合、変更を行った開発者によってドライバも +修正されることになるだろう。あなたはほとんど労力を払うことなしに、 +常にビルド可能できちんと動作するドライバを手に入れることができる。 + + +ドライバをメインのカーネルツリーに入れると、非常に好ましい以下の効 +果がある。 + - ドライバの品質が向上する一方で、(元の開発者にとっての)メンテ + ナンスコストは下がる。 + - あなたのドライバに他の開発者が機能を追加してくれる。 + - 誰かがあなたのドライバにあるバグを見つけ、修正してくれる。 + - 
誰かがあなたのドライバにある改善点を見つけてくれる。 + - 外部インターフェースが変更されドライバの更新が必要になった場合、 + 誰かがあなたの代わりに更新してくれる。 + - ドライバを入れてくれとディストロに頼まなくても、そのドライバは + 全ての Linux ディストリビューションに自動的に含まれてリリース + される。 + + +Linux では、他のどのオペレーティングシステムよりも数多くのデバイス +が「そのまま」使用できるようになった。また Linux は、どのオペレー +ティングシステムよりも数多くのプロセッサアーキテクチャ上でそれらの +デバイスを使用することができるようにもなった。このように、Linux の +開発モデルは実証されており、今後も間違いなく正しい方向へと進んでい +くだろう。:) + + + +------ + +この文書の初期の草稿に対し、Randy Dunlap, Andrew Morton, David +Brownell, Hanna Linder, Robert Love, Nishanth Aravamudan から査読 +と助言を頂きました。感謝申し上げます。 + diff --git a/Documentation/kbuild/makefiles.txt b/Documentation/kbuild/makefiles.txt index bb5306e9a5c3..e08ef8759a07 100644 --- a/Documentation/kbuild/makefiles.txt +++ b/Documentation/kbuild/makefiles.txt @@ -501,6 +501,20 @@ more details, with real examples. The third parameter may be a text as in this example, but it may also be an expanded variable or a macro. + cc-fullversion + cc-fullversion is useful when the exact version of gcc is needed. + One typical use-case is when a specific GCC version is broken. + cc-fullversion points out a more specific version than cc-version does. + + Example: + #arch/powerpc/Makefile + $(Q)if test "$(call cc-fullversion)" = "040200" ; then \ + echo -n '*** GCC-4.2.0 cannot compile the 64-bit powerpc ' ; \ + false ; \ + fi + + In this example for a specific GCC version the build will error out explaining + to the user why it stops. === 4 Host Program support diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 7ce5ea949d1a..fb80e9ffea68 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -34,7 +34,6 @@ parameter is applicable: APIC APIC support is enabled. APM Advanced Power Management support is enabled. AX25 Appropriate AX.25 support is enabled. - CD Appropriate CD support is enabled. DRM Direct Rendering Management support is enabled. 
 	EDD	BIOS Enhanced Disk Drive Services (EDD) is enabled
 	EFI	EFI Partitioning (GPT) is enabled
@@ -170,7 +169,10 @@ and is between 256 and 4096 characters. It is defined in the file
 	acpi_os_name=	[HW,ACPI] Tell ACPI BIOS the name of the OS
 			Format: To spoof as Windows 98: ="Microsoft Windows"
 
-	acpi_osi=	[HW,ACPI] empty param disables _OSI
+	acpi_osi=	[HW,ACPI] Modify list of supported OS interface strings
+			acpi_osi="string1"	# add string1 -- only one string
+			acpi_osi="!string2"	# remove built-in string2
+			acpi_osi=		# disable all strings
 
 	acpi_serialize	[HW,ACPI] force serialization of AML methods
 
@@ -235,16 +237,9 @@ and is between 256 and 4096 characters. It is defined in the file
 			Disable PIN 1 of APIC timer
 			Can be useful to work around chipset bugs.
 
-	ad1816=		[HW,OSS]
-			Format: <io>,<irq>,<dma>,<dma2>
-			See also Documentation/sound/oss/AD1816.
-
 	ad1848=		[HW,OSS]
 			Format: <io>,<irq>,<dma>,<dma2>,<type>
 
-	adlib=		[HW,OSS]
-			Format: <io>
-
 	advansys=	[HW,SCSI]
 			See header of drivers/scsi/advansys.c.
 
@@ -323,9 +318,6 @@ and is between 256 and 4096 characters. It is defined in the file
 
 	autotest	[IA64]
 
-	aztcd=		[HW,CD] Aztech CD268 CDROM driver
-			Format: <io>,0x79 (?)
-
 	baycom_epp=	[HW,AX25]
 			Format: <io>,<mode>
 
@@ -368,10 +360,6 @@ and is between 256 and 4096 characters. It is defined in the file
 			possible to determine what the correct size should be.
 			This option provides an override for these situations.
 
-	cdu31a=		[HW,CD]
-			Format: <io>,<irq>[,PAS]
-			See header of drivers/cdrom/cdu31a.c.
-
 	chandev=	[HW,NET] Generic channel device initialisation
 
 	checkreqprot	[SELINUX] Set initial checkreqprot flag value.
@@ -425,9 +413,6 @@ and is between 256 and 4096 characters. It is defined in the file
 	hpet=		[IA-32,HPET] option to disable HPET and use PIT.
 			Format: disable
 
-	cm206=		[HW,CD]
-			Format: { auto | [<io>,][<irq>] }
-
 	com20020=	[HW,NET] ARCnet - COM20020 chipset
 			Format:
 			<io>[,<irq>[,<nodeID>[,<backplane>[,<ckp>[,<timeout>]]]]]
@@ -459,13 +444,20 @@ and is between 256 and 4096 characters. It is defined in the file
 			Documentation/networking/netconsole.txt for an
 			alternative.
 
-		uart,io,<addr>[,options]
-		uart,mmio,<addr>[,options]
+		uart[8250],io,<addr>[,options]
+		uart[8250],mmio,<addr>[,options]
 			Start an early, polled-mode console on the 8250/16550
 			UART at the specified I/O port or MMIO address,
 			switching to the matching ttyS device later. The
 			options are the same as for ttyS, above.
 
+	earlycon=	[KNL] Output early console device and options.
+		uart[8250],io,<addr>[,options]
+		uart[8250],mmio,<addr>[,options]
+			Start an early, polled-mode console on the 8250/16550
+			UART at the specified I/O port or MMIO address.
+			The options are the same as for ttyS, above.
+
 	cpcihp_generic=	[HW,PCI] Generic port I/O CompactPCI driver
 			Format:
 			<first_slot>,<last_slot>,<port>,<enum_bit>[,<debug>]
@@ -657,9 +649,6 @@ and is between 256 and 4096 characters. It is defined in the file
 	gpt		[EFI] Forces disk with valid GPT signature but
 			invalid Protective MBR to be treated as GPT.
 
-	gscd=		[HW,CD]
-			Format: <io>
-
 	gvp11=		[HW,SCSI]
 
 	hashdist=	[KNL,NUMA] Large hashes allocated during boot
@@ -823,14 +812,37 @@ and is between 256 and 4096 characters. It is defined in the file
 			tasks in the system -- can cause problems and
 			suboptimal load balancer performance.
 
-	isp16=		[HW,CD]
-			Format: <io>,<irq>,<dma>,<setup>
-
 	iucv=		[HW,NET]
 
 	js=		[HW,JOY] Analog joystick
 			See Documentation/input/joystick.txt.
 
+	kernelcore=nn[KMG]	[KNL,IA-32,IA-64,PPC,X86-64] This parameter
+			specifies the amount of memory usable by the kernel
+			for non-movable allocations. The requested amount is
+			spread evenly throughout all nodes in the system. The
+			remaining memory in each node is used for Movable
+			pages. In the event, a node is too small to have both
+			kernelcore and Movable pages, kernelcore pages will
+			take priority and other nodes will have a larger number
+			of kernelcore pages. The Movable zone is used for the
+			allocation of pages that may be reclaimed or moved
+			by the page migration subsystem. This means that
+			HugeTLB pages may not be allocated from this zone.
+			Note that allocations like PTEs-from-HighMem still
+			use the HighMem zone if it exists, and the Normal
+			zone if it does not.
+
+	movablecore=nn[KMG]	[KNL,IA-32,IA-64,PPC,X86-64] This parameter
+			is similar to kernelcore except it specifies the
+			amount of memory used for migratable allocations.
+			If both kernelcore and movablecore is specified,
+			then kernelcore will be at *least* the specified
+			value but may be more. If movablecore on its own
+			is specified, the administrator must be careful
+			that the amount of memory usable for all allocations
+			is not too small.
+
 	keepinitrd	[HW,ARM]
 
 	kstack=N	[IA-32,X86-64] Print N words from the kernel stack
@@ -964,11 +976,6 @@ and is between 256 and 4096 characters. It is defined in the file
 
 	mcatest=	[IA-64]
 
-	mcd=		[HW,CD]
-			Format: <port>,<irq>,<mitsumi_bug_93_wait>
-
-	mcdx=		[HW,CD]
-
 	mce		[IA-32] Machine Check Exception
 
 	md=		[HW] RAID subsystems devices and level
@@ -1011,49 +1018,6 @@ and is between 256 and 4096 characters. It is defined in the file
 
 	mga=		[HW,DRM]
 
-	migration_cost=
-			[KNL,SMP] debug: override scheduler migration costs
-			Format: <level-1-usecs>,<level-2-usecs>,...
-			This debugging option can be used to override the
-			default scheduler migration cost matrix. The numbers
-			are indexed by 'CPU domain distance'.
-			E.g. migration_cost=1000,2000,3000 on an SMT NUMA
-			box will set up an intra-core migration cost of
-			1 msec, an inter-core migration cost of 2 msecs,
-			and an inter-node migration cost of 3 msecs.
-
-			WARNING: using the wrong values here can break
-			scheduler performance, so it's only for scheduler
-			development purposes, not production environments.
-
-	migration_debug=
-			[KNL,SMP] migration cost auto-detect verbosity
-			Format=<0|1|2>
-			If a system's migration matrix reported at bootup
-			seems erroneous then this option can be used to
-			increase verbosity of the detection process.
-			We default to 0 (no extra messages), 1 will print
-			some more information, and 2 will be really
-			verbose (probably only useful if you also have a
-			serial console attached to the system).
-
-	migration_factor=
-			[KNL,SMP] multiply/divide migration costs by a factor
-			Format=<percent>
-			This debug option can be used to proportionally
-			increase or decrease the auto-detected migration
-			costs for all entries of the migration matrix.
-			E.g. migration_factor=150 will increase migration
-			costs by 50%. (and thus the scheduler will be less
-			eager migrating cache-hot tasks)
-			migration_factor=80 will decrease migration costs
-			by 20%. (thus the scheduler will be more eager to
-			migrate tasks)
-
-			WARNING: using the wrong values here can break
-			scheduler performance, so it's only for scheduler
-			development purposes, not production environments.
-
 	mousedev.tap_time=
 			[MOUSE] Maximum time between finger touching and
 			leaving touchpad surface for touch to be considered
@@ -1190,6 +1154,8 @@ and is between 256 and 4096 characters. It is defined in the file
 
 	nointroute	[IA-64]
 
+	nojitter	[IA64] Disables jitter checking for ITC timers.
+
 	nolapic		[IA-32,APIC] Do not enable or use the local APIC.
 
 	nolapic_timer	[IA-32,APIC] Do not use the local APIC timer.
@@ -1221,6 +1187,8 @@ and is between 256 and 4096 characters. It is defined in the file
 
 	nosmp		[SMP] Tells an SMP kernel to act as a UP kernel.
 
+	nosoftlockup	[KNL] Disable the soft-lockup detector.
+
 	nosync		[HW,M68K] Disables sync negotiation for all devices.
 
 	notsc		[BUGS=IA-32] Disable Time Stamp Counter
@@ -1229,20 +1197,19 @@ and is between 256 and 4096 characters. It is defined in the file
 
 	nowb		[ARM]
 
+	numa_zonelist_order= [KNL, BOOT] Select zonelist order for NUMA.
+			one of ['zone', 'node', 'default'] can be specified
+			This can be set from sysctl after boot.
+			See Documentation/sysctl/vm.txt for details.
+
 	nr_uarts=	[SERIAL] maximum number of UARTs to be registered.
 
 	opl3=		[HW,OSS]
 			Format: <io>
 
-	opl3sa2=	[HW,OSS] Format:
-			<io>,<irq>,<dma>,<dma2>,<mss_io>,<mpu_io>,<ymode>,<loopback>[,<isapnp>,<multiple]
-
 	oprofile.timer=	[HW]
 			Use timer interrupt instead of performance counters
 
-	optcd=		[HW,CD]
-			Format: <io>
-
 	osst=		[HW,SCSI] SCSI Tape Driver
 			Format: <buffer_size>,<write_threshold>
 			See also Documentation/scsi/st.txt.
@@ -1421,6 +1388,15 @@ and is between 256 and 4096 characters. It is defined in the file
 			autoconfiguration.
 			Ranges are in pairs (memory base and size).
 
+	print-fatal-signals=
+			[KNL] debug: print fatal signals
+			print-fatal-signals=1: print segfault info to
+			the kernel console.
+			default: off.
+
+	printk.time=	Show timing data prefixed to each printk message line
+			Format: <bool> (1/Y/y=enable, 0/N/n=disable)
+
 	profile=	[KNL] Enable kernel profiling via /proc/profile
 			Format: [schedule,]<number>
 			Param: "schedule" - profile schedule points.
@@ -1533,6 +1509,10 @@ and is between 256 and 4096 characters. It is defined in the file
 
 	rootfstype=	[KNL] Set root filesystem type
 
+	rootwait	[KNL] Wait (indefinitely) for root device to show up.
+			Useful for devices that are detected asynchronously
+			(e.g. USB and MMC devices).
+
 	rw		[KNL] Mount root device read-write on boot
 
 	S		[KNL] Run init in single mode
@@ -1545,11 +1525,6 @@ and is between 256 and 4096 characters. It is defined in the file
 
 	sbni=		[NET] Granch SBNI12 leased line adapter
 
-	sbpcd=		[HW,CD] Soundblaster CD adapter
-			Format: <io>,<type>
-			See a comment before function sbpcd_setup() in
-			drivers/cdrom/sbpcd.c.
- sc1200wdt= [HW,WDT] SC1200 WDT (watchdog) driver Format: <io>[,<timeout>[,<isapnp>]] @@ -1602,41 +1577,41 @@ and is between 256 and 4096 characters. It is defined in the file simeth= [IA-64] simscsi= - sjcd= [HW,CD] - Format: <io>,<irq>,<dma> - See header of drivers/cdrom/sjcd.c. - slram= [HW,MTD] - slub_debug [MM, SLUB] - Enabling slub_debug allows one to determine the culprit - if slab objects become corrupted. Enabling slub_debug - creates guard zones around objects and poisons objects - when not in use. Also tracks the last alloc / free. - For more information see Documentation/vm/slub.txt. + slub_debug[=options[,slabs]] [MM, SLUB] + Enabling slub_debug allows one to determine the + culprit if slab objects become corrupted. Enabling + slub_debug can create guard zones around objects and + may poison objects when not in use. Also tracks the + last alloc / free. For more information see + Documentation/vm/slub.txt. slub_max_order= [MM, SLUB] - Determines the maximum allowed order for slabs. Setting - this too high may cause fragmentation. - For more information see Documentation/vm/slub.txt. + Determines the maximum allowed order for slabs. + A high setting may cause OOMs due to memory + fragmentation. For more information see + Documentation/vm/slub.txt. slub_min_objects= [MM, SLUB] - The minimum objects per slab. SLUB will increase the - slab order up to slub_max_order to generate a - sufficiently big slab to satisfy the number of objects. - The higher the number of objects the smaller the overhead - of tracking slabs. + The minimum number of objects per slab. SLUB will + increase the slab order up to slub_max_order to + generate a sufficiently large slab able to contain + the number of objects indicated. The higher the number + of objects the smaller the overhead of tracking slabs + and the less frequently locks need to be acquired. For more information see Documentation/vm/slub.txt. slub_min_order= [MM, SLUB] Determines the minimum page order for slabs.
Must be - lower than slub_max_order + lower than slub_max_order. For more information see Documentation/vm/slub.txt. slub_nomerge [MM, SLUB] - Disable merging of slabs of similar size. May be + Disable merging of slabs with similar size. May be necessary if there is some reason to distinguish - allocs to different slabs. + allocs to different slabs. Debug options disable + merging on their own. For more information see Documentation/vm/slub.txt. smart2= [HW] @@ -1778,9 +1753,6 @@ and is between 256 and 4096 characters. It is defined in the file snd-ymfpci= [HW,ALSA] - sonycd535= [HW,CD] - Format: <io>[,<irq>] - sonypi.*= [HW] Sony Programmable I/O Control Device driver See Documentation/sonypi.txt @@ -1852,6 +1824,7 @@ and is between 256 and 4096 characters. It is defined in the file Set number of hash buckets for TCP connection time Show timing data prefixed to each printk message line + [deprecated, see 'printk.time'] tipar.timeout= [HW,PPT] Set communications timeout in tenths of a second @@ -1909,11 +1882,14 @@ and is between 256 and 4096 characters. It is defined in the file usbhid.mousepoll= [USBHID] The interval which mice are to be polled at. - vdso= [IA-32,SH] + vdso= [IA-32,SH,x86-64] vdso=2: enable compat VDSO (default with COMPAT_VDSO) vdso=1: enable VDSO (default) vdso=0: disable VDSO mapping + vector= [IA-64,SMP] + vector=percpu: enable percpu vector domain + video= [FB] Frame buffer configuration See Documentation/fb/modedb.txt. diff --git a/Documentation/kprobes.txt b/Documentation/kprobes.txt index da5404ab7569..cb12ae175aa2 100644 --- a/Documentation/kprobes.txt +++ b/Documentation/kprobes.txt @@ -247,12 +247,6 @@ control to Kprobes.) If the probed function is declared asmlinkage, fastcall, or anything else that affects how args are passed, the handler's declaration must match. -NOTE: A macro JPROBE_ENTRY is provided to handle architecture-specific -aliasing of jp->entry. 
In the interest of portability, it is advised -to use: - - jp->entry = JPROBE_ENTRY(handler); - register_jprobe() returns 0 on success, or a negative errno otherwise. 4.3 register_kretprobe @@ -518,7 +512,7 @@ long jdo_fork(unsigned long clone_flags, unsigned long stack_start, } static struct jprobe my_jprobe = { - .entry = JPROBE_ENTRY(jdo_fork) + .entry = jdo_fork }; static int __init jprobe_init(void) diff --git a/Documentation/lguest/Makefile b/Documentation/lguest/Makefile new file mode 100644 index 000000000000..b9b9427376e9 --- /dev/null +++ b/Documentation/lguest/Makefile @@ -0,0 +1,27 @@ +# This creates the demonstration utility "lguest" which runs a Linux guest. + +# For those people that have a separate object dir, look there for .config +KBUILD_OUTPUT := ../.. +ifdef O + ifeq ("$(origin O)", "command line") + KBUILD_OUTPUT := $(O) + endif +endif +# We rely on CONFIG_PAGE_OFFSET to know where to put lguest binary. +include $(KBUILD_OUTPUT)/.config +LGUEST_GUEST_TOP := ($(CONFIG_PAGE_OFFSET) - 0x08000000) + +CFLAGS:=-Wall -Wmissing-declarations -Wmissing-prototypes -O3 \ + -static -DLGUEST_GUEST_TOP="$(LGUEST_GUEST_TOP)" -Wl,-T,lguest.lds +LDLIBS:=-lz + +all: lguest.lds lguest + +# The linker script on x86 is so complex the only way of creating one +# which will link our binary in the right place is to mangle the +# default one. +lguest.lds: + $(LD) --verbose | awk '/^==========/ { PRINT=1; next; } /SIZEOF_HEADERS/ { gsub(/0x[0-9A-F]*/, "$(LGUEST_GUEST_TOP)") } { if (PRINT) print $$0; }' > $@ + +clean: + rm -f lguest.lds lguest diff --git a/Documentation/lguest/lguest.c b/Documentation/lguest/lguest.c new file mode 100644 index 000000000000..1432b502a2d9 --- /dev/null +++ b/Documentation/lguest/lguest.c @@ -0,0 +1,1012 @@ +/* Simple program to layout "physical" memory for new lguest guest. + * Linked high to avoid likely physical memory. 
*/ +#define _LARGEFILE64_SOURCE +#define _GNU_SOURCE +#include <stdio.h> +#include <string.h> +#include <unistd.h> +#include <err.h> +#include <stdint.h> +#include <stdlib.h> +#include <elf.h> +#include <sys/mman.h> +#include <sys/types.h> +#include <sys/stat.h> +#include <sys/wait.h> +#include <fcntl.h> +#include <stdbool.h> +#include <errno.h> +#include <ctype.h> +#include <sys/socket.h> +#include <sys/ioctl.h> +#include <sys/time.h> +#include <time.h> +#include <netinet/in.h> +#include <net/if.h> +#include <linux/sockios.h> +#include <linux/if_tun.h> +#include <sys/uio.h> +#include <termios.h> +#include <getopt.h> +#include <zlib.h> +typedef unsigned long long u64; +typedef uint32_t u32; +typedef uint16_t u16; +typedef uint8_t u8; +#include "../../include/linux/lguest_launcher.h" +#include "../../include/asm-i386/e820.h" + +#define PAGE_PRESENT 0x7 /* Present, RW, Execute */ +#define NET_PEERNUM 1 +#define BRIDGE_PFX "bridge:" +#ifndef SIOCBRADDIF +#define SIOCBRADDIF 0x89a2 /* add interface to bridge */ +#endif + +static bool verbose; +#define verbose(args...) \ + do { if (verbose) printf(args); } while(0) +static int waker_fd; + +struct device_list +{ + fd_set infds; + int max_infd; + + struct device *dev; + struct device **lastdev; +}; + +struct device +{ + struct device *next; + struct lguest_device_desc *desc; + void *mem; + + /* Watch this fd if handle_input non-NULL. */ + int fd; + bool (*handle_input)(int fd, struct device *me); + + /* Watch DMA to this key if handle_input non-NULL. */ + unsigned long watch_key; + u32 (*handle_output)(int fd, const struct iovec *iov, + unsigned int num, struct device *me); + + /* Device-specific data. 
*/ + void *priv; +}; + +static int open_or_die(const char *name, int flags) +{ + int fd = open(name, flags); + if (fd < 0) + err(1, "Failed to open %s", name); + return fd; +} + +static void *map_zeroed_pages(unsigned long addr, unsigned int num) +{ + static int fd = -1; + + if (fd == -1) + fd = open_or_die("/dev/zero", O_RDONLY); + + if (mmap((void *)addr, getpagesize() * num, + PROT_READ|PROT_WRITE|PROT_EXEC, MAP_FIXED|MAP_PRIVATE, fd, 0) + != (void *)addr) + err(1, "Mmaping %u pages of /dev/zero @%p", num, (void *)addr); + return (void *)addr; +} + +/* Find magic string marking entry point, return entry point. */ +static unsigned long entry_point(void *start, void *end, + unsigned long page_offset) +{ + void *p; + + for (p = start; p < end; p++) + if (memcmp(p, "GenuineLguest", strlen("GenuineLguest")) == 0) + return (long)p + strlen("GenuineLguest") + page_offset; + + err(1, "Is this image a genuine lguest?"); +} + +/* Returns the entry point */ +static unsigned long map_elf(int elf_fd, const Elf32_Ehdr *ehdr, + unsigned long *page_offset) +{ + void *addr; + Elf32_Phdr phdr[ehdr->e_phnum]; + unsigned int i; + unsigned long start = -1UL, end = 0; + + /* Sanity checks. */ + if (ehdr->e_type != ET_EXEC + || ehdr->e_machine != EM_386 + || ehdr->e_phentsize != sizeof(Elf32_Phdr) + || ehdr->e_phnum < 1 || ehdr->e_phnum > 65536U/sizeof(Elf32_Phdr)) + errx(1, "Malformed elf header"); + + if (lseek(elf_fd, ehdr->e_phoff, SEEK_SET) < 0) + err(1, "Seeking to program headers"); + if (read(elf_fd, phdr, sizeof(phdr)) != sizeof(phdr)) + err(1, "Reading program headers"); + + *page_offset = 0; + /* We map the loadable segments at virtual addresses corresponding + * to their physical addresses (our virtual == guest physical). */ + for (i = 0; i < ehdr->e_phnum; i++) { + if (phdr[i].p_type != PT_LOAD) + continue; + + verbose("Section %i: size %i addr %p\n", + i, phdr[i].p_memsz, (void *)phdr[i].p_paddr); + + /* We expect linear address space. 
*/ + if (!*page_offset) + *page_offset = phdr[i].p_vaddr - phdr[i].p_paddr; + else if (*page_offset != phdr[i].p_vaddr - phdr[i].p_paddr) + errx(1, "Page offset of section %i different", i); + + if (phdr[i].p_paddr < start) + start = phdr[i].p_paddr; + if (phdr[i].p_paddr + phdr[i].p_filesz > end) + end = phdr[i].p_paddr + phdr[i].p_filesz; + + /* We map everything private, writable. */ + addr = mmap((void *)phdr[i].p_paddr, + phdr[i].p_filesz, + PROT_READ|PROT_WRITE|PROT_EXEC, + MAP_FIXED|MAP_PRIVATE, + elf_fd, phdr[i].p_offset); + if (addr != (void *)phdr[i].p_paddr) + err(1, "Mmaping vmlinux seg %i gave %p not %p", + i, addr, (void *)phdr[i].p_paddr); + } + + return entry_point((void *)start, (void *)end, *page_offset); +} + +/* This is amazingly reliable. */ +static unsigned long intuit_page_offset(unsigned char *img, unsigned long len) +{ + unsigned int i, possibilities[256] = { 0 }; + + for (i = 0; i + 4 < len; i++) { + /* mov 0xXXXXXXXX,%eax */ + if (img[i] == 0xA1 && ++possibilities[img[i+4]] > 3) + return (unsigned long)img[i+4] << 24; + } + errx(1, "could not determine page offset"); +} + +static unsigned long unpack_bzimage(int fd, unsigned long *page_offset) +{ + gzFile f; + int ret, len = 0; + void *img = (void *)0x100000; + + f = gzdopen(fd, "rb"); + while ((ret = gzread(f, img + len, 65536)) > 0) + len += ret; + if (ret < 0) + err(1, "reading image from bzImage"); + + verbose("Unpacked size %i addr %p\n", len, img); + *page_offset = intuit_page_offset(img, len); + + return entry_point(img, img + len, *page_offset); +} + +static unsigned long load_bzimage(int fd, unsigned long *page_offset) +{ + unsigned char c; + int state = 0; + + /* Ugly brute force search for gzip header. */ + while (read(fd, &c, 1) == 1) { + switch (state) { + case 0: + if (c == 0x1F) + state++; + break; + case 1: + if (c == 0x8B) + state++; + else + state = 0; + break; + case 2 ... 
8: + state++; + break; + case 9: + lseek(fd, -10, SEEK_CUR); + if (c != 0x03) /* Compressed under UNIX. */ + state = -1; + else + return unpack_bzimage(fd, page_offset); + } + } + errx(1, "Could not find kernel in bzImage"); +} + +static unsigned long load_kernel(int fd, unsigned long *page_offset) +{ + Elf32_Ehdr hdr; + + if (read(fd, &hdr, sizeof(hdr)) != sizeof(hdr)) + err(1, "Reading kernel"); + + if (memcmp(hdr.e_ident, ELFMAG, SELFMAG) == 0) + return map_elf(fd, &hdr, page_offset); + + return load_bzimage(fd, page_offset); +} + +static inline unsigned long page_align(unsigned long addr) +{ + return ((addr + getpagesize()-1) & ~(getpagesize()-1)); +} + +/* initrd gets loaded at top of memory: return length. */ +static unsigned long load_initrd(const char *name, unsigned long mem) +{ + int ifd; + struct stat st; + unsigned long len; + void *iaddr; + + ifd = open_or_die(name, O_RDONLY); + if (fstat(ifd, &st) < 0) + err(1, "fstat() on initrd '%s'", name); + + len = page_align(st.st_size); + iaddr = mmap((void *)mem - len, st.st_size, + PROT_READ|PROT_EXEC|PROT_WRITE, + MAP_FIXED|MAP_PRIVATE, ifd, 0); + if (iaddr != (void *)mem - len) + err(1, "Mmaping initrd '%s' returned %p not %p", + name, iaddr, (void *)mem - len); + close(ifd); + verbose("mapped initrd %s size=%lu @ %p\n", name, st.st_size, iaddr); + return len; +} + +static unsigned long setup_pagetables(unsigned long mem, + unsigned long initrd_size, + unsigned long page_offset) +{ + u32 *pgdir, *linear; + unsigned int mapped_pages, i, linear_pages; + unsigned int ptes_per_page = getpagesize()/sizeof(u32); + + /* If we can map all of memory above page_offset, we do so. */ + if (mem <= -page_offset) + mapped_pages = mem/getpagesize(); + else + mapped_pages = -page_offset/getpagesize(); + + /* Each linear PTE page can map ptes_per_page pages. 
*/ + linear_pages = (mapped_pages + ptes_per_page-1)/ptes_per_page; + + /* We lay out top-level then linear mapping immediately below initrd */ + pgdir = (void *)mem - initrd_size - getpagesize(); + linear = (void *)pgdir - linear_pages*getpagesize(); + + for (i = 0; i < mapped_pages; i++) + linear[i] = ((i * getpagesize()) | PAGE_PRESENT); + + /* Now set up pgd so that this memory is at page_offset */ + for (i = 0; i < mapped_pages; i += ptes_per_page) { + pgdir[(i + page_offset/getpagesize())/ptes_per_page] + = (((u32)linear + i*sizeof(u32)) | PAGE_PRESENT); + } + + verbose("Linear mapping of %u pages in %u pte pages at %p\n", + mapped_pages, linear_pages, linear); + + return (unsigned long)pgdir; +} + +static void concat(char *dst, char *args[]) +{ + unsigned int i, len = 0; + + for (i = 0; args[i]; i++) { + strcpy(dst+len, args[i]); + strcat(dst+len, " "); + len += strlen(args[i]) + 1; + } + /* In case it's empty. */ + dst[len] = '\0'; +} + +static int tell_kernel(u32 pgdir, u32 start, u32 page_offset) +{ + u32 args[] = { LHREQ_INITIALIZE, + LGUEST_GUEST_TOP/getpagesize(), /* Just below us */ + pgdir, start, page_offset }; + int fd; + + fd = open_or_die("/dev/lguest", O_RDWR); + if (write(fd, args, sizeof(args)) < 0) + err(1, "Writing to /dev/lguest"); + return fd; +} + +static void set_fd(int fd, struct device_list *devices) +{ + FD_SET(fd, &devices->infds); + if (fd > devices->max_infd) + devices->max_infd = fd; +} + +/* When input arrives, we tell the kernel to kick lguest out with -EAGAIN. 
*/ +static void wake_parent(int pipefd, int lguest_fd, struct device_list *devices) +{ + set_fd(pipefd, devices); + + for (;;) { + fd_set rfds = devices->infds; + u32 args[] = { LHREQ_BREAK, 1 }; + + select(devices->max_infd+1, &rfds, NULL, NULL, NULL); + if (FD_ISSET(pipefd, &rfds)) { + int ignorefd; + if (read(pipefd, &ignorefd, sizeof(ignorefd)) == 0) + exit(0); + FD_CLR(ignorefd, &devices->infds); + } else + write(lguest_fd, args, sizeof(args)); + } +} + +static int setup_waker(int lguest_fd, struct device_list *device_list) +{ + int pipefd[2], child; + + pipe(pipefd); + child = fork(); + if (child == -1) + err(1, "forking"); + + if (child == 0) { + close(pipefd[1]); + wake_parent(pipefd[0], lguest_fd, device_list); + } + close(pipefd[0]); + + return pipefd[1]; +} + +static void *_check_pointer(unsigned long addr, unsigned int size, + unsigned int line) +{ + if (addr >= LGUEST_GUEST_TOP || addr + size >= LGUEST_GUEST_TOP) + errx(1, "%s:%i: Invalid address %li", __FILE__, line, addr); + return (void *)addr; +} +#define check_pointer(addr,size) _check_pointer(addr, size, __LINE__) + +/* Returns pointer to dma->used_len */ +static u32 *dma2iov(unsigned long dma, struct iovec iov[], unsigned *num) +{ + unsigned int i; + struct lguest_dma *udma; + + udma = check_pointer(dma, sizeof(*udma)); + for (i = 0; i < LGUEST_MAX_DMA_SECTIONS; i++) { + if (!udma->len[i]) + break; + + iov[i].iov_base = check_pointer(udma->addr[i], udma->len[i]); + iov[i].iov_len = udma->len[i]; + } + *num = i; + return &udma->used_len; +} + +static u32 *get_dma_buffer(int fd, void *key, + struct iovec iov[], unsigned int *num, u32 *irq) +{ + u32 buf[] = { LHREQ_GETDMA, (u32)key }; + unsigned long udma; + u32 *res; + + udma = write(fd, buf, sizeof(buf)); + if (udma == (unsigned long)-1) + return NULL; + + /* Kernel stashes irq in ->used_len. 
*/ + res = dma2iov(udma, iov, num); + *irq = *res; + return res; +} + +static void trigger_irq(int fd, u32 irq) +{ + u32 buf[] = { LHREQ_IRQ, irq }; + if (write(fd, buf, sizeof(buf)) != 0) + err(1, "Triggering irq %i", irq); +} + +static void discard_iovec(struct iovec *iov, unsigned int *num) +{ + static char discard_buf[1024]; + *num = 1; + iov->iov_base = discard_buf; + iov->iov_len = sizeof(discard_buf); +} + +static struct termios orig_term; +static void restore_term(void) +{ + tcsetattr(STDIN_FILENO, TCSANOW, &orig_term); +} + +struct console_abort +{ + int count; + struct timeval start; +}; + +/* We DMA input to buffer bound at start of console page. */ +static bool handle_console_input(int fd, struct device *dev) +{ + u32 irq = 0, *lenp; + int len; + unsigned int num; + struct iovec iov[LGUEST_MAX_DMA_SECTIONS]; + struct console_abort *abort = dev->priv; + + lenp = get_dma_buffer(fd, dev->mem, iov, &num, &irq); + if (!lenp) { + warn("console: no dma buffer!"); + discard_iovec(iov, &num); + } + + len = readv(dev->fd, iov, num); + if (len <= 0) { + warnx("Failed to get console input, ignoring console."); + len = 0; + } + + if (lenp) { + *lenp = len; + trigger_irq(fd, irq); + } + + /* Three ^C within one second? Exit. 
*/ + if (len == 1 && ((char *)iov[0].iov_base)[0] == 3) { + if (!abort->count++) + gettimeofday(&abort->start, NULL); + else if (abort->count == 3) { + struct timeval now; + gettimeofday(&now, NULL); + if (now.tv_sec <= abort->start.tv_sec+1) { + /* Make sure waker is not blocked in BREAK */ + u32 args[] = { LHREQ_BREAK, 0 }; + close(waker_fd); + write(fd, args, sizeof(args)); + exit(2); + } + abort->count = 0; + } + } else + abort->count = 0; + + if (!len) { + restore_term(); + return false; + } + return true; +} + +static u32 handle_console_output(int fd, const struct iovec *iov, + unsigned num, struct device*dev) +{ + return writev(STDOUT_FILENO, iov, num); +} + +static u32 handle_tun_output(int fd, const struct iovec *iov, + unsigned num, struct device *dev) +{ + /* Now we've seen output, we should warn if we can't get buffers. */ + *(bool *)dev->priv = true; + return writev(dev->fd, iov, num); +} + +static unsigned long peer_offset(unsigned int peernum) +{ + return 4 * peernum; +} + +static bool handle_tun_input(int fd, struct device *dev) +{ + u32 irq = 0, *lenp; + int len; + unsigned num; + struct iovec iov[LGUEST_MAX_DMA_SECTIONS]; + + lenp = get_dma_buffer(fd, dev->mem+peer_offset(NET_PEERNUM), iov, &num, + &irq); + if (!lenp) { + if (*(bool *)dev->priv) + warn("network: no dma buffer!"); + discard_iovec(iov, &num); + } + + len = readv(dev->fd, iov, num); + if (len <= 0) + err(1, "reading network"); + if (lenp) { + *lenp = len; + trigger_irq(fd, irq); + } + verbose("tun input packet len %i [%02x %02x] (%s)\n", len, + ((u8 *)iov[0].iov_base)[0], ((u8 *)iov[0].iov_base)[1], + lenp ? 
"sent" : "discarded"); + return true; +} + +static u32 handle_block_output(int fd, const struct iovec *iov, + unsigned num, struct device *dev) +{ + struct lguest_block_page *p = dev->mem; + u32 irq, *lenp; + unsigned int len, reply_num; + struct iovec reply[LGUEST_MAX_DMA_SECTIONS]; + off64_t device_len, off = (off64_t)p->sector * 512; + + device_len = *(off64_t *)dev->priv; + + if (off >= device_len) + err(1, "Bad offset %llu vs %llu", off, device_len); + if (lseek64(dev->fd, off, SEEK_SET) != off) + err(1, "Bad seek to sector %i", p->sector); + + verbose("Block: %s at offset %llu\n", p->type ? "WRITE" : "READ", off); + + lenp = get_dma_buffer(fd, dev->mem, reply, &reply_num, &irq); + if (!lenp) + err(1, "Block request didn't give us a dma buffer"); + + if (p->type) { + len = writev(dev->fd, iov, num); + if (off + len > device_len) { + ftruncate(dev->fd, device_len); + errx(1, "Write past end %llu+%u", off, len); + } + *lenp = 0; + } else { + len = readv(dev->fd, reply, reply_num); + *lenp = len; + } + + p->result = 1 + (p->bytes != len); + trigger_irq(fd, irq); + return 0; +} + +static void handle_output(int fd, unsigned long dma, unsigned long key, + struct device_list *devices) +{ + struct device *i; + u32 *lenp; + struct iovec iov[LGUEST_MAX_DMA_SECTIONS]; + unsigned num = 0; + + lenp = dma2iov(dma, iov, &num); + for (i = devices->dev; i; i = i->next) { + if (i->handle_output && key == i->watch_key) { + *lenp = i->handle_output(fd, iov, num, i); + return; + } + } + warnx("Pending dma %p, key %p", (void *)dma, (void *)key); +} + +static void handle_input(int fd, struct device_list *devices) +{ + struct timeval poll = { .tv_sec = 0, .tv_usec = 0 }; + + for (;;) { + struct device *i; + fd_set fds = devices->infds; + + if (select(devices->max_infd+1, &fds, NULL, NULL, &poll) == 0) + break; + + for (i = devices->dev; i; i = i->next) { + if (i->handle_input && FD_ISSET(i->fd, &fds)) { + if (!i->handle_input(fd, i)) { + FD_CLR(i->fd, &devices->infds); + /* Tell 
waker to ignore it too... */ + write(waker_fd, &i->fd, sizeof(i->fd)); + } + } + } + } +} + +static struct lguest_device_desc *new_dev_desc(u16 type, u16 features, + u16 num_pages) +{ + static unsigned long top = LGUEST_GUEST_TOP; + struct lguest_device_desc *desc; + + desc = malloc(sizeof(*desc)); + desc->type = type; + desc->num_pages = num_pages; + desc->features = features; + desc->status = 0; + if (num_pages) { + top -= num_pages*getpagesize(); + map_zeroed_pages(top, num_pages); + desc->pfn = top / getpagesize(); + } else + desc->pfn = 0; + return desc; +} + +static struct device *new_device(struct device_list *devices, + u16 type, u16 num_pages, u16 features, + int fd, + bool (*handle_input)(int, struct device *), + unsigned long watch_off, + u32 (*handle_output)(int, + const struct iovec *, + unsigned, + struct device *)) +{ + struct device *dev = malloc(sizeof(*dev)); + + /* Append to device list. */ + *devices->lastdev = dev; + dev->next = NULL; + devices->lastdev = &dev->next; + + dev->fd = fd; + if (handle_input) + set_fd(dev->fd, devices); + dev->desc = new_dev_desc(type, features, num_pages); + dev->mem = (void *)(dev->desc->pfn * getpagesize()); + dev->handle_input = handle_input; + dev->watch_key = (unsigned long)dev->mem + watch_off; + dev->handle_output = handle_output; + return dev; +} + +static void setup_console(struct device_list *devices) +{ + struct device *dev; + + if (tcgetattr(STDIN_FILENO, &orig_term) == 0) { + struct termios term = orig_term; + term.c_lflag &= ~(ISIG|ICANON|ECHO); + tcsetattr(STDIN_FILENO, TCSANOW, &term); + atexit(restore_term); + } + + /* We don't currently require a page for the console. 
*/ + dev = new_device(devices, LGUEST_DEVICE_T_CONSOLE, 0, 0, + STDIN_FILENO, handle_console_input, + LGUEST_CONSOLE_DMA_KEY, handle_console_output); + dev->priv = malloc(sizeof(struct console_abort)); + ((struct console_abort *)dev->priv)->count = 0; + verbose("device %p: console\n", + (void *)(dev->desc->pfn * getpagesize())); +} + +static void setup_block_file(const char *filename, struct device_list *devices) +{ + int fd; + struct device *dev; + off64_t *device_len; + struct lguest_block_page *p; + + fd = open_or_die(filename, O_RDWR|O_LARGEFILE|O_DIRECT); + dev = new_device(devices, LGUEST_DEVICE_T_BLOCK, 1, + LGUEST_DEVICE_F_RANDOMNESS, + fd, NULL, 0, handle_block_output); + device_len = dev->priv = malloc(sizeof(*device_len)); + *device_len = lseek64(fd, 0, SEEK_END); + p = dev->mem; + + p->num_sectors = *device_len/512; + verbose("device %p: block %i sectors\n", + (void *)(dev->desc->pfn * getpagesize()), p->num_sectors); +} + +/* We use fcntl locks to reserve network slots (autocleanup!)
*/ +static unsigned int find_slot(int netfd, const char *filename) +{ + struct flock fl; + + fl.l_type = F_WRLCK; + fl.l_whence = SEEK_SET; + fl.l_len = 1; + for (fl.l_start = 0; + fl.l_start < getpagesize()/sizeof(struct lguest_net); + fl.l_start++) { + if (fcntl(netfd, F_SETLK, &fl) == 0) + return fl.l_start; + } + errx(1, "No free slots in network file %s", filename); +} + +static void setup_net_file(const char *filename, + struct device_list *devices) +{ + int netfd; + struct device *dev; + + netfd = open(filename, O_RDWR, 0); + if (netfd < 0) { + if (errno == ENOENT) { + netfd = open(filename, O_RDWR|O_CREAT, 0600); + if (netfd >= 0) { + char page[getpagesize()]; + memset(page, 0, sizeof(page)); + write(netfd, page, sizeof(page)); + } + } + if (netfd < 0) + err(1, "cannot open net file '%s'", filename); + } + + dev = new_device(devices, LGUEST_DEVICE_T_NET, 1, + find_slot(netfd, filename)|LGUEST_NET_F_NOCSUM, + -1, NULL, 0, NULL); + + /* We overwrite the /dev/zero mapping with the actual file. 
*/ + if (mmap(dev->mem, getpagesize(), PROT_READ|PROT_WRITE, + MAP_FIXED|MAP_SHARED, netfd, 0) != dev->mem) + err(1, "could not mmap '%s'", filename); + verbose("device %p: shared net %s, peer %i\n", + (void *)(dev->desc->pfn * getpagesize()), filename, + dev->desc->features & ~LGUEST_NET_F_NOCSUM); +} + +static u32 str2ip(const char *ipaddr) +{ + unsigned int byte[4]; + + sscanf(ipaddr, "%u.%u.%u.%u", &byte[0], &byte[1], &byte[2], &byte[3]); + return (byte[0] << 24) | (byte[1] << 16) | (byte[2] << 8) | byte[3]; +} + +/* adapted from libbridge */ +static void add_to_bridge(int fd, const char *if_name, const char *br_name) +{ + int ifidx; + struct ifreq ifr; + + if (!*br_name) + errx(1, "must specify bridge name"); + + ifidx = if_nametoindex(if_name); + if (!ifidx) + errx(1, "interface %s does not exist!", if_name); + + strncpy(ifr.ifr_name, br_name, IFNAMSIZ); + ifr.ifr_ifindex = ifidx; + if (ioctl(fd, SIOCBRADDIF, &ifr) < 0) + err(1, "can't add %s to bridge %s", if_name, br_name); +} + +static void configure_device(int fd, const char *devname, u32 ipaddr, + unsigned char hwaddr[6]) +{ + struct ifreq ifr; + struct sockaddr_in *sin = (struct sockaddr_in *)&ifr.ifr_addr; + + memset(&ifr, 0, sizeof(ifr)); + strcpy(ifr.ifr_name, devname); + sin->sin_family = AF_INET; + sin->sin_addr.s_addr = htonl(ipaddr); + if (ioctl(fd, SIOCSIFADDR, &ifr) != 0) + err(1, "Setting %s interface address", devname); + ifr.ifr_flags = IFF_UP; + if (ioctl(fd, SIOCSIFFLAGS, &ifr) != 0) + err(1, "Bringing interface %s up", devname); + + if (ioctl(fd, SIOCGIFHWADDR, &ifr) != 0) + err(1, "getting hw address for %s", devname); + + memcpy(hwaddr, ifr.ifr_hwaddr.sa_data, 6); +} + +static void setup_tun_net(const char *arg, struct device_list *devices) +{ + struct device *dev; + struct ifreq ifr; + int netfd, ipfd; + u32 ip; + const char *br_name = NULL; + + netfd = open_or_die("/dev/net/tun", O_RDWR); + memset(&ifr, 0, sizeof(ifr)); + ifr.ifr_flags = IFF_TAP | IFF_NO_PI; + strcpy(ifr.ifr_name, 
"tap%d"); + if (ioctl(netfd, TUNSETIFF, &ifr) != 0) + err(1, "configuring /dev/net/tun"); + ioctl(netfd, TUNSETNOCSUM, 1); + + /* You will be peer 1: we should create enough jitter to randomize */ + dev = new_device(devices, LGUEST_DEVICE_T_NET, 1, + NET_PEERNUM|LGUEST_DEVICE_F_RANDOMNESS, netfd, + handle_tun_input, peer_offset(0), handle_tun_output); + dev->priv = malloc(sizeof(bool)); + *(bool *)dev->priv = false; + + ipfd = socket(PF_INET, SOCK_DGRAM, IPPROTO_IP); + if (ipfd < 0) + err(1, "opening IP socket"); + + if (!strncmp(BRIDGE_PFX, arg, strlen(BRIDGE_PFX))) { + ip = INADDR_ANY; + br_name = arg + strlen(BRIDGE_PFX); + add_to_bridge(ipfd, ifr.ifr_name, br_name); + } else + ip = str2ip(arg); + + /* We are peer 0, ie. first slot. */ + configure_device(ipfd, ifr.ifr_name, ip, dev->mem); + + /* Set "promisc" bit: we want every single packet. */ + *((u8 *)dev->mem) |= 0x1; + + close(ipfd); + + verbose("device %p: tun net %u.%u.%u.%u\n", + (void *)(dev->desc->pfn * getpagesize()), + (u8)(ip>>24), (u8)(ip>>16), (u8)(ip>>8), (u8)ip); + if (br_name) + verbose("attached to bridge: %s\n", br_name); +} + +/* Now we know how much memory we have, we copy in device descriptors */ +static void map_device_descriptors(struct device_list *devs, unsigned long mem) +{ + struct device *i; + unsigned int num; + struct lguest_device_desc *descs; + + /* Device descriptor array sits just above top of normal memory */ + descs = map_zeroed_pages(mem, 1); + + for (i = devs->dev, num = 0; i; i = i->next, num++) { + if (num == LGUEST_MAX_DEVICES) + errx(1, "too many devices"); + verbose("Device %i: %s\n", num, + i->desc->type == LGUEST_DEVICE_T_NET ? "net" + : i->desc->type == LGUEST_DEVICE_T_CONSOLE ? "console" + : i->desc->type == LGUEST_DEVICE_T_BLOCK ? 
"block" + : "unknown"); + descs[num] = *i->desc; + free(i->desc); + i->desc = &descs[num]; + } +} + +static void __attribute__((noreturn)) +run_guest(int lguest_fd, struct device_list *device_list) +{ + for (;;) { + u32 args[] = { LHREQ_BREAK, 0 }; + unsigned long arr[2]; + int readval; + + /* We read from the /dev/lguest device to run the Guest. */ + readval = read(lguest_fd, arr, sizeof(arr)); + + if (readval == sizeof(arr)) { + handle_output(lguest_fd, arr[0], arr[1], device_list); + continue; + } else if (errno == ENOENT) { + char reason[1024] = { 0 }; + read(lguest_fd, reason, sizeof(reason)-1); + errx(1, "%s", reason); + } else if (errno != EAGAIN) + err(1, "Running guest failed"); + handle_input(lguest_fd, device_list); + if (write(lguest_fd, args, sizeof(args)) < 0) + err(1, "Resetting break"); + } +} + +static struct option opts[] = { + { "verbose", 0, NULL, 'v' }, + { "sharenet", 1, NULL, 's' }, + { "tunnet", 1, NULL, 't' }, + { "block", 1, NULL, 'b' }, + { "initrd", 1, NULL, 'i' }, + { NULL }, +}; +static void usage(void) +{ + errx(1, "Usage: lguest [--verbose] " + "[--sharenet=<filename>|--tunnet=(<ipaddr>|bridge:<bridgename>)\n" + "|--block=<filename>|--initrd=<filename>]...\n" + "<mem-in-mb> vmlinux [args...]"); +} + +int main(int argc, char *argv[]) +{ + unsigned long mem, pgdir, start, page_offset, initrd_size = 0; + int c, lguest_fd; + struct device_list device_list; + void *boot = (void *)0; + const char *initrd_name = NULL; + + device_list.max_infd = -1; + device_list.dev = NULL; + device_list.lastdev = &device_list.dev; + FD_ZERO(&device_list.infds); + + while ((c = getopt_long(argc, argv, "v", opts, NULL)) != EOF) { + switch (c) { + case 'v': + verbose = true; + break; + case 's': + setup_net_file(optarg, &device_list); + break; + case 't': + setup_tun_net(optarg, &device_list); + break; + case 'b': + setup_block_file(optarg, &device_list); + break; + case 'i': + initrd_name = optarg; + break; + default: + warnx("Unknown argument %s", 
argv[optind]); + usage(); + } + } + if (optind + 2 > argc) + usage(); + + /* We need a console device */ + setup_console(&device_list); + + /* First we map /dev/zero over all of guest-physical memory. */ + mem = atoi(argv[optind]) * 1024 * 1024; + map_zeroed_pages(0, mem / getpagesize()); + + /* Now we load the kernel */ + start = load_kernel(open_or_die(argv[optind+1], O_RDONLY), + &page_offset); + + /* Write the device descriptors into memory. */ + map_device_descriptors(&device_list, mem); + + /* Map the initrd image if requested */ + if (initrd_name) { + initrd_size = load_initrd(initrd_name, mem); + *(unsigned long *)(boot+0x218) = mem - initrd_size; + *(unsigned long *)(boot+0x21c) = initrd_size; + *(unsigned char *)(boot+0x210) = 0xFF; + } + + /* Set up the initial linear pagetables. */ + pgdir = setup_pagetables(mem, initrd_size, page_offset); + + /* E820 memory map: ours is a simple, single region. */ + *(char*)(boot+E820NR) = 1; + *((struct e820entry *)(boot+E820MAP)) + = ((struct e820entry) { 0, mem, E820_RAM }); + /* Command line pointer and command line (at 4096) */ + *(void **)(boot + 0x228) = boot + 4096; + concat(boot + 4096, argv+optind+2); + /* Paravirt type: 1 == lguest */ + *(int *)(boot + 0x23c) = 1; + + lguest_fd = tell_kernel(pgdir, start, page_offset); + waker_fd = setup_waker(lguest_fd, &device_list); + + run_guest(lguest_fd, &device_list); +} diff --git a/Documentation/lguest/lguest.txt b/Documentation/lguest/lguest.txt new file mode 100644 index 000000000000..821617bd6c04 --- /dev/null +++ b/Documentation/lguest/lguest.txt @@ -0,0 +1,129 @@ +Rusty's Remarkably Unreliable Guide to Lguest + - or, A Young Coder's Illustrated Hypervisor +http://lguest.ozlabs.org + +Lguest is designed to be a minimal hypervisor for the Linux kernel, for +Linux developers and users to experiment with virtualization with the +minimum of complexity.
Nonetheless, it should have sufficient +features to make it useful for specific tasks, and, of course, you are +encouraged to fork and enhance it. + +Features: + +- Kernel module which runs in a normal kernel. +- Simple I/O model for communication. +- Simple program to create new guests. +- Logo contains cute puppies: http://lguest.ozlabs.org + +Developer features: + +- Fun to hack on. +- No ABI: being tied to a specific kernel anyway, you can change anything. +- Many opportunities for improvement or feature implementation. + +Running Lguest: + +- Lguest runs the same kernel as guest and host. You can configure + them differently, but usually it's easiest not to. + + You will need to configure your kernel with the following options: + + CONFIG_HIGHMEM64G=n ("High Memory Support" "64GB")[1] + CONFIG_TUN=y/m ("Universal TUN/TAP device driver support") + CONFIG_EXPERIMENTAL=y ("Prompt for development and/or incomplete code/drivers") + CONFIG_PARAVIRT=y ("Paravirtualization support (EXPERIMENTAL)") + CONFIG_LGUEST=y/m ("Linux hypervisor example code") + + and I recommend: + CONFIG_HZ=100 ("Timer frequency")[2] + +- A tool called "lguest" is available in this directory: type "make" + to build it. If you didn't build your kernel in-tree, use "make + O=<builddir>". + +- Create or find a root disk image. There are several useful ones + around, such as the xm-test tiny root image at + http://xm-test.xensource.com/ramdisks/initrd-1.1-i386.img + + For more serious work, I usually use a distribution ISO image and + install it under qemu, then make multiple copies: + + dd if=/dev/zero of=rootfile bs=1M count=2048 + qemu -cdrom image.iso -hda rootfile -net user -net nic -boot d + +- "modprobe lg" if you built it as a module. + +- Run an lguest as root: + + Documentation/lguest/lguest 64m vmlinux --tunnet=192.168.19.1 --block=rootfile root=/dev/lgba + + Explanation: + 64m: the amount of memory to use. + + vmlinux: the kernel image found in the top of your build directory. 
You + can also use a standard bzImage. + + --tunnet=192.168.19.1: configures a "tap" device for networking with this + IP address. + + --block=rootfile: a file or block device which becomes /dev/lgba + inside the guest. + + root=/dev/lgba: this (and anything else on the command line) are + kernel boot parameters. + +- Configuring networking. I usually have the host masquerade, using + "iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE" and "echo 1 > + /proc/sys/net/ipv4/ip_forward". In this example, I would configure + eth0 inside the guest at 192.168.19.2. + + Another method is to bridge the tap device to an external interface + using --tunnet=bridge:<bridgename>, and perhaps run dhcp on the guest + to obtain an IP address. The bridge needs to be configured first: + this option simply adds the tap interface to it. + + A simple example on my system: + + ifconfig eth0 0.0.0.0 + brctl addbr lg0 + ifconfig lg0 up + brctl addif lg0 eth0 + dhclient lg0 + + Then use --tunnet=bridge:lg0 when launching the guest. + + See http://linux-net.osdl.org/index.php/Bridge for general information + on how to get bridging working. + +- You can also create an inter-guest network using + "--sharenet=<filename>": any two guests using the same file are on + the same network. This file is created if it does not exist. + +Lguest I/O model: + +Lguest uses a simplified DMA model plus shared memory for I/O. Guests +can communicate with each other if they share underlying memory +(usually by the lguest program mmaping the same file), but they can +use any non-shared memory to communicate with the lguest process. + +Guests can register DMA buffers at any key (must be a valid physical +address) using the LHCALL_BIND_DMA(key, dmabufs, num<<8|irq) +hypercall. "dmabufs" is the physical address of an array of "num" +"struct lguest_dma": each contains a used_len, and an array of +physical addresses and lengths. 
When a transfer occurs, the +"used_len" field of one of the buffers which has used_len 0 will be +set to the length transferred and the irq will fire. + +Using an irq value of 0 unbinds the dma buffers. + +To send DMA, the LHCALL_SEND_DMA(key, dma_physaddr) hypercall is used, +and the number of bytes used is written to the used_len field. This can be 0 if +no one else has bound a DMA buffer to that key, or on some other error. +DMA buffers bound by the same guest are ignored. + +Cheers! +Rusty Russell rusty@rustcorp.com.au. + +[1] These are in various places on the TODO list, waiting for you to + get annoyed enough at the limitation to fix it. +[2] Lguest is not yet tickless when idle. See [1]. diff --git a/Documentation/m68k/kernel-options.txt b/Documentation/m68k/kernel-options.txt index 1c41db21d3c1..59108cebe163 100644 --- a/Documentation/m68k/kernel-options.txt +++ b/Documentation/m68k/kernel-options.txt @@ -82,13 +82,6 @@ Valid names are: /dev/fd : -> 0x0200 (floppy disk) /dev/xda: -> 0x0c00 (first XT disk, unused in Linux/m68k) /dev/xdb: -> 0x0c40 (second XT disk, unused in Linux/m68k) - /dev/ada: -> 0x1c00 (first ACSI device) - /dev/adb: -> 0x1c10 (second ACSI device) - /dev/adc: -> 0x1c20 (third ACSI device) - /dev/add: -> 0x1c30 (forth ACSI device) - -The last four names are available only if the kernel has been compiled -with Atari and ACSI support. The name must be followed by a decimal number, that stands for the partition number. Internally, the value of the number is just diff --git a/Documentation/networking/00-INDEX b/Documentation/networking/00-INDEX index e06b6e3c1db5..d63f480afb74 100644 --- a/Documentation/networking/00-INDEX +++ b/Documentation/networking/00-INDEX @@ -32,6 +32,8 @@ cops.txt - info on the COPS LocalTalk Linux driver cs89x0.txt - the Crystal LAN (CS8900/20-based) Ethernet ISA adapter driver +cxacru.txt + - Conexant AccessRunner USB ADSL Modem de4x5.txt - the Digital EtherWORKS DE4?? and DE5??
PCI Ethernet driver decnet.txt @@ -94,9 +96,6 @@ routing.txt - the new routing mechanism shaper.txt - info on the module that can shape/limit transmitted traffic. -sk98lin.txt - - Marvell Yukon Chipset / SysKonnect SK-98xx compliant Gigabit - Ethernet Adapter family driver info skfp.txt - SysKonnect FDDI (SK-5xxx, Compaq Netelligent) driver info. smc9.txt diff --git a/Documentation/networking/cxacru.txt b/Documentation/networking/cxacru.txt new file mode 100644 index 000000000000..b074681a963e --- /dev/null +++ b/Documentation/networking/cxacru.txt @@ -0,0 +1,84 @@ +Firmware is required for this device: http://accessrunner.sourceforge.net/ + +While it is capable of managing/maintaining the ADSL connection without the +module loaded, the device will sometimes stop responding after unloading the +driver and it is necessary to unplug/remove power to the device to fix this. + +Detected devices will appear as ATM devices named "cxacru". In /sys/class/atm/ +these are directories named cxacruN where N is the device number. A symlink +named device points to the USB interface device's directory which contains +several sysfs attribute files for retrieving device statistics: + +* adsl_controller_version + +* adsl_headend +* adsl_headend_environment + Information about the remote headend. + +* downstream_attenuation (dB) +* downstream_bits_per_frame +* downstream_rate (kbps) +* downstream_snr_margin (dB) + Downstream stats. + +* upstream_attenuation (dB) +* upstream_bits_per_frame +* upstream_rate (kbps) +* upstream_snr_margin (dB) +* transmitter_power (dBm/Hz) + Upstream stats. + +* downstream_crc_errors +* downstream_fec_errors +* downstream_hec_errors +* upstream_crc_errors +* upstream_fec_errors +* upstream_hec_errors + Error counts. + +* line_startable + Indicates that ADSL support on the device + is/can be enabled, see adsl_start. 
+ +* line_status + "initialising" + "down" + "attempting to activate" + "training" + "channel analysis" + "exchange" + "waiting" + "up" + + Changes between "down" and "attempting to activate" + if there is no signal. + +* link_status + "not connected" + "connected" + "lost" + +* mac_address + +* modulation + "ANSI T1.413" + "ITU-T G.992.1 (G.DMT)" + "ITU-T G.992.2 (G.LITE)" + +* startup_attempts + Count of total attempts to initialise ADSL. + +To enable/disable ADSL, the following can be written to the adsl_state file: + "start" + "stop" + "restart" (stops, waits 1.5s, then starts) + "poll" (used to resume status polling if it was disabled due to failure) + +Changes in adsl/line state are reported via kernel log messages: + [4942145.150704] ATM dev 0: ADSL state: running + [4942243.663766] ATM dev 0: ADSL line: down + [4942249.665075] ATM dev 0: ADSL line: attempting to activate + [4942253.654954] ATM dev 0: ADSL line: training + [4942255.666387] ATM dev 0: ADSL line: channel analysis + [4942259.656262] ATM dev 0: ADSL line: exchange + [2635357.696901] ATM dev 0: ADSL line: up (8128 kb/s down | 832 kb/s up) diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt index af6a63ab9026..32c2e9da5f3a 100644 --- a/Documentation/networking/ip-sysctl.txt +++ b/Documentation/networking/ip-sysctl.txt @@ -433,6 +433,12 @@ tcp_workaround_signed_windows - BOOLEAN not receive a window scaling option from them. Default: 0 +tcp_dma_copybreak - INTEGER + Lower limit, in bytes, of the size of socket reads that will be + offloaded to a DMA copy engine, if one is present in the system + and CONFIG_NET_DMA is enabled. + Default: 4096 + CIPSOv4 Variables: cipso_cache_enable - BOOLEAN @@ -874,8 +880,7 @@ accept_redirects - BOOLEAN accept_source_route - INTEGER Accept source routing (routing extension header). - > 0: Accept routing header. - = 0: Accept only routing header type 2. + >= 0: Accept only routing header type 2.
< 0: Do not accept routing header. Default: 0 diff --git a/Documentation/networking/l2tp.txt b/Documentation/networking/l2tp.txt new file mode 100644 index 000000000000..2451f551c505 --- /dev/null +++ b/Documentation/networking/l2tp.txt @@ -0,0 +1,169 @@ +This brief document describes how to use the kernel's PPPoL2TP driver +to provide L2TP functionality. L2TP is a protocol that tunnels one or +more PPP sessions over a UDP tunnel. It is commonly used for VPNs +(L2TP/IPSec) and by ISPs to tunnel subscriber PPP sessions over an IP +network infrastructure. + +Design +====== + +The PPPoL2TP driver, drivers/net/pppol2tp.c, provides a mechanism by +which PPP frames carried through an L2TP session are passed through +the kernel's PPP subsystem. The standard PPP daemon, pppd, handles all +PPP interaction with the peer. PPP network interfaces are created for +each local PPP endpoint. + +The L2TP protocol http://www.faqs.org/rfcs/rfc2661.html defines L2TP +control and data frames. L2TP control frames carry messages between +L2TP clients/servers and are used to setup / teardown tunnels and +sessions. An L2TP client or server is implemented in userspace and +will use a regular UDP socket per tunnel. L2TP data frames carry PPP +frames, which may be PPP control or PPP data. The kernel's PPP +subsystem arranges for PPP control frames to be delivered to pppd, +while data frames are forwarded as usual. + +Each tunnel and session within a tunnel is assigned a unique tunnel_id +and session_id. These ids are carried in the L2TP header of every +control and data packet. The pppol2tp driver uses them to lookup +internal tunnel and/or session contexts. Zero tunnel / session ids are +treated specially - zero ids are never assigned to tunnels or sessions +in the network. In the driver, the tunnel context keeps a pointer to +the tunnel UDP socket. The session context keeps a pointer to the +PPPoL2TP socket, as well as other data that lets the driver interface +to the kernel PPP subsystem. 
+ +Note that the pppol2tp kernel driver handles only L2TP data frames; +L2TP control frames are simply passed up to userspace in the UDP +tunnel socket. The kernel handles all datapath aspects of the +protocol, including data packet resequencing (if enabled). + +There are a number of requirements on the userspace L2TP daemon in +order to use the pppol2tp driver. + +1. Use a UDP socket per tunnel. + +2. Create a single PPPoL2TP socket per tunnel bound to a special null + session id. This is used only for communicating with the driver but + must remain open while the tunnel is active. Opening this tunnel + management socket causes the driver to mark the tunnel socket as an + L2TP UDP encapsulation socket and flags it for use by the + referenced tunnel id. This hooks up the UDP receive path via + udp_encap_rcv() in net/ipv4/udp.c. PPP data frames are never passed + in this special PPPoX socket. + +3. Create a PPPoL2TP socket per L2TP session. This is typically done + by starting pppd with the pppol2tp plugin and appropriate + arguments. A PPPoL2TP tunnel management socket (Step 2) must be + created before the first PPPoL2TP session socket is created. + +When creating PPPoL2TP sockets, the application provides information +to the driver about the socket in a socket connect() call. Source and +destination tunnel and session ids are provided, as well as the file +descriptor of a UDP socket. See struct pppol2tp_addr in +include/linux/if_ppp.h. Note that zero tunnel / session ids are +treated specially. When creating the per-tunnel PPPoL2TP management +socket in Step 2 above, zero source and destination session ids are +specified, which tells the driver to prepare the supplied UDP file +descriptor for use as an L2TP tunnel socket. + +Userspace may control behavior of the tunnel or session using +setsockopt and ioctl on the PPPoX socket. The following socket +options are supported:- + +DEBUG - bitmask of debug message categories. See below. 
+SENDSEQ - 0 => don't send packets with sequence numbers + 1 => send packets with sequence numbers +RECVSEQ - 0 => receive packet sequence numbers are optional + 1 => drop receive packets without sequence numbers +LNSMODE - 0 => act as LAC. + 1 => act as LNS. +REORDERTO - reorder timeout (in millisecs). If 0, don't try to reorder. + +Only the DEBUG option is supported by the special tunnel management +PPPoX socket. + +In addition to the standard PPP ioctls, a PPPIOCGL2TPSTATS is provided +to retrieve tunnel and session statistics from the kernel using the +PPPoX socket of the appropriate tunnel or session. + +Debugging +========= + +The driver supports a flexible debug scheme where kernel trace +messages may be optionally enabled per tunnel and per session. Care is +needed when debugging a live system since the messages are not +rate-limited and a busy system could be swamped. Userspace uses +setsockopt on the PPPoX socket to set a debug mask. + +The following debug mask bits are available: + +PPPOL2TP_MSG_DEBUG verbose debug (if compiled in) +PPPOL2TP_MSG_CONTROL userspace - kernel interface +PPPOL2TP_MSG_SEQ sequence numbers handling +PPPOL2TP_MSG_DATA data packets + +Sample Userspace Code +===================== + +1. 
Create tunnel management PPPoX socket + + kernel_fd = socket(AF_PPPOX, SOCK_DGRAM, PX_PROTO_OL2TP); + if (kernel_fd >= 0) { + struct sockaddr_pppol2tp sax; + struct sockaddr_in const *peer_addr; + + peer_addr = l2tp_tunnel_get_peer_addr(tunnel); + memset(&sax, 0, sizeof(sax)); + sax.sa_family = AF_PPPOX; + sax.sa_protocol = PX_PROTO_OL2TP; + sax.pppol2tp.fd = udp_fd; /* fd of tunnel UDP socket */ + sax.pppol2tp.addr.sin_addr.s_addr = peer_addr->sin_addr.s_addr; + sax.pppol2tp.addr.sin_port = peer_addr->sin_port; + sax.pppol2tp.addr.sin_family = AF_INET; + sax.pppol2tp.s_tunnel = tunnel_id; + sax.pppol2tp.s_session = 0; /* special case: mgmt socket */ + sax.pppol2tp.d_tunnel = 0; + sax.pppol2tp.d_session = 0; /* special case: mgmt socket */ + + if (connect(kernel_fd, (struct sockaddr *)&sax, sizeof(sax)) < 0) { + perror("connect failed"); + result = -errno; + goto err; + } + } + +2. Create session PPPoX data socket + + struct sockaddr_pppol2tp sax; + int fd; + + /* Note, the target socket must be bound already, else it will not be ready */ + memset(&sax, 0, sizeof(sax)); + sax.sa_family = AF_PPPOX; + sax.sa_protocol = PX_PROTO_OL2TP; + sax.pppol2tp.fd = tunnel_fd; + sax.pppol2tp.addr.sin_addr.s_addr = addr->sin_addr.s_addr; + sax.pppol2tp.addr.sin_port = addr->sin_port; + sax.pppol2tp.addr.sin_family = AF_INET; + sax.pppol2tp.s_tunnel = tunnel_id; + sax.pppol2tp.s_session = session_id; + sax.pppol2tp.d_tunnel = peer_tunnel_id; + sax.pppol2tp.d_session = peer_session_id; + + /* session_fd is the fd of the session's PPPoL2TP socket. + * tunnel_fd is the fd of the tunnel UDP socket. + */ + fd = connect(session_fd, (struct sockaddr *)&sax, sizeof(sax)); + if (fd < 0) { + return -errno; + } + return 0; + +Miscellaneous +============= + +The PPPoL2TP driver was developed as part of the OpenL2TP project by +Katalix Systems Ltd. OpenL2TP is a full-featured L2TP client / server, +designed from the ground up to have the L2TP datapath in the +kernel.
The project also implemented the pppol2tp plugin for pppd +which allows pppd to use the kernel driver. Details can be found at +http://openl2tp.sourceforge.net. diff --git a/Documentation/networking/mac80211-injection.txt b/Documentation/networking/mac80211-injection.txt new file mode 100644 index 000000000000..53ef7a06f49c --- /dev/null +++ b/Documentation/networking/mac80211-injection.txt @@ -0,0 +1,59 @@ +How to use packet injection with mac80211 +========================================= + +mac80211 now allows arbitrary packets to be injected down any Monitor Mode +interface from userland. The packet you inject needs to be composed in the +following format: + + [ radiotap header ] + [ ieee80211 header ] + [ payload ] + +The radiotap format is discussed in +./Documentation/networking/radiotap-headers.txt. + +Although 13 radiotap argument types are currently defined, most only make sense +on received packets. Currently three kinds of argument are used by +the injection code, although it knows to skip any other arguments that are +present (facilitating replay of captured radiotap headers directly): + + - IEEE80211_RADIOTAP_RATE - u8 arg in 500kbps units (0x02 --> 1Mbps) + + - IEEE80211_RADIOTAP_ANTENNA - u8 arg, 0x00 = ant1, 0x01 = ant2 + + - IEEE80211_RADIOTAP_DBM_TX_POWER - u8 arg, dBm + +Here is an example valid radiotap header defining these three parameters: + + 0x00, 0x00, // <-- radiotap version + 0x0b, 0x00, // <- radiotap header length + 0x04, 0x0c, 0x00, 0x00, // <-- bitmap + 0x6c, // <-- rate + 0x0c, //<-- tx power + 0x01 //<-- antenna + +The ieee80211 header follows immediately afterwards, looking for example like +this: + + 0x08, 0x01, 0x00, 0x00, + 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, + 0x13, 0x22, 0x33, 0x44, 0x55, 0x66, + 0x13, 0x22, 0x33, 0x44, 0x55, 0x66, + 0x10, 0x86 + +Lastly there is the payload. + +After the packet contents are composed, the packet is sent by send()-ing it to a logical +mac80211 interface that is in Monitor mode.
Libpcap can also be used, +(which is easier than doing the work to bind the socket to the right +interface), along the following lines: + + ppcap = pcap_open_live(szInterfaceName, 800, 1, 20, szErrbuf); +... + r = pcap_inject(ppcap, u8aSendBuffer, nLength); + +You can also find sources for a complete inject test applet here: + +http://penumbra.warmcat.com/_twk/tiki-index.php?page=packetspammer + +Andy Green <andy@warmcat.com> diff --git a/Documentation/networking/multiqueue.txt b/Documentation/networking/multiqueue.txt new file mode 100644 index 000000000000..00b60cce2224 --- /dev/null +++ b/Documentation/networking/multiqueue.txt @@ -0,0 +1,111 @@ + + HOWTO for multiqueue network device support + =========================================== + +Section 1: Base driver requirements for implementing multiqueue support +Section 2: Qdisc support for multiqueue devices +Section 3: Brief howto using PRIO or RR for multiqueue devices + + +Intro: Kernel support for multiqueue devices +--------------------------------------------------------- + +Kernel support for multiqueue devices is only an API that is presented to the +netdevice layer for base drivers to implement. This feature is part of the +core networking stack, and all network devices will be running on the +multiqueue-aware stack. If a base driver only has one queue, then these +changes are transparent to that driver. + + +Section 1: Base driver requirements for implementing multiqueue support +----------------------------------------------------------------------- + +Base drivers are required to use the new alloc_etherdev_mq() or +alloc_netdev_mq() functions to allocate the subqueues for the device. The +underlying kernel API will take care of the allocation and deallocation of +the subqueue memory, as well as netdev configuration of where the queues +exist in memory. + +The base driver will also need to manage the queues as it does the global +netdev->queue_lock today. 
Therefore base drivers should use the +netif_{start|stop|wake}_subqueue() functions to manage each queue while the +device is still operational. netdev->queue_lock is still used when the device +comes online or when it's completely shut down (unregister_netdev(), etc.). + +Finally, the base driver should indicate that it is a multiqueue device. The +feature flag NETIF_F_MULTI_QUEUE should be added to the netdev->features +bitmap on device initialization. Below is an example from e1000: + +#ifdef CONFIG_E1000_MQ + if ( (adapter->hw.mac.type == e1000_82571) || + (adapter->hw.mac.type == e1000_82572) || + (adapter->hw.mac.type == e1000_80003es2lan)) + netdev->features |= NETIF_F_MULTI_QUEUE; +#endif + + +Section 2: Qdisc support for multiqueue devices +----------------------------------------------- + +Currently two qdiscs support multiqueue devices. A new round-robin qdisc, +sch_rr, and sch_prio. The qdisc is responsible for classifying the skb's to +bands and queues, and will store the queue mapping into skb->queue_mapping. +Use this field in the base driver to determine which queue to send the skb +to. + +sch_rr has been added for hardware that doesn't want scheduling policies from +software, so it's a straight round-robin qdisc. It uses the same syntax and +classification priomap that sch_prio uses, so it should be intuitive to +configure for people who've used sch_prio. + +The PRIO qdisc naturally plugs into a multiqueue device. If PRIO has been +built with NET_SCH_PRIO_MQ, then upon load, it will make sure the number of +bands requested is equal to the number of queues on the hardware. If they +are equal, it sets a one-to-one mapping up between the queues and bands. If +they're not equal, it will not load the qdisc. This is the same behavior +for RR. Once the association is made, any skb that is classified will have +skb->queue_mapping set, which will allow the driver to properly queue skb's +to multiple queues. 
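The band/queue check described in Section 2 can be modelled in a few lines of userspace C. This is only an illustration of the rule as the text states it, not the sch_prio or sch_rr code; the function name mq_map_bands and its arguments are hypothetical.

```c
#include <stddef.h>

/*
 * Hypothetical model of the rule in Section 2: the qdisc only loads when
 * the number of bands equals the number of hardware queues, and it then
 * maps band N to queue N.  Returns 0 on success, or -1 if the qdisc
 * would refuse to load.
 */
static int mq_map_bands(size_t bands, size_t queues, size_t *band2queue)
{
	size_t i;

	if (bands != queues)
		return -1;

	for (i = 0; i < bands; i++)
		band2queue[i] = i;	/* one-to-one mapping */
	return 0;
}
```

In the real stack the chosen queue ends up in skb->queue_mapping, which the base driver reads when transmitting; that part is not modelled here.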
+ + +Section 3: Brief howto using PRIO and RR for multiqueue devices +--------------------------------------------------------------- + +The userspace command 'tc', part of the iproute2 package, is used to configure +qdiscs. To add the PRIO qdisc to your network device, assuming the device is +called eth0, run the following command: + +# tc qdisc add dev eth0 root handle 1: prio bands 4 multiqueue + +This will create 4 bands, 0 being highest priority, and associate those bands +to the queues on your NIC. Assuming eth0 has 4 Tx queues, the band mapping +would look like: + +band 0 => queue 0 +band 1 => queue 1 +band 2 => queue 2 +band 3 => queue 3 + +Traffic will begin flowing through each queue if your TOS values are assigning +traffic across the various bands. For example, ssh traffic will always try to +go out band 0 based on TOS -> Linux priority conversion (realtime traffic), +so it will be sent out queue 0. ICMP traffic (pings) falls into the "normal" +traffic classification, which is band 1. Therefore pings will be sent out +queue 1 on the NIC. + +Note the use of the multiqueue keyword. This is only in versions of iproute2 +that support multiqueue networking devices; if this is omitted when loading +a qdisc onto a multiqueue device, the qdisc will load and operate the same +as if it were loaded onto a single-queue device (i.e. it sends all traffic to +queue 0). + +An alternative way to allocate bands is to use the +multiqueue option and specify 0 bands. If this is the case, the qdisc will +allocate the number of bands to equal the number of queues that the device +reports, and bring the qdisc online. + +The behavior of tc filters remains the same, where it will override TOS priority +classification. + + +Author: Peter P. Waskiewicz Jr. <peter.p.waskiewicz.jr@intel.com> diff --git a/Documentation/networking/net-modules.txt b/Documentation/networking/net-modules.txt index 0b27863f155c..98c4392dd0fd 100644 --- a/Documentation/networking/net-modules.txt +++ b/Documentation/networking/net-modules.txt @@ -146,12 +146,6 @@ at1700.c: irq = 0 (Probes ports: 0x260, 0x280, 0x2A0, 0x240, 0x340, 0x320, 0x380, 0x300) -atari_bionet.c: - Supports full autoprobing. (m68k/Atari) - -atari_pamsnet.c: - Supports full autoprobing. (m68k/Atari) - atarilance.c: Supports full autoprobing. (m68k/Atari) diff --git a/Documentation/networking/netdevices.txt b/Documentation/networking/netdevices.txt index ce1361f95243..37869295fc70 100644 --- a/Documentation/networking/netdevices.txt +++ b/Documentation/networking/netdevices.txt @@ -20,6 +20,30 @@ private data which gets freed when the network device is freed. If separately allocated data is attached to the network device (dev->priv) then it is up to the module exit handler to free that. +MTU +=== +Each network device has a Maximum Transfer Unit. The MTU does not +include any link layer protocol overhead. Upper layer protocols must +not pass a socket buffer (skb) to a device to transmit with more data +than the MTU. For example, on Ethernet, if the standard MTU of 1500 bytes is used, the +actual skb will contain up to 1514 bytes because of the Ethernet +header. Devices should allow for the 4 byte VLAN header as well. + +Segmentation Offload (GSO, TSO) is an exception to this rule. The +upper layer protocol may pass a large socket buffer to the device +transmit routine, and the device will break that up into separate +packets based on the current MTU. + +MTU is symmetrical and applies both to receive and transmit. A device +must be able to receive at least the maximum size packet allowed by +the MTU. A network device may use the MTU as a mechanism to size receive +buffers, but the device should allow packets with a VLAN header.
With +standard Ethernet mtu of 1500 bytes, the device should allow up to +1518 byte packets (1500 + 14 header + 4 tag). The device may either: +drop, truncate, or pass up oversize packets, but dropping oversize +packets is preferred. + struct net_device synchronization rules ======================================= @@ -43,16 +67,17 @@ dev->get_stats: dev->hard_start_xmit: Synchronization: netif_tx_lock spinlock. + When the driver sets NETIF_F_LLTX in dev->features this will be called without holding netif_tx_lock. In this case the driver has to lock by itself when needed. It is recommended to use a try lock - for this and return -1 when the spin lock fails. + for this and return NETDEV_TX_LOCKED when the spin lock fails. The locking there should also properly protect against - set_multicast_list - Context: Process with BHs disabled or BH (timer). - Notes: netif_queue_stopped() is guaranteed false - Interrupts must be enabled when calling hard_start_xmit. - (Interrupts must also be enabled when enabling the BH handler.) + set_multicast_list. + + Context: Process with BHs disabled or BH (timer), + will be called with interrupts disabled by netconsole. + Return codes: o NETDEV_TX_OK everything ok. o NETDEV_TX_BUSY Cannot transmit packet, try later @@ -74,4 +99,5 @@ dev->poll: Synchronization: __LINK_STATE_RX_SCHED bit in dev->state. See dev_close code and comments in net/core/dev.c for more info. Context: softirq + will be called with interrupts disabled by netconsole. 
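The frame-size arithmetic from the MTU section of the netdevices.txt hunk above can be written out as a small userspace sketch. The helper and macro names here are made up for illustration; the 14-byte Ethernet header and 4-byte VLAN tag sizes are the ones quoted in the text.

```c
/* Sizes quoted in the MTU text: Ethernet header and VLAN tag, in bytes. */
#define ETH_HEADER_LEN 14
#define VLAN_TAG_LEN    4

/* Largest skb an upper layer may hand to an Ethernet device (no GSO/TSO). */
static unsigned int max_tx_frame(unsigned int mtu)
{
	return mtu + ETH_HEADER_LEN;
}

/* Smallest receive size a device should accept, allowing for one VLAN tag. */
static unsigned int min_rx_frame(unsigned int mtu)
{
	return mtu + ETH_HEADER_LEN + VLAN_TAG_LEN;
}
```

With the standard MTU of 1500, max_tx_frame() gives 1514 and min_rx_frame() gives 1518, matching the figures in the text.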
diff --git a/Documentation/networking/radiotap-headers.txt b/Documentation/networking/radiotap-headers.txt new file mode 100644 index 000000000000..953331c7984f --- /dev/null +++ b/Documentation/networking/radiotap-headers.txt @@ -0,0 +1,152 @@ +How to use radiotap headers +=========================== + +Pointer to the radiotap include file +------------------------------------ + +Radiotap headers are variable-length and extensible, you can get most of the +information you need to know on them from: + +./include/net/ieee80211_radiotap.h + +This document gives an overview and warns on some corner cases. + + +Structure of the header +----------------------- + +There is a fixed portion at the start which contains a u32 bitmap that defines +if the possible argument associated with that bit is present or not. So if b0 +of the it_present member of ieee80211_radiotap_header is set, it means that +the header for argument index 0 (IEEE80211_RADIOTAP_TSFT) is present in the +argument area. + + < 8-byte ieee80211_radiotap_header > + [ <possible argument bitmap extensions ... > ] + [ <argument> ... ] + +At the moment there are only 13 possible argument indexes defined, but in case +we run out of space in the u32 it_present member, it is defined that b31 set +indicates that there is another u32 bitmap following (shown as "possible +argument bitmap extensions..." above), and the start of the arguments is moved +forward 4 bytes each time. + +Note also that the it_len member __le16 is set to the total number of bytes +covered by the ieee80211_radiotap_header and any arguments following. + + +Requirements for arguments +-------------------------- + +After the fixed part of the header, the arguments follow for each argument +index whose matching bit is set in the it_present member of +ieee80211_radiotap_header. + + - the arguments are all stored little-endian! + + - the argument payload for a given argument index has a fixed size. 
So + IEEE80211_RADIOTAP_TSFT being present always indicates an 8-byte argument is + present. See the comments in ./include/net/ieee80211_radiotap.h for a nice + breakdown of all the argument sizes + + - the arguments must be aligned to a boundary of the argument size using + padding. So a u16 argument must start on the next u16 boundary if it isn't + already on one, a u32 must start on the next u32 boundary and so on. + + - "alignment" is relative to the start of the ieee80211_radiotap_header, ie, + the first byte of the radiotap header. The absolute alignment of that first + byte isn't defined. So even if the whole radiotap header is starting at, eg, + address 0x00000003, still the first byte of the radiotap header is treated as + 0 for alignment purposes. + + - the above point that there may be no absolute alignment for multibyte + entities in the fixed radiotap header or the argument region means that you + have to take special evasive action when trying to access these multibyte + entities. Some arches like Blackfin cannot deal with an attempt to + dereference, eg, a u16 pointer that is pointing to an odd address. Instead + you have to use a kernel API get_unaligned() to dereference the pointer, + which will do it bytewise on the arches that require that. + + - The arguments for a given argument index can be a compound of multiple types + together. For example IEEE80211_RADIOTAP_CHANNEL has an argument payload + consisting of two u16s of total length 4. When this happens, the padding + rule is applied dealing with a u16, NOT dealing with a 4-byte single entity. 
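The padding rules above can be sketched in plain C. This is only a minimal userspace illustration, not code from the kernel tree: the names rt_arg_layout, rt_align_up, rt_arg_offset and rt_read_le16 are hypothetical. The sizes used in the note below (8 bytes for TSFT, two u16s for the channel, a u8 for the rate) are the ones mentioned in this document.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Size and alignment of one argument payload, in bytes (hypothetical type). */
struct rt_arg_layout {
	size_t size;
	size_t align;
};

/* Round offset up to the next multiple of align. */
static size_t rt_align_up(size_t offset, size_t align)
{
	assert(align != 0);
	return (offset + align - 1) / align * align;
}

/*
 * prev_end is the offset, counted from the first byte of the radiotap
 * header, just past the previous argument (or past the bitmap words).
 * Returns the offset at which the next argument starts.
 */
static size_t rt_arg_offset(size_t prev_end, struct rt_arg_layout arg)
{
	return rt_align_up(prev_end, arg.align);
}

/*
 * Read a little-endian u16 one byte at a time, so it works at any
 * address. In the kernel, get_unaligned() handles the alignment part.
 */
static uint16_t rt_read_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}
```

With an 8-byte fixed header and TSFT, rate and channel present, this gives offsets 8, 16 and 18: the channel is padded to a u16 boundary, not to a 4-byte one. Offsets are always counted from the start of the radiotap header, never from the absolute address.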
+ + +Example valid radiotap header +----------------------------- + + 0x00, 0x00, // <-- radiotap version + pad byte + 0x0b, 0x00, // <- radiotap header length + 0x04, 0x0c, 0x00, 0x00, // <-- bitmap + 0x6c, // <-- rate (in 500kbps units) + 0x0c, //<-- tx power + 0x01 //<-- antenna + + +Using the Radiotap Parser +------------------------- + +If you are having to parse a radiotap struct, you can radically simplify the +job by using the radiotap parser that lives in net/wireless/radiotap.c and has +its prototypes available in include/net/cfg80211.h. You use it like this: + +#include <net/cfg80211.h> + +/* buf points to the start of the radiotap header part */ + +int MyFunction(u8 * buf, int buflen) +{ + int pkt_rate_100kHz = 0, antenna = 0, pwr = 0; + struct ieee80211_radiotap_iterator iterator; + int ret = ieee80211_radiotap_iterator_init(&iterator, buf, buflen); + + while (!ret) { + + ret = ieee80211_radiotap_iterator_next(&iterator); + + if (ret) + continue; + + /* see if this argument is something we can use */ + + switch (iterator.this_arg_index) { + /* + * You must take care when dereferencing iterator.this_arg + * for multibyte types... the pointer is not aligned. Use + * get_unaligned((type *)iterator.this_arg) to dereference + * iterator.this_arg for type "type" safely on all arches. + */ + case IEEE80211_RADIOTAP_RATE: + /* radiotap "rate" u8 is in + * 500kbps units, eg, 0x02=1Mbps + */ + pkt_rate_100kHz = (*iterator.this_arg) * 5; + break; + + case IEEE80211_RADIOTAP_ANTENNA: + /* radiotap uses 0 for 1st ant */ + antenna = *iterator.this_arg; + break; + + case IEEE80211_RADIOTAP_DBM_TX_POWER: + pwr = *iterator.this_arg; + break; + + default: + break; + } + } /* while more rt headers */ + + if (ret != -ENOENT) + return TXRX_DROP; + + /* discard the radiotap header part */ + buf += iterator.max_length; + buflen -= iterator.max_length; + + ...
+ +} + +Andy Green <andy@warmcat.com> diff --git a/Documentation/networking/sk98lin.txt b/Documentation/networking/sk98lin.txt deleted file mode 100644 index 8590a954df1d..000000000000 --- a/Documentation/networking/sk98lin.txt +++ /dev/null @@ -1,568 +0,0 @@ -(C)Copyright 1999-2004 Marvell(R). -All rights reserved -=========================================================================== - -sk98lin.txt created 13-Feb-2004 - -Readme File for sk98lin v6.23 -Marvell Yukon/SysKonnect SK-98xx Gigabit Ethernet Adapter family driver for LINUX - -This file contains - 1 Overview - 2 Required Files - 3 Installation - 3.1 Driver Installation - 3.2 Inclusion of adapter at system start - 4 Driver Parameters - 4.1 Per-Port Parameters - 4.2 Adapter Parameters - 5 Large Frame Support - 6 VLAN and Link Aggregation Support (IEEE 802.1, 802.1q, 802.3ad) - 7 Troubleshooting - -=========================================================================== - - -1 Overview -=========== - -The sk98lin driver supports the Marvell Yukon and SysKonnect -SK-98xx/SK-95xx compliant Gigabit Ethernet Adapter on Linux. It has -been tested with Linux on Intel/x86 machines. -*** - - -2 Required Files -================= - -The linux kernel source. -No additional files required. -*** - - -3 Installation -=============== - -It is recommended to download the latest version of the driver from the -SysKonnect web site www.syskonnect.com. If you have downloaded the latest -driver, the Linux kernel has to be patched before the driver can be -installed. For details on how to patch a Linux kernel, refer to the -patch.txt file. - -3.1 Driver Installation ------------------------- - -The following steps describe the actions that are required to install -the driver and to start it manually. These steps should be carried -out for the initial driver setup. Once confirmed to be ok, they can -be included in the system start. - -NOTE 1: To perform the following tasks you need 'root' access. 
- -NOTE 2: In case of problems, please read the section "Troubleshooting" - below. - -The driver can either be integrated into the kernel or it can be compiled -as a module. Select the appropriate option during the kernel -configuration. - -Compile/use the driver as a module ----------------------------------- -To compile the driver, go to the directory /usr/src/linux and -execute the command "make menuconfig" or "make xconfig" and proceed as -follows: - -To integrate the driver permanently into the kernel, proceed as follows: - -1. Select the menu "Network device support" and then "Ethernet(1000Mbit)" -2. Mark "Marvell Yukon Chipset / SysKonnect SK-98xx family support" - with (*) -3. Build a new kernel when the configuration of the above options is - finished. -4. Install the new kernel. -5. Reboot your system. - -To use the driver as a module, proceed as follows: - -1. Enable 'loadable module support' in the kernel. -2. For automatic driver start, enable the 'Kernel module loader'. -3. Select the menu "Network device support" and then "Ethernet(1000Mbit)" -4. Mark "Marvell Yukon Chipset / SysKonnect SK-98xx family support" - with (M) -5. Execute the command "make modules". -6. Execute the command "make modules_install". - The appropriate modules will be installed. -7. Reboot your system. - - -Load the module manually ------------------------- -To load the module manually, proceed as follows: - -1. Enter "modprobe sk98lin". -2. If a Marvell Yukon or SysKonnect SK-98xx adapter is installed in - your computer and you have a /proc file system, execute the command: - "ls /proc/net/sk98lin/" - This should produce an output containing a line with the following - format: - eth0 eth1 ... - which indicates that your adapter has been found and initialized. - - NOTE 1: If you have more than one Marvell Yukon or SysKonnect SK-98xx - adapter installed, the adapters will be listed as 'eth0', - 'eth1', 'eth2', etc. - For each adapter, repeat steps 3 and 4 below. 
- - NOTE 2: If you have other Ethernet adapters installed, your Marvell - Yukon or SysKonnect SK-98xx adapter will be mapped to the - next available number, e.g. 'eth1'. The mapping is executed - automatically. - The module installation message (displayed either in a system - log file or on the console) prints a line for each adapter - found containing the corresponding 'ethX'. - -3. Select an IP address and assign it to the respective adapter by - entering: - ifconfig eth0 <ip-address> - With this command, the adapter is connected to the Ethernet. - - SK-98xx Gigabit Ethernet Server Adapters: The yellow LED on the adapter - is now active, the link status LED of the primary port is active and - the link status LED of the secondary port (on dual port adapters) is - blinking (if the ports are connected to a switch or hub). - SK-98xx V2.0 Gigabit Ethernet Adapters: The link status LED is active. - In addition, you will receive a status message on the console stating - "ethX: network connection up using port Y" and showing the selected - connection parameters (x stands for the ethernet device number - (0,1,2, etc), y stands for the port name (A or B)). - - NOTE: If you are in doubt about IP addresses, ask your network - administrator for assistance. - -4. Your adapter should now be fully operational. - Use 'ping <otherstation>' to verify the connection to other computers - on your network. -5. To check the adapter configuration view /proc/net/sk98lin/[devicename]. - For example by executing: - "cat /proc/net/sk98lin/eth0" - -Unload the module ------------------ -To stop and unload the driver modules, proceed as follows: - -1. Execute the command "ifconfig eth0 down". -2. Execute the command "rmmod sk98lin". - -3.2 Inclusion of adapter at system start ------------------------------------------ - -Since a large number of different Linux distributions are -available, we are unable to describe a general installation procedure -for the driver module. 
-Because the driver is now integrated in the kernel, installation should -be easy, using the standard mechanism of your distribution. -Refer to the distribution's manual for installation of ethernet adapters. - -*** - -4 Driver Parameters -==================== - -Parameters can be set at the command line after the module has been -loaded with the command 'modprobe'. -In some distributions, the configuration tools are able to pass parameters -to the driver module. - -If you use the kernel module loader, you can set driver parameters -in the file /etc/modprobe.conf (or /etc/modules.conf in 2.4 or earlier). -To set the driver parameters in this file, proceed as follows: - -1. Insert a line of the form : - options sk98lin ... - For "...", the same syntax is required as described for the command - line parameters of modprobe below. -2. To activate the new parameters, either reboot your computer - or - unload and reload the driver. - The syntax of the driver parameters is: - - modprobe sk98lin parameter=value1[,value2[,value3...]] - - where value1 refers to the first adapter, value2 to the second etc. - -NOTE: All parameters are case sensitive. Write them exactly as shown - below. - -Example: -Suppose you have two adapters. You want to set auto-negotiation -on the first adapter to ON and on the second adapter to OFF. -You also want to set DuplexCapabilities on the first adapter -to FULL, and on the second adapter to HALF. -Then, you must enter: - - modprobe sk98lin AutoNeg_A=On,Off DupCap_A=Full,Half - -NOTE: The number of adapters that can be configured this way is - limited in the driver (file skge.c, constant SK_MAX_CARD_PARAM). - The current limit is 16. If you happen to install - more adapters, adjust this and recompile. - - -4.1 Per-Port Parameters ------------------------- - -These settings are available for each port on the adapter. -In the following description, '?' stands for the port for -which you set the parameter (A or B). 
- -Speed ------ -Parameter: Speed_? -Values: 10, 100, 1000, Auto -Default: Auto - -This parameter is used to set the speed capabilities. It is only valid -for the SK-98xx V2.0 copper adapters. -Usually, the speed is negotiated between the two ports during link -establishment. If this fails, a port can be forced to a specific setting -with this parameter. - -Auto-Negotiation ----------------- -Parameter: AutoNeg_? -Values: On, Off, Sense -Default: On - -The "Sense"-mode automatically detects whether the link partner supports -auto-negotiation or not. - -Duplex Capabilities -------------------- -Parameter: DupCap_? -Values: Half, Full, Both -Default: Both - -This parameters is only relevant if auto-negotiation for this port is -not set to "Sense". If auto-negotiation is set to "On", all three values -are possible. If it is set to "Off", only "Full" and "Half" are allowed. -This parameter is useful if your link partner does not support all -possible combinations. - -Flow Control ------------- -Parameter: FlowCtrl_? -Values: Sym, SymOrRem, LocSend, None -Default: SymOrRem - -This parameter can be used to set the flow control capabilities the -port reports during auto-negotiation. It can be set for each port -individually. -Possible modes: - -- Sym = Symmetric: both link partners are allowed to send - PAUSE frames - -- SymOrRem = SymmetricOrRemote: both or only remote partner - are allowed to send PAUSE frames - -- LocSend = LocalSend: only local link partner is allowed - to send PAUSE frames - -- None = no link partner is allowed to send PAUSE frames - -NOTE: This parameter is ignored if auto-negotiation is set to "Off". - -Role in Master-Slave-Negotiation (1000Base-T only) --------------------------------------------------- -Parameter: Role_? -Values: Auto, Master, Slave -Default: Auto - -This parameter is only valid for the SK-9821 and SK-9822 adapters. 
-For two 1000Base-T ports to communicate, one must take the role of the -master (providing timing information), while the other must be the -slave. Usually, this is negotiated between the two ports during link -establishment. If this fails, a port can be forced to a specific setting -with this parameter. - - -4.2 Adapter Parameters ------------------------ - -Connection Type (SK-98xx V2.0 copper adapters only) ---------------- -Parameter: ConType -Values: Auto, 100FD, 100HD, 10FD, 10HD -Default: Auto - -The parameter 'ConType' is a combination of all five per-port parameters -within one single parameter. This simplifies the configuration of both ports -of an adapter card! The different values of this variable reflect the most -meaningful combinations of port parameters. - -The following table shows the values of 'ConType' and the corresponding -combinations of the per-port parameters: - - ConType | DupCap AutoNeg FlowCtrl Role Speed - ----------+------------------------------------------------------ - Auto | Both On SymOrRem Auto Auto - 100FD | Full Off None Auto (ignored) 100 - 100HD | Half Off None Auto (ignored) 100 - 10FD | Full Off None Auto (ignored) 10 - 10HD | Half Off None Auto (ignored) 10 - -Stating any other port parameter together with this 'ConType' variable -will result in a merged configuration of those settings. This due to -the fact, that the per-port parameters (e.g. Speed_? ) have a higher -priority than the combined variable 'ConType'. - -NOTE: This parameter is always used on both ports of the adapter card. - -Interrupt Moderation --------------------- -Parameter: Moderation -Values: None, Static, Dynamic -Default: None - -Interrupt moderation is employed to limit the maximum number of interrupts -the driver has to serve. That is, one or more interrupts (which indicate any -transmit or receive packet to be processed) are queued until the driver -processes them. 
When queued interrupts are to be served, is determined by the -'IntsPerSec' parameter, which is explained later below. - -Possible modes: - - -- None - No interrupt moderation is applied on the adapter card. - Therefore, each transmit or receive interrupt is served immediately - as soon as it appears on the interrupt line of the adapter card. - - -- Static - Interrupt moderation is applied on the adapter card. - All transmit and receive interrupts are queued until a complete - moderation interval ends. If such a moderation interval ends, all - queued interrupts are processed in one big bunch without any delay. - The term 'static' reflects the fact, that interrupt moderation is - always enabled, regardless how much network load is currently - passing via a particular interface. In addition, the duration of - the moderation interval has a fixed length that never changes while - the driver is operational. - - -- Dynamic - Interrupt moderation might be applied on the adapter card, - depending on the load of the system. If the driver detects that the - system load is too high, the driver tries to shield the system against - too much network load by enabling interrupt moderation. If - at a later - time - the CPU utilization decreases again (or if the network load is - negligible) the interrupt moderation will automatically be disabled. - -Interrupt moderation should be used when the driver has to handle one or more -interfaces with a high network load, which - as a consequence - leads also to a -high CPU utilization. When moderation is applied in such high network load -situations, CPU load might be reduced by 20-30%. - -NOTE: The drawback of using interrupt moderation is an increase of the round- -trip-time (RTT), due to the queueing and serving of interrupts at dedicated -moderation times. 
- -Interrupts per second ---------------------- -Parameter: IntsPerSec -Values: 30...40000 (interrupts per second) -Default: 2000 - -This parameter is only used if either static or dynamic interrupt moderation -is used on a network adapter card. Using this parameter if no moderation is -applied will lead to no action performed. - -This parameter determines the length of any interrupt moderation interval. -Assuming that static interrupt moderation is to be used, an 'IntsPerSec' -parameter value of 2000 will lead to an interrupt moderation interval of -500 microseconds. - -NOTE: The duration of the moderation interval is to be chosen with care. -At first glance, selecting a very long duration (e.g. only 100 interrupts per -second) seems to be meaningful, but the increase of packet-processing delay -is tremendous. On the other hand, selecting a very short moderation time might -compensate the use of any moderation being applied. - - -Preferred Port --------------- -Parameter: PrefPort -Values: A, B -Default: A - -This is used to force the preferred port to A or B (on dual-port network -adapters). The preferred port is the one that is used if both are detected -as fully functional. - -RLMT Mode (Redundant Link Management Technology) ------------------------------------------------- -Parameter: RlmtMode -Values: CheckLinkState,CheckLocalPort, CheckSeg, DualNet -Default: CheckLinkState - -RLMT monitors the status of the port. If the link of the active port -fails, RLMT switches immediately to the standby link. The virtual link is -maintained as long as at least one 'physical' link is up. - -Possible modes: - - -- CheckLinkState - Check link state only: RLMT uses the link state - reported by the adapter hardware for each individual port to - determine whether a port can be used for all network traffic or - not. - - -- CheckLocalPort - In this mode, RLMT monitors the network path - between the two ports of an adapter by regularly exchanging packets - between them. 
This mode requires a network configuration in which - the two ports are able to "see" each other (i.e. there must not be - any router between the ports). - - -- CheckSeg - Check local port and segmentation: This mode supports the - same functions as the CheckLocalPort mode and additionally checks - network segmentation between the ports. Therefore, this mode is only - to be used if Gigabit Ethernet switches are installed on the network - that have been configured to use the Spanning Tree protocol. - - -- DualNet - In this mode, ports A and B are used as separate devices. - If you have a dual port adapter, port A will be configured as eth0 - and port B as eth1. Both ports can be used independently with - distinct IP addresses. The preferred port setting is not used. - RLMT is turned off. - -NOTE: RLMT modes CLP and CLPSS are designed to operate in configurations - where a network path between the ports on one adapter exists. - Moreover, they are not designed to work where adapters are connected - back-to-back. -*** - - -5 Large Frame Support -====================== - -The driver supports large frames (also called jumbo frames). Using large -frames can result in an improved throughput if transferring large amounts -of data. -To enable large frames, set the MTU (maximum transfer unit) of the -interface to the desired value (up to 9000), execute the following -command: - ifconfig eth0 mtu 9000 -This will only work if you have two adapters connected back-to-back -or if you use a switch that supports large frames. When using a switch, -it should be configured to allow large frames and auto-negotiation should -be set to OFF. The setting must be configured on all adapters that can be -reached by the large frames. If one adapter is not set to receive large -frames, it will simply drop them. 
- -You can switch back to the standard ethernet frame size by executing the -following command: - ifconfig eth0 mtu 1500 - -To permanently configure this setting, add a script with the 'ifconfig' -line to the system startup sequence (named something like "S99sk98lin" -in /etc/rc.d/rc2.d). -*** - - -6 VLAN and Link Aggregation Support (IEEE 802.1, 802.1q, 802.3ad) -================================================================== - -The Marvell Yukon/SysKonnect Linux drivers are able to support VLAN and -Link Aggregation according to IEEE standards 802.1, 802.1q, and 802.3ad. -These features are only available after installation of open source -modules available on the Internet: -For VLAN go to: http://www.candelatech.com/~greear/vlan.html -For Link Aggregation go to: http://www.st.rim.or.jp/~yumo - -NOTE: SysKonnect GmbH does not offer any support for these open source - modules and does not take the responsibility for any kind of - failures or problems arising in connection with these modules. - -NOTE: Configuring Link Aggregation on a SysKonnect dual link adapter may - cause problems when unloading the driver. - - -7 Troubleshooting -================== - -If any problems occur during the installation process, check the -following list: - - -Problem: The SK-98xx adapter cannot be found by the driver. -Solution: In /proc/pci search for the following entry: - 'Ethernet controller: SysKonnect SK-98xx ...' - If this entry exists, the SK-98xx or SK-98xx V2.0 adapter has - been found by the system and should be operational. - If this entry does not exist or if the file '/proc/pci' is not - found, there may be a hardware problem or the PCI support may - not be enabled in your kernel. - The adapter can be checked using the diagnostics program which - is available on the SysKonnect web site: - www.syskonnect.com - - Some COMPAQ machines have problems dealing with PCI under Linux. 
- This problem is described in the 'PCI howto' document - (included in some distributions or available from the - web, e.g. at 'www.linux.org'). - - -Problem: Programs such as 'ifconfig' or 'route' cannot be found or the - error message 'Operation not permitted' is displayed. -Reason: You are not logged in as user 'root'. -Solution: Logout and login as 'root' or change to 'root' via 'su'. - - -Problem: Upon use of the command 'ping <address>' the message - "ping: sendto: Network is unreachable" is displayed. -Reason: Your route is not set correctly. -Solution: If you are using RedHat, you probably forgot to set up the - route in the 'network configuration'. - Check the existing routes with the 'route' command and check - if an entry for 'eth0' exists, and if so, if it is set correctly. - - -Problem: The driver can be started, the adapter is connected to the - network, but you cannot receive or transmit any packets; - e.g. 'ping' does not work. -Reason: There is an incorrect route in your routing table. -Solution: Check the routing table with the command 'route' and read the - manual help pages dealing with routes (enter 'man route'). - -NOTE: Although the 2.2.x kernel versions generate the routing entry - automatically, problems of this kind may occur here as well. We've - come across a situation in which the driver started correctly at - system start, but after the driver has been removed and reloaded, - the route of the adapter's network pointed to the 'dummy0'device - and had to be corrected manually. - - -Problem: Your computer should act as a router between multiple - IP subnetworks (using multiple adapters), but computers in - other subnetworks cannot be reached. -Reason: Either the router's kernel is not configured for IP forwarding - or the routing table and gateway configuration of at least one - computer is not working. 
- -Problem: Upon driver start, the following error message is displayed: - "eth0: -- ERROR -- - Class: internal Software error - Nr: 0xcc - Msg: SkGeInitPort() cannot init running ports" -Reason: You are using a driver compiled for single processor machines - on a multiprocessor machine with SMP (Symmetric MultiProcessor) - kernel. -Solution: Configure your kernel appropriately and recompile the kernel or - the modules. - - - -If your problem is not listed here, please contact SysKonnect's technical -support for help (linux@syskonnect.de). -When contacting our technical support, please ensure that the following -information is available: -- System Manufacturer and HW Informations (CPU, Memory... ) -- PCI-Boards in your system -- Distribution -- Kernel version -- Driver version -*** - - - -***End of Readme File*** diff --git a/Documentation/networking/spider_net.txt b/Documentation/networking/spider_net.txt new file mode 100644 index 000000000000..4b4adb8eb14f --- /dev/null +++ b/Documentation/networking/spider_net.txt @@ -0,0 +1,204 @@ + + The Spidernet Device Driver + =========================== + +Written by Linas Vepstas <linas@austin.ibm.com> + +Version of 7 June 2007 + +Abstract +======== +This document sketches the structure of portions of the spidernet +device driver in the Linux kernel tree. The spidernet is a gigabit +ethernet device built into the Toshiba southbridge commonly used +in the SONY Playstation 3 and the IBM QS20 Cell blade. + +The Structure of the RX Ring. +============================= +The receive (RX) ring is a circular linked list of RX descriptors, +together with three pointers into the ring that are used to manage its +contents. + +The elements of the ring are called "descriptors" or "descrs"; they +describe the received data. This includes a pointer to a buffer +containing the received data, the buffer size, and various status bits. + +There are three primary states that a descriptor can be in: "empty", +"full" and "not-in-use". 
An "empty" or "ready" descriptor is ready +to receive data from the hardware. A "full" descriptor has data in it, +and is waiting to be emptied and processed by the OS. A "not-in-use" +descriptor is neither empty nor full; it is simply not ready. It may +not even have a data buffer in it, or is otherwise unusable. + +During normal operation, on device startup, the OS (specifically, the +spidernet device driver) allocates a set of RX descriptors and RX +buffers. These are all marked "empty", ready to receive data. This +ring is handed off to the hardware, which sequentially fills in the +buffers, and marks them "full". The OS follows up, taking the full +buffers, processing them, and re-marking them empty. + +This filling and emptying is managed by three pointers, the "head" +and "tail" pointers, managed by the OS, and a hardware current +descriptor pointer (GDACTDPA). The GDACTDPA points at the descr +currently being filled. When this descr is filled, the hardware +marks it full, and advances the GDACTDPA by one. Thus, when there is +flowing RX traffic, every descr behind it should be marked "full", +and everything in front of it should be "empty". If the hardware +discovers that the current descr is not empty, it will signal an +interrupt, and halt processing. + +The tail pointer tails or trails the hardware pointer. When the +hardware is ahead, the tail pointer will be pointing at a "full" +descr. The OS will process this descr, and then mark it "not-in-use", +and advance the tail pointer. Thus, when there is flowing RX traffic, +all of the descrs in front of the tail pointer should be "full", and +all of those behind it should be "not-in-use". When RX traffic is not +flowing, then the tail pointer can catch up to the hardware pointer. +The OS will then note that the current tail is "empty", and halt +processing. + +The head pointer (somewhat mis-named) follows after the tail pointer.
+When traffic is flowing, then the head pointer will be pointing at +a "not-in-use" descr. The OS will perform various housekeeping duties +on this descr. This includes allocating a new data buffer and +dma-mapping it so as to make it visible to the hardware. The OS will +then mark the descr as "empty", ready to receive data. Thus, when there +is flowing RX traffic, everything in front of the head pointer should +be "not-in-use", and everything behind it should be "empty". If no +RX traffic is flowing, then the head pointer can catch up to the tail +pointer, at which point the OS will notice that the head descr is +"empty", and it will halt processing. + +Thus, in an idle system, the GDACTDPA, tail and head pointers will +all be pointing at the same descr, which should be "empty". All of the +other descrs in the ring should be "empty" as well. + +The show_rx_chain() routine will print out the locations of the +GDACTDPA, tail and head pointers. It will also summarize the contents +of the ring, starting at the tail pointer, and listing the status +of the descrs that follow. + +A typical example of the output, for a nearly idle system, might be + +net eth1: Total number of descrs=256 +net eth1: Chain tail located at descr=20 +net eth1: Chain head is at 20 +net eth1: HW curr desc (GDACTDPA) is at 21 +net eth1: Have 1 descrs with stat=x40800101 +net eth1: HW next desc (GDACNEXTDA) is at 22 +net eth1: Last 255 descrs with stat=xa0800000 + +In the above, the hardware has filled in one descr, number 20. Both +head and tail are pointing at 20, because it has not yet been emptied. +Meanwhile, hw is pointing at 21, which is free. + +The "Have nnn descrs" refers to the descr starting at the tail: in this +case, nnn=1 descr, starting at descr 20. The "Last nnn descrs" refers +to all of the rest of the descrs, from the last status change. The "nnn" +is a count of how many descrs have exactly the same status. + +The status x4... corresponds to "full" and status xa...
corresponds +to "empty". The actual value printed is RXCOMST_A. + +In the device driver source code, a different set of names are +used for these same concepts, so that + +"empty" == SPIDER_NET_DESCR_CARDOWNED == 0xa +"full" == SPIDER_NET_DESCR_FRAME_END == 0x4 +"not in use" == SPIDER_NET_DESCR_NOT_IN_USE == 0xf + + +The RX RAM full bug/feature +=========================== + +As long as the OS can empty out the RX buffers at a rate faster than +the hardware can fill them, there is no problem. If, for some reason, +the OS fails to empty the RX ring fast enough, the hardware GDACTDPA +pointer will catch up to the head, notice the not-empty condition, +and stop. However, RX packets may still continue arriving on the wire. +The spidernet chip can save some limited number of these in local RAM. +When this local RAM fills up, the spider chip will issue an interrupt +indicating this (GHIINT0STS will show ERRINT, and the GRMFLLINT bit +will be set in GHIINT1STS). When the RX RAM full condition occurs, +a certain bug/feature is triggered that has to be specially handled. +This section describes the special handling for this condition. + +When the OS finally has a chance to run, it will empty out the RX ring. +In particular, it will clear the descriptor on which the hardware had +stopped. However, once the hardware has decided that a certain +descriptor is invalid, it will not restart at that descriptor; instead +it will restart at the next descr. This potentially will lead to a +deadlock condition, as the tail pointer will be pointing at this descr, +which, from the OS point of view, is empty; the OS will be waiting for +this descr to be filled. However, the hardware has skipped this descr, +and is filling the next descrs. Since the OS doesn't see this, there +is a potential deadlock, with the OS waiting for one descr to fill, +while the hardware is waiting for a different set of descrs to become +empty.
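The deadlock described above can be modelled with a few lines of plain C. This is only a sketch under simplifying assumptions, not driver code: the enum and the function names are hypothetical, and the ring is reduced to an array of states with one OS tail index and one hardware index.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified descr states (hypothetical names, not the driver's). */
enum descr_state { DESCR_EMPTY, DESCR_FULL, DESCR_NOT_IN_USE };

/* The OS can only process the tail descr once the hardware has filled it. */
static bool os_can_progress(const enum descr_state *ring, size_t tail)
{
	return ring[tail] == DESCR_FULL;
}

/* The hardware can only fill the descr it points at if that descr is empty. */
static bool hw_can_progress(const enum descr_state *ring, size_t hw)
{
	return ring[hw] == DESCR_EMPTY;
}

/* Deadlock: neither side is able to make forward progress. */
static bool ring_is_deadlocked(const enum descr_state *ring, size_t n,
			       size_t tail, size_t hw)
{
	assert(tail < n && hw < n);
	return !os_can_progress(ring, tail) && !hw_can_progress(ring, hw);
}
```

In this model an idle ring (all descrs empty, tail and hardware index equal) is not deadlocked, because the hardware can still fill the current descr. A ring in which the hardware skipped the tail descr and then filled every other descr is deadlocked: the tail descr is empty, and the descr under the hardware index is full.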
+ +A call to show_rx_chain() at this point indicates the nature of the +problem. A typical print when the network is hung shows the following: + +net eth1: Spider RX RAM full, incoming packets might be discarded! +net eth1: Total number of descrs=256 +net eth1: Chain tail located at descr=255 +net eth1: Chain head is at 255 +net eth1: HW curr desc (GDACTDPA) is at 0 +net eth1: Have 1 descrs with stat=xa0800000 +net eth1: HW next desc (GDACNEXTDA) is at 1 +net eth1: Have 127 descrs with stat=x40800101 +net eth1: Have 1 descrs with stat=x40800001 +net eth1: Have 126 descrs with stat=x40800101 +net eth1: Last 1 descrs with stat=xa0800000 + +Both the tail and head pointers are pointing at descr 255, which is +marked xa... which is "empty". Thus, from the OS point of view, there +is nothing to be done. In particular, there is the implicit assumption +that everything in front of the "empty" descr must surely also be empty, +as explained in the last section. The OS is waiting for descr 255 to +become non-empty, which, in this case, will never happen. + +The HW pointer is at descr 0. This descr is marked 0x4.. or "full". +Since it's already full, the hardware can do nothing more, and thus has +halted processing. Notice that descrs 0 through 253 are all marked +"full", while descrs 254 and 255 are empty. (The "Last 1 descrs" is +descr 254, since tail was at 255.) Thus, the system is deadlocked, +and there can be no forward progress; the OS thinks there's nothing +to do, and the hardware has nowhere to put incoming data. + +This bug/feature is worked around with the spider_net_resync_head_ptr() +routine. When the driver receives RX interrupts, but an examination +of the RX chain seems to show it is empty, then it is probable that +the hardware has skipped a descr or two (sometimes dozens under heavy +network conditions). The spider_net_resync_head_ptr() subroutine will +search the ring for the next full descr, and the driver will resume +operations there.
Since this will leave "holes" in the ring, there +is also a spider_net_resync_tail_ptr() that will skip over such holes. + +As of this writing, the spider_net_resync() strategy seems to work very +well, even under heavy network loads. + + +The TX ring +=========== +The TX ring uses a low-watermark interrupt scheme to make sure that +the TX queue is appropriately serviced for large packet sizes. + +For packet sizes greater than about 1 KByte, the kernel can fill +the TX ring quicker than the device can drain it. Once the ring +is full, the netdev is stopped. When there is room in the ring, +the netdev needs to be reawakened, so that more TX packets are placed +in the ring. The hardware can empty the ring about four times per jiffy, +so it's not appropriate to wait for the poll routine to refill, since +the poll routine runs only once per jiffy. The low-watermark mechanism +marks a descr about 1/4th of the way from the bottom of the queue, so +that an interrupt is generated when the descr is processed. This +interrupt wakes up the netdev, which can then refill the queue. +For large packets, this mechanism generates a relatively small number +of interrupts, about 1K/sec. For smaller packets, this will drop to zero +interrupts, as the hardware can empty the queue faster than the kernel +can fill it. + + + ======= END OF DOCUMENT ======== + diff --git a/Documentation/oops-tracing.txt b/Documentation/oops-tracing.txt index 7d5b60dea551..7f60dfe642ca 100644 --- a/Documentation/oops-tracing.txt +++ b/Documentation/oops-tracing.txt @@ -86,6 +86,20 @@ stuff are the values reported by the Oops - you can just cut-and-paste and do a replace of spaces to "\x" - that's what I do, as I'm too lazy to write a program to automate this all). +Alternatively, you can use the shell script in scripts/decodecode.
+Its usage is: decodecode < oops.txt + +The hex bytes that follow "Code:" may (in some architectures) have a series +of bytes that precede the current instruction pointer as well as bytes at and +following the current instruction pointer. In some cases, one instruction +byte or word is surrounded by <> or (), as in "<86>" or "(f00d)". These +<> or () markings indicate the current instruction pointer. Example from +i386, split into multiple lines for readability: + +Code: f9 0f 8d f9 00 00 00 8d 42 0c e8 dd 26 11 c7 a1 60 ea 2b f9 8b 50 08 a1 +64 ea 2b f9 8d 34 82 8b 1e 85 db 74 6d 8b 15 60 ea 2b f9 <8b> 43 04 39 42 54 +7e 04 40 89 42 54 8b 43 04 3b 05 00 f6 52 c0 + Finally, if you want to see where the code comes from, you can do cd /usr/src/linux @@ -237,6 +251,8 @@ characters, each representing a particular tainted value. 7: 'U' if a user or user application specifically requested that the Tainted flag be set, ' ' otherwise. + 8: 'D' if the kernel has died recently, i.e. there was an OOPS or BUG. + The primary reason for the 'Tainted: ' string is to tell kernel debuggers if this is a clean kernel or if anything unusual has occurred. Tainting is permanent: even if an offending module is diff --git a/Documentation/pci.txt b/Documentation/pci.txt index d38261b67905..7754f5aea4e9 100644 --- a/Documentation/pci.txt +++ b/Documentation/pci.txt @@ -113,9 +113,6 @@ initialization with a pointer to a structure describing the driver (Please see Documentation/power/pci.txt for descriptions of PCI Power Management and the related functions.) - enable_wake Enable device to generate wake events from a low power - state. - shutdown Hook into reboot_notifier_list (kernel/sys.c). Intended to stop any idling DMA operations. Useful for enabling wake-on-lan (NIC) or changing @@ -299,7 +296,10 @@ If the PCI device can use the PCI Memory-Write-Invalidate transaction, call pci_set_mwi(). 
This enables the PCI_COMMAND bit for Mem-Wr-Inval and also ensures that the cache line size register is set correctly. Check the return value of pci_set_mwi() as not all architectures -or chip-sets may support Memory-Write-Invalidate. +or chip-sets may support Memory-Write-Invalidate. Alternatively, +if Mem-Wr-Inval would be nice to have but is not required, call +pci_try_set_mwi() to have the system do its best effort at enabling +Mem-Wr-Inval. 3.2 Request MMIO/IOP resources diff --git a/Documentation/power/freezing-of-tasks.txt b/Documentation/power/freezing-of-tasks.txt new file mode 100644 index 000000000000..04dc1cf9d215 --- /dev/null +++ b/Documentation/power/freezing-of-tasks.txt @@ -0,0 +1,162 @@ +Freezing of tasks + (C) 2007 Rafael J. Wysocki <rjw@sisk.pl>, GPL + +I. What is the freezing of tasks? + +The freezing of tasks is a mechanism by which user space processes and some +kernel threads are controlled during hibernation or system-wide suspend (on some +architectures). + +II. How does it work? + +There are four per-task flags used for that, PF_NOFREEZE, PF_FROZEN, TIF_FREEZE +and PF_FREEZER_SKIP (the last one is auxiliary). The tasks that have +PF_NOFREEZE unset (all user space processes and some kernel threads) are +regarded as 'freezable' and treated in a special way before the system enters a +suspend state as well as before a hibernation image is created (in what follows +we only consider hibernation, but the description also applies to suspend). + +Namely, as the first step of the hibernation procedure the function +freeze_processes() (defined in kernel/power/process.c) is called. It executes +try_to_freeze_tasks() that sets TIF_FREEZE for all of the freezable tasks and +sends a fake signal to each of them. 
A task that receives such a signal and has +TIF_FREEZE set, should react to it by calling the refrigerator() function +(defined in kernel/power/process.c), which sets the task's PF_FROZEN flag, +changes its state to TASK_UNINTERRUPTIBLE and makes it loop until PF_FROZEN is +cleared for it. Then, we say that the task is 'frozen' and therefore the set of +functions handling this mechanism is called 'the freezer' (these functions are +defined in kernel/power/process.c and include/linux/freezer.h). User space +processes are generally frozen before kernel threads. + +It is not recommended to call refrigerator() directly. Instead, it is +recommended to use the try_to_freeze() function (defined in +include/linux/freezer.h), that checks the task's TIF_FREEZE flag and makes the +task enter refrigerator() if the flag is set. + +For user space processes try_to_freeze() is called automatically from the +signal-handling code, but the freezable kernel threads need to call it +explicitly in suitable places. The code to do this may look like the following: + + do { + hub_events(); + wait_event_interruptible(khubd_wait, + !list_empty(&hub_event_list)); + try_to_freeze(); + } while (!signal_pending(current)); + +(from drivers/usb/core/hub.c::hub_thread()). + +If a freezable kernel thread fails to call try_to_freeze() after the freezer has +set TIF_FREEZE for it, the freezing of tasks will fail and the entire +hibernation operation will be cancelled. For this reason, freezable kernel +threads must call try_to_freeze() somewhere. + +After the system memory state has been restored from a hibernation image and +devices have been reinitialized, the function thaw_processes() is called in +order to clear the PF_FROZEN flag for each frozen task. Then, the tasks that +have been frozen leave refrigerator() and continue running. + +III. Which kernel threads are freezable? + +Kernel threads are not freezable by default. 
However, a kernel thread may clear +PF_NOFREEZE for itself by calling set_freezable() (the resetting of PF_NOFREEZE +directly is strongly discouraged). From this point it is regarded as freezable +and must call try_to_freeze() in a suitable place. + +IV. Why do we do that? + +Generally speaking, there are a few reasons to use the freezing of tasks: + +1. The principal reason is to prevent filesystems from being damaged after +hibernation. At the moment we have no simple means of checkpointing +filesystems, so if there are any modifications made to filesystem data and/or +metadata on disks, we cannot bring them back to the state from before the +modifications. At the same time each hibernation image contains some +filesystem-related information that must be consistent with the state of the +on-disk data and metadata after the system memory state has been restored from +the image (otherwise the filesystems will be damaged in a nasty way, usually +making them almost impossible to repair). We therefore freeze tasks that might +cause the on-disk filesystems' data and metadata to be modified after the +hibernation image has been created and before the system is finally powered off. +The majority of these are user space processes, but if any of the kernel threads +may cause something like this to happen, they have to be freezable. + +2. The second reason is to prevent user space processes and some kernel threads +from interfering with the suspending and resuming of devices. A user space +process running on a second CPU while we are suspending devices may, for +example, be troublesome and without the freezing of tasks we would need some +safeguards against race conditions that might occur in such a case. + +Although Linus Torvalds doesn't like the freezing of tasks, he said this in one +of the discussions on LKML (http://lkml.org/lkml/2007/4/27/608): + +"RJW:> Why we freeze tasks at all or why we freeze kernel threads? + +Linus: In many ways, 'at all'.
+ +I _do_ realize the IO request queue issues, and that we cannot actually do +s2ram with some devices in the middle of a DMA. So we want to be able to +avoid *that*, there's no question about that. And I suspect that stopping +user threads and then waiting for a sync is practically one of the easier +ways to do so. + +So in practice, the 'at all' may become a 'why freeze kernel threads?' and +freezing user threads I don't find really objectionable." + +Still, there are kernel threads that may want to be freezable. For example, if +a kernel thread that belongs to a device driver accesses the device directly, it in +principle needs to know when the device is suspended, so that it doesn't try to +access it at that time. However, if the kernel thread is freezable, it will be +frozen before the driver's .suspend() callback is executed and it will be +thawed after the driver's .resume() callback has run, so it won't be accessing +the device while it's suspended. + +3. Another reason for freezing tasks is to prevent user space processes from +realizing that a hibernation (or suspend) operation is taking place. Ideally, user +space processes should not notice that such a system-wide operation has occurred +and should continue running without any problems after the restore (or resume +from suspend). Unfortunately, in the most general case this is quite difficult +to achieve without the freezing of tasks. Consider, for example, a process +that depends on all CPUs being online while it's running. Since we need to +disable nonboot CPUs during the hibernation, if this process is not frozen, it +may notice that the number of CPUs has changed and may start to work incorrectly +because of that. + +V. Are there any problems related to the freezing of tasks? + +Yes, there are. + +First of all, the freezing of kernel threads may be tricky if they depend one +on another.
For example, if kernel thread A waits for a completion (in the +TASK_UNINTERRUPTIBLE state) that needs to be done by freezable kernel thread B +and B is frozen in the meantime, then A will be blocked until B is thawed, which +may be undesirable. That's why kernel threads are not freezable by default. + +Second, there are the following two problems related to the freezing of user +space processes: +1. Putting processes into an uninterruptible sleep distorts the load average. +2. Now that we have FUSE, plus the framework for doing device drivers in +userspace, it gets even more complicated because some userspace processes are +now doing the sorts of things that kernel threads do +(https://lists.linux-foundation.org/pipermail/linux-pm/2007-May/012309.html). + +Problem 1 seems to be fixable, although it hasn't been fixed so far. The +other one is more serious, but it seems that we can work around it by using +hibernation (and suspend) notifiers (in that case, though, we won't be able to +avoid the realization by the user space processes that the hibernation is taking +place). + +There are also problems that the freezing of tasks tends to expose, although +they are not directly related to it. For example, if request_firmware() is +called from a device driver's .resume() routine, it will time out and eventually +fail, because the user land process that should respond to the request is frozen +at this point. So, seemingly, the failure is due to the freezing of tasks. +Suppose, however, that the firmware file is located on a filesystem accessible +only through another device that hasn't been resumed yet. In that case, +request_firmware() will fail regardless of whether or not the freezing of tasks +is used. Consequently, the problem is not really related to the freezing of +tasks, since it generally exists anyway. + +A driver must have all the firmware it may need in RAM before suspend() is called.
+If keeping them is not practical, for example due to their size, they must be +requested early enough using the suspend notifier API described in notifiers.txt. diff --git a/Documentation/power/kernel_threads.txt b/Documentation/power/kernel_threads.txt deleted file mode 100644 index fb57784986b1..000000000000 --- a/Documentation/power/kernel_threads.txt +++ /dev/null @@ -1,40 +0,0 @@ -KERNEL THREADS - - -Freezer - -Upon entering a suspended state the system will freeze all -tasks. This is done by delivering pseudosignals. This affects -kernel threads, too. To successfully freeze a kernel thread -the thread has to check for the pseudosignal and enter the -refrigerator. Code to do this looks like this: - - do { - hub_events(); - wait_event_interruptible(khubd_wait, !list_empty(&hub_event_list)); - try_to_freeze(); - } while (!signal_pending(current)); - -from drivers/usb/core/hub.c::hub_thread() - - -The Unfreezable - -Some kernel threads however, must not be frozen. The kernel must -be able to finish pending IO operations and later on be able to -write the memory image to disk. Kernel threads needed to do IO -must stay awake. Such threads must mark themselves unfreezable -like this: - - /* - * This thread doesn't need any user-level access, - * so get rid of all our resources. - */ - daemonize("usb-storage"); - - current->flags |= PF_NOFREEZE; - -from drivers/usb/storage/usb.c::usb_stor_control_thread() - -Such drivers are themselves responsible for staying quiet during -the actual snapshotting. diff --git a/Documentation/power/notifiers.txt b/Documentation/power/notifiers.txt new file mode 100644 index 000000000000..9293e4bc857c --- /dev/null +++ b/Documentation/power/notifiers.txt @@ -0,0 +1,50 @@ +Suspend notifiers + (C) 2007 Rafael J. Wysocki <rjw@sisk.pl>, GPL + +There are some operations that device drivers may want to carry out in their +.suspend() routines, but shouldn't, because they can cause the hibernation or +suspend to fail. 
For example, a driver may want to allocate a substantial amount +of memory (like 50 MB) in .suspend(), but that shouldn't be done after +swsusp's memory shrinker has run. + +Also, there may be some operations that subsystems want to carry out before a +hibernation/suspend or after a restore/resume, requiring the system to be fully +functional, so the drivers' .suspend() and .resume() routines are not suitable +for this purpose. For example, device drivers may want to upload firmware to +their devices after a restore from a hibernation image, but they cannot do it by +calling request_firmware() from their .resume() routines (user land processes +are frozen at this point). The solution may be to load the firmware into +memory before processes are frozen and upload it from there in the .resume() +routine. Of course, a hibernation notifier may be used for this purpose. + +The subsystems that have such needs can register suspend notifiers that will be +called upon the following events by the suspend core: + +PM_HIBERNATION_PREPARE The system is going to hibernate or suspend, tasks will + be frozen immediately. + +PM_POST_HIBERNATION The system memory state has been restored from a + hibernation image or an error occurred during the + hibernation. Device drivers' .resume() callbacks have + been executed and tasks have been thawed. + +PM_SUSPEND_PREPARE The system is preparing for a suspend. + +PM_POST_SUSPEND The system has just resumed or an error occurred during + the suspend. Device drivers' .resume() callbacks have + been executed and tasks have been thawed. + +It is generally assumed that whatever the notifiers do for +PM_HIBERNATION_PREPARE, should be undone for PM_POST_HIBERNATION. Analogously, +operations performed for PM_SUSPEND_PREPARE should be reversed for +PM_POST_SUSPEND.
Additionally, all of the notifiers are called for +PM_POST_HIBERNATION if one of them fails for PM_HIBERNATION_PREPARE, and +all of the notifiers are called for PM_POST_SUSPEND if one of them fails for +PM_SUSPEND_PREPARE. + +The hibernation and suspend notifiers are called with pm_mutex held. They are +defined in the usual way, but their last argument is meaningless (it is always +NULL). To register and/or unregister a suspend notifier use the functions +register_pm_notifier() and unregister_pm_notifier(), respectively, defined in +include/linux/suspend.h . If you don't need to unregister the notifier, you can +also use the pm_notifier() macro defined in include/linux/suspend.h . diff --git a/Documentation/power/pci.txt b/Documentation/power/pci.txt index e00b099a4b86..dd8fe43888d3 100644 --- a/Documentation/power/pci.txt +++ b/Documentation/power/pci.txt @@ -164,7 +164,6 @@ struct pci_driver: int (*suspend) (struct pci_dev *dev, pm_message_t state); int (*resume) (struct pci_dev *dev); - int (*enable_wake) (struct pci_dev *dev, pci_power_t state, int enable); suspend @@ -251,42 +250,6 @@ The driver should update the current_state field in its pci_dev structure in this function, except for PM-capable devices when pci_set_power_state is used. -enable_wake ------------ - -Usage: - -if (dev->driver && dev->driver->enable_wake) - dev->driver->enable_wake(dev,state,enable); - -This callback is generally only relevant for devices that support the PCI PM -spec and have the ability to generate a PME# (Power Management Event Signal) -to wake the system up. (However, it is possible that a device may support -some non-standard way of generating a wake event on sleep.) 
- -Bits 15:11 of the PMC (Power Mgmt Capabilities) Register in a device's -PM Capabilities describe what power states the device supports generating a -wake event from: - -+------------------+ -| Bit | State | -+------------------+ -| 11 | D0 | -| 12 | D1 | -| 13 | D2 | -| 14 | D3hot | -| 15 | D3cold | -+------------------+ - -A device can use this to enable wake events: - - pci_enable_wake(dev,state,enable); - -Note that to enable PME# from D3cold, a value of 4 should be passed to -pci_enable_wake (since it uses an index into a bitmask). If a driver gets -a request to enable wake events from D3, two calls should be made to -pci_enable_wake (one for both D3hot and D3cold). - A reference implementation ------------------------- diff --git a/Documentation/power/swsusp.txt b/Documentation/power/swsusp.txt index 5b8d6953f05e..aea7e9209667 100644 --- a/Documentation/power/swsusp.txt +++ b/Documentation/power/swsusp.txt @@ -140,21 +140,11 @@ should be sent to the mailing list available through the suspend2 website, and not to the Linux Kernel Mailing List. We are working toward merging suspend2 into the mainline kernel. -Q: A kernel thread must voluntarily freeze itself (call 'refrigerator'). -I found some kernel threads that don't do it, and they don't freeze -so the system can't sleep. Is this a known behavior? - -A: All such kernel threads need to be fixed, one by one. Select the -place where the thread is safe to be frozen (no kernel semaphores -should be held at that point and it must be safe to sleep there), and -add: - - try_to_freeze(); - -If the thread is needed for writing the image to storage, you should -instead set the PF_NOFREEZE process flag when creating the thread (and -be very careful). +Q: What is the freezing of tasks and why are we using it? +A: The freezing of tasks is a mechanism by which user space processes and some +kernel threads are controlled during hibernation or system-wide suspend (on some +architectures). 
See freezing-of-tasks.txt for details. Q: What is the difference between "platform" and "shutdown"? @@ -393,6 +383,9 @@ safest thing is to unmount all filesystems on removable media (such USB, Firewire, CompactFlash, MMC, external SATA, or even IDE hotplug bays) before suspending; then remount them after resuming. +There is a work-around for this problem. For more information, see +Documentation/usb/persist.txt. + Q: I upgraded the kernel from 2.6.15 to 2.6.16. Both kernels were compiled with the similar configuration files. Anyway I found that suspend to disk (and resume) is much slower on 2.6.16 compared to diff --git a/Documentation/power_supply_class.txt b/Documentation/power_supply_class.txt new file mode 100644 index 000000000000..9758cf433c06 --- /dev/null +++ b/Documentation/power_supply_class.txt @@ -0,0 +1,167 @@ +Linux power supply class +======================== + +Synopsis +~~~~~~~~ +Power supply class used to represent battery, UPS, AC or DC power supply +properties to user-space. + +It defines core set of attributes, which should be applicable to (almost) +every power supply out there. Attributes are available via sysfs and uevent +interfaces. + +Each attribute has well defined meaning, up to unit of measure used. While +the attributes provided are believed to be universally applicable to any +power supply, specific monitoring hardware may not be able to provide them +all, so any of them may be skipped. + +Power supply class is extensible, and allows to define drivers own attributes. +The core attribute set is subject to the standard Linux evolution (i.e. +if it will be found that some attribute is applicable to many power supply +types or their drivers, it can be added to the core set). + +It also integrates with LED framework, for the purpose of providing +typically expected feedback of battery charging/fully charged status and +AC/USB power supply online status. 
(Note that specific details of the +indication (including whether to use it at all) are fully controllable by +user and/or specific machine defaults, per design principles of LED +framework). + + +Attributes/properties +~~~~~~~~~~~~~~~~~~~~~ +Power supply class has predefined set of attributes, this eliminates code +duplication across drivers. Power supply class insist on reusing its +predefined attributes *and* their units. + +So, userspace gets predictable set of attributes and their units for any +kind of power supply, and can process/present them to a user in consistent +manner. Results for different power supplies and machines are also directly +comparable. + +See drivers/power/ds2760_battery.c and drivers/power/pda_power.c for the +example how to declare and handle attributes. + + +Units +~~~~~ +Quoting include/linux/power_supply.h: + + All voltages, currents, charges, energies, time and temperatures in µV, + µA, µAh, µWh, seconds and tenths of degree Celsius unless otherwise + stated. It's driver's job to convert its raw values to units in which + this class operates. + + +Attributes/properties detailed +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +~ ~ ~ ~ ~ ~ ~ Charge/Energy/Capacity - how to not confuse ~ ~ ~ ~ ~ ~ ~ +~ ~ +~ Because both "charge" (µAh) and "energy" (µWh) represents "capacity" ~ +~ of battery, this class distinguish these terms. Don't mix them! ~ +~ ~ +~ CHARGE_* attributes represents capacity in µAh only. ~ +~ ENERGY_* attributes represents capacity in µWh only. ~ +~ CAPACITY attribute represents capacity in *percents*, from 0 to 100. ~ +~ ~ +~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ + +Postfixes: +_AVG - *hardware* averaged value, use it if your hardware is really able to +report averaged values. +_NOW - momentary/instantaneous values. + +STATUS - this attribute represents operating status (charging, full, +discharging (i.e. powering a load), etc.). 
This corresponds to +BATTERY_STATUS_* values, as defined in battery.h. + +HEALTH - represents health of the battery, values corresponds to +POWER_SUPPLY_HEALTH_*, defined in battery.h. + +VOLTAGE_MAX_DESIGN, VOLTAGE_MIN_DESIGN - design values for maximal and +minimal power supply voltages. Maximal/minimal means values of voltages +when battery considered "full"/"empty" at normal conditions. Yes, there is +no direct relation between voltage and battery capacity, but some dumb +batteries use voltage for very approximated calculation of capacity. +Battery driver also can use this attribute just to inform userspace +about maximal and minimal voltage thresholds of a given battery. + +CHARGE_FULL_DESIGN, CHARGE_EMPTY_DESIGN - design charge values, when +battery considered full/empty. + +ENERGY_FULL_DESIGN, ENERGY_EMPTY_DESIGN - same as above but for energy. + +CHARGE_FULL, CHARGE_EMPTY - These attributes means "last remembered value +of charge when battery became full/empty". It also could mean "value of +charge when battery considered full/empty at given conditions (temperature, +age)". I.e. these attributes represents real thresholds, not design values. + +ENERGY_FULL, ENERGY_EMPTY - same as above but for energy. + +CAPACITY - capacity in percents. +CAPACITY_LEVEL - capacity level. This corresponds to +POWER_SUPPLY_CAPACITY_LEVEL_*. + +TEMP - temperature of the power supply. +TEMP_AMBIENT - ambient temperature. + +TIME_TO_EMPTY - seconds left for battery to be considered empty (i.e. +while battery powers a load) +TIME_TO_FULL - seconds left for battery to be considered full (i.e. +while battery is charging) + + +Battery <-> external power supply interaction +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Often power supplies are acting as supplies and supplicants at the same +time. Batteries are good example. So, batteries usually care if they're +externally powered or not. + +For that case, power supply class implements notification mechanism for +batteries. 
+ +External power supply (AC) lists supplicants (batteries) names in +"supplied_to" struct member, and each power_supply_changed() call +issued by external power supply will notify supplicants via +external_power_changed callback. + + +QA +~~ +Q: Where is POWER_SUPPLY_PROP_XYZ attribute? +A: If you cannot find attribute suitable for your driver needs, feel free + to add it and send patch along with your driver. + + The attributes available currently are the ones currently provided by the + drivers written. + + Good candidates to add in future: model/part#, cycle_time, manufacturer, + etc. + + +Q: I have some very specific attribute (e.g. battery color), should I add + this attribute to standard ones? +A: Most likely, no. Such attribute can be placed in the driver itself, if + it is useful. Of course, if the attribute in question applicable to + large set of batteries, provided by many drivers, and/or comes from + some general battery specification/standard, it may be a candidate to + be added to the core attribute set. + + +Q: Suppose, my battery monitoring chip/firmware does not provides capacity + in percents, but provides charge_{now,full,empty}. Should I calculate + percentage capacity manually, inside the driver, and register CAPACITY + attribute? The same question about time_to_empty/time_to_full. +A: Most likely, no. This class is designed to export properties which are + directly measurable by the specific hardware available. + + Inferring not available properties using some heuristics or mathematical + model is not subject of work for a battery driver. Such functionality + should be factored out, and in fact, apm_power, the driver to serve + legacy APM API on top of power supply class, uses a simple heuristic of + approximating remaining battery capacity based on its charge, current, + voltage and so on. 
But full-fledged battery model is likely not subject + for kernel at all, as it would require floating point calculation to deal + with things like differential equations and Kalman filters. This is + better be handled by batteryd/libbattery, yet to be written. diff --git a/Documentation/powerpc/booting-without-of.txt b/Documentation/powerpc/booting-without-of.txt index b49ce169a63a..76733a3962f0 100644 --- a/Documentation/powerpc/booting-without-of.txt +++ b/Documentation/powerpc/booting-without-of.txt @@ -1,7 +1,6 @@ Booting the Linux/ppc kernel without Open Firmware -------------------------------------------------- - (c) 2005 Benjamin Herrenschmidt <benh at kernel.crashing.org>, IBM Corp. (c) 2005 Becky Bruce <becky.bruce at freescale.com>, @@ -9,6 +8,63 @@ (c) 2006 MontaVista Software, Inc. Flash chip node definition +Table of Contents +================= + + I - Introduction + 1) Entry point for arch/powerpc + 2) Board support + + II - The DT block format + 1) Header + 2) Device tree generalities + 3) Device tree "structure" block + 4) Device tree "strings" block + + III - Required content of the device tree + 1) Note about cells and address representation + 2) Note about "compatible" properties + 3) Note about "name" properties + 4) Note about node and property names and character set + 5) Required nodes and properties + a) The root node + b) The /cpus node + c) The /cpus/* nodes + d) the /memory node(s) + e) The /chosen node + f) the /soc<SOCname> node + + IV - "dtc", the device tree compiler + + V - Recommendations for a bootloader + + VI - System-on-a-chip devices and nodes + 1) Defining child nodes of an SOC + 2) Representing devices without a current OF specification + a) MDIO IO device + b) Gianfar-compatible ethernet nodes + c) PHY nodes + d) Interrupt controllers + e) I2C + f) Freescale SOC USB controllers + g) Freescale SOC SEC Security Engines + h) Board Control and Status (BCSR) + i) Freescale QUICC Engine module (QE) + j) Flash chip nodes + k) 
Global Utilities Block + + VII - Specifying interrupt information for devices + 1) interrupts property + 2) interrupt-parent property + 3) OpenPIC Interrupt Controllers + 4) ISA Interrupt Controllers + + Appendix A - Sample SOC node for MPC8540 + + +Revision Information +==================== + May 18, 2005: Rev 0.1 - Initial draft, no chapter III yet. May 19, 2005: Rev 0.2 - Add chapter III and bits & pieces here or @@ -571,6 +627,14 @@ So the node content can be summarized as a start token, a full path, a list of properties, a list of child nodes, and an end token. Every child node is a full node structure itself as defined above. +NOTE: The above definition requires that all property definitions for +a particular node MUST precede any subnode definitions for that node. +Although the structure would not be ambiguous if properties and +subnodes were intermingled, the kernel parser requires that the +properties come first (up until at least 2.6.22). Any tools +manipulating a flattened tree must take care to preserve this +constraint. + 4) Device tree "strings" block In order to save space, property names, which are generally redundant, @@ -1186,6 +1250,12 @@ platforms are moved over to use the flattened-device-tree model. network device. This is used by the bootwrapper to interpret MAC addresses passed by the firmware when no information other than indices is available to associate an address with a device. + - phy-connection-type : a string naming the controller/PHY interface type, + i.e., "mii" (default), "rmii", "gmii", "rgmii", "rgmii-id", "sgmii", + "tbi", or "rtbi". This property is only really needed if the connection + is of type "rgmii-id", as all other connection types are detected by + hardware. + Example: @@ -1687,7 +1757,7 @@ platforms are moved over to use the flattened-device-tree model. }; }; - g) Flash chip nodes + j) Flash chip nodes Flash chips (Memory Technology Devices) are often used for solid state file systems on embedded devices. 
@@ -1727,6 +1797,33 @@ platforms are moved over to use the flattened-device-tree model. partition-names = "fs\0firmware"; }; + k) Global Utilities Block + + The global utilities block controls power management, I/O device + enabling, power-on-reset configuration monitoring, general-purpose + I/O signal configuration, alternate function selection for multiplexed + signals, and clock control. + + Required properties: + + - compatible : Should define the compatible device type for + global-utilities. + - reg : Offset and length of the register set for the device. + + Recommended properties: + + - fsl,has-rstcr : Indicates that the global utilities register set + contains a functioning "reset control register" (i.e. the board + is wired to reset upon setting the HRESET_REQ bit in this register). + + Example: + + global-utilities@e0000 { /* global utilities block */ + compatible = "fsl,mpc8548-guts"; + reg = <e0000 1000>; + fsl,has-rstcr; + }; + More devices will be defined as this spec matures. VII - Specifying interrupt information for devices diff --git a/Documentation/rtc.txt b/Documentation/rtc.txt index 7c701b88d6d5..c931d613f641 100644 --- a/Documentation/rtc.txt +++ b/Documentation/rtc.txt @@ -385,7 +385,7 @@ test_PIE: /* not all RTCs support periodic IRQs */ if (errno == ENOTTY) { fprintf(stderr, "\nNo periodic IRQ support\n"); - return 0; + goto done; } perror("RTC_IRQP_READ ioctl"); exit(errno); diff --git a/Documentation/sched-design-CFS.txt b/Documentation/sched-design-CFS.txt new file mode 100644 index 000000000000..16feebb7bdc0 --- /dev/null +++ b/Documentation/sched-design-CFS.txt @@ -0,0 +1,119 @@ + +This is the CFS scheduler. + +80% of CFS's design can be summed up in a single sentence: CFS basically +models an "ideal, precise multi-tasking CPU" on real hardware. + +"Ideal multi-tasking CPU" is a (non-existent :-)) CPU that has 100% +physical power and which can run each task at precise equal speed, in +parallel, each at 1/nr_running speed. 
For example: if there are 2 tasks +running then it runs each at 50% physical power - totally in parallel. + +On real hardware, we can run only a single task at once, so while that +one task runs, the other tasks that are waiting for the CPU are at a +disadvantage - the current task gets an unfair amount of CPU time. In +CFS this fairness imbalance is expressed and tracked via the per-task +p->wait_runtime (nanosec-unit) value. "wait_runtime" is the amount of +time the task should now run on the CPU for it to become completely fair +and balanced. + +( small detail: on 'ideal' hardware, the p->wait_runtime value would + always be zero - no task would ever get 'out of balance' from the + 'ideal' share of CPU time. ) + +CFS's task picking logic is based on this p->wait_runtime value and it +is thus very simple: it always tries to run the task with the largest +p->wait_runtime value. In other words, CFS tries to run the task with +the 'gravest need' for more CPU time. So CFS always tries to split up +CPU time between runnable tasks as close to 'ideal multitasking +hardware' as possible. + +Most of the rest of CFS's design just falls out of this really simple +concept, with a few add-on embellishments like nice levels, +multiprocessing and various algorithm variants to recognize sleepers. + +In practice it works like this: the system runs a task a bit, and when +the task schedules (or a scheduler tick happens) the task's CPU usage is +'accounted for': the (small) time it just spent using the physical CPU +is deducted from p->wait_runtime. [minus the 'fair share' it would have +gotten anyway]. Once p->wait_runtime gets low enough so that another +task becomes the 'leftmost task' of the time-ordered rbtree it maintains +(plus a small amount of 'granularity' distance relative to the leftmost +task so that we do not over-schedule tasks and trash the cache) then the +new leftmost task is picked and the current task is preempted. 
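The accounting and picking steps described above can be sketched in a few lines of C. This is only an illustrative toy, not the code in sched_fair.c: the struct and function names (toy_task, toy_account, toy_pick_next) are hypothetical, a linear scan stands in for the time-ordered rbtree, and the 'granularity' distance is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified task: only the field the text describes. */
struct toy_task {
	const char *name;
	int64_t wait_runtime;	/* nanoseconds of CPU time the task is "owed" */
};

/*
 * Charge the running task for delta_ns of CPU time, minus the fair share
 * (delta_ns / nr_running) it would have received anyway.
 */
static void toy_account(struct toy_task *curr, int64_t delta_ns, int nr_running)
{
	curr->wait_runtime -= delta_ns - delta_ns / nr_running;
}

/*
 * Pick the task with the largest wait_runtime. A linear scan is used for
 * clarity only; the text describes a time-ordered rbtree instead.
 */
static struct toy_task *toy_pick_next(struct toy_task *tasks, size_t n)
{
	struct toy_task *best = NULL;
	size_t i;

	for (i = 0; i < n; i++)
		if (!best || tasks[i].wait_runtime > best->wait_runtime)
			best = &tasks[i];
	return best;
}
```

With two runnable tasks starting at zero, letting the first run for 10 ms leaves it 5 ms "in debt", so the second task becomes the one with the gravest need and is picked next.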
+ +The rq->fair_clock value tracks the 'CPU time a runnable task would have +fairly gotten, had it been runnable during that time'. So by using +rq->fair_clock values we can accurately timestamp and measure the +'expected CPU time' a task should have gotten. All runnable tasks are +sorted in the rbtree by the "rq->fair_clock - p->wait_runtime" key, and +CFS picks the 'leftmost' task and sticks to it. As the system progresses +forwards, newly woken tasks are put into the tree more and more to the +right - slowly but surely giving a chance for every task to become the +'leftmost task' and thus get on the CPU within a deterministic amount of +time. + +Some implementation details: + + - the introduction of Scheduling Classes: an extensible hierarchy of + scheduler modules. These modules encapsulate scheduling policy + details and are handled by the scheduler core without the core + code assuming about them too much. + + - sched_fair.c implements the 'CFS desktop scheduler': it is a + replacement for the vanilla scheduler's SCHED_OTHER interactivity + code. + + I'd like to give credit to Con Kolivas for the general approach here: + he has proven via RSDL/SD that 'fair scheduling' is possible and that + it results in better desktop scheduling. Kudos Con! + + The CFS patch uses a completely different approach and implementation + from RSDL/SD. My goal was to make CFS's interactivity quality exceed + that of RSDL/SD, which is a high standard to meet :-) Testing + feedback is welcome to decide this one way or another. [ and, in any + case, all of SD's logic could be added via a kernel/sched_sd.c module + as well, if Con is interested in such an approach. ] + + CFS's design is quite radical: it does not use runqueues, it uses a + time-ordered rbtree to build a 'timeline' of future task execution, + and thus has no 'array switch' artifacts (by which both the vanilla + scheduler and RSDL/SD are affected). 
+ + CFS uses nanosecond granularity accounting and does not rely on any + jiffies or other HZ detail. Thus the CFS scheduler has no notion of + 'timeslices' and has no heuristics whatsoever. There is only one + central tunable: + + /proc/sys/kernel/sched_granularity_ns + + which can be used to tune the scheduler from 'desktop' (low + latencies) to 'server' (good batching) workloads. It defaults to a + setting suitable for desktop workloads. SCHED_BATCH is handled by the + CFS scheduler module too. + + Due to its design, the CFS scheduler is not prone to any of the + 'attacks' that exist today against the heuristics of the stock + scheduler: fiftyp.c, thud.c, chew.c, ring-test.c, massive_intr.c all + work fine and do not impact interactivity and produce the expected + behavior. + + The CFS scheduler has a much stronger handling of nice levels and + SCHED_BATCH: both types of workloads should be isolated much more + aggressively than under the vanilla scheduler. + + ( another detail: due to nanosec accounting and timeline sorting, + sched_yield() support is very simple under CFS, and in fact under + CFS sched_yield() behaves much better than under any other + scheduler I have tested so far. ) + + - sched_rt.c implements SCHED_FIFO and SCHED_RR semantics, in a simpler + way than the vanilla scheduler does. It uses 100 runqueues (for all + 100 RT priority levels, instead of 140 in the vanilla scheduler) + and it needs no expired array. + + - reworked/sanitized SMP load-balancing: the runqueue-walking + assumptions are gone from the load-balancing code now, and + iterators of the scheduling modules are used. The balancing code got + quite a bit simpler as a result. 
+ diff --git a/Documentation/scsi/aacraid.txt b/Documentation/scsi/aacraid.txt index ce3cb42507bd..cc12b55d4b3d 100644 --- a/Documentation/scsi/aacraid.txt +++ b/Documentation/scsi/aacraid.txt @@ -50,6 +50,9 @@ Supported Cards/Chipsets 9005:0285:9005:02be Adaptec 31605 (Marauder160) 9005:0285:9005:02c3 Adaptec 51205 (Voodoo120) 9005:0285:9005:02c4 Adaptec 51605 (Voodoo160) + 9005:0285:9005:02ce Adaptec 51245 (Voodoo124) + 9005:0285:9005:02cf Adaptec 51645 (Voodoo164) + 9005:0285:9005:02d0 Adaptec 52445 (Voodoo244) 1011:0046:9005:0364 Adaptec 5400S (Mustang) 9005:0287:9005:0800 Adaptec Themisto (Jupiter) 9005:0200:9005:0200 Adaptec Themisto (Jupiter) diff --git a/Documentation/scsi/scsi_fc_transport.txt b/Documentation/scsi/scsi_fc_transport.txt new file mode 100644 index 000000000000..d403e46d8463 --- /dev/null +++ b/Documentation/scsi/scsi_fc_transport.txt @@ -0,0 +1,450 @@ + SCSI FC Tansport + ============================================= + +Date: 4/12/2007 +Kernel Revisions for features: + rports : <<TBS>> + vports : 2.6.22 (? TBD) + + +Introduction +============ +This file documents the features and components of the SCSI FC Transport. +It also provides documents the API between the transport and FC LLDDs. +The FC transport can be found at: + drivers/scsi/scsi_transport_fc.c + include/scsi/scsi_transport_fc.h + include/scsi/scsi_netlink_fc.h + +This file is found at Documentation/scsi/scsi_fc_transport.txt + + +FC Remote Ports (rports) +======================================================================== +<< To Be Supplied >> + + +FC Virtual Ports (vports) +======================================================================== + +Overview: +------------------------------- + + New FC standards have defined mechanisms which allows for a single physical + port to appear on as multiple communication ports. Using the N_Port Id + Virtualization (NPIV) mechanism, a point-to-point connection to a Fabric + can be assigned more than 1 N_Port_ID. 
Each N_Port_ID appears as a + separate port to other endpoints on the fabric, even though it shares one + physical link to the switch for communication. Each N_Port_ID can have a + unique view of the fabric based on fabric zoning and array lun-masking + (just like a normal non-NPIV adapter). Using the Virtual Fabric (VF) + mechanism, adding a fabric header to each frame allows the port to + interact with the Fabric Port to join multiple fabrics. The port will + obtain an N_Port_ID on each fabric it joins. Each fabric will have its + own unique view of endpoints and configuration parameters. NPIV may be + used together with VF so that the port can obtain multiple N_Port_IDs + on each virtual fabric. + + The FC transport is now recognizing a new object - a vport. A vport is + an entity that has a world-wide unique World Wide Port Name (wwpn) and + World Wide Node Name (wwnn). The transport also allows for the FC4's to + be specified for the vport, with FCP_Initiator being the primary role + expected. Once instantiated by one of the above methods, it will have a + distinct N_Port_ID and view of fabric endpoints and storage entities. + The fc_host associated with the physical adapter will export the ability + to create vports. The transport will create the vport object within the + Linux device tree, and instruct the fc_host's driver to instantiate the + virtual port. Typically, the driver will create a new scsi_host instance + on the vport, resulting in a unique <H,C,T,L> namespace for the vport. + Thus, whether a FC port is based on a physical port or on a virtual port, + each will appear as a unique scsi_host with its own target and lun space. + + Note: At this time, the transport is written to create only NPIV-based + vports. However, consideration was given to VF-based vports and it + should be a minor change to add support if needed. The remaining + discussion will concentrate on NPIV. 
+ + Note: World Wide Name assignment (and uniqueness guarantees) are left + up to an administrative entity controlling the vport. For example, + if vports are to be associated with virtual machines, a XEN mgmt + utility would be responsible for creating wwpn/wwnn's for the vport, + using its own naming authority and OUI. (Note: it already does this + for virtual MAC addresses). + + +Device Trees and Vport Objects: +------------------------------- + + Today, the device tree typically contains the scsi_host object, + with rports and scsi target objects underneath it. Currently the FC + transport creates the vport object and places it under the scsi_host + object corresponding to the physical adapter. The LLDD will allocate + a new scsi_host for the vport and link its object under the vport. + The remainder of the tree under the vport's scsi_host is the same + as the non-NPIV case. The transport is written currently to easily + allow the parent of the vport to be something other than the scsi_host. + This could be used in the future to link the object onto a vm-specific + device tree. If the vport's parent is not the physical port's scsi_host, + a symbolic link to the vport object will be placed in the physical + port's scsi_host. 
+ + Here's what to expect in the device tree : + The typical Physical Port's Scsi_Host: + /sys/devices/.../host17/ + and it has the typical decendent tree: + /sys/devices/.../host17/rport-17:0-0/target17:0:0/17:0:0:0: + and then the vport is created on the Physical Port: + /sys/devices/.../host17/vport-17:0-0 + and the vport's Scsi_Host is then created: + /sys/devices/.../host17/vport-17:0-0/host18 + and then the rest of the tree progresses, such as: + /sys/devices/.../host17/vport-17:0-0/host18/rport-18:0-0/target18:0:0/18:0:0:0: + + Here's what to expect in the sysfs tree : + scsi_hosts: + /sys/class/scsi_host/host17 physical port's scsi_host + /sys/class/scsi_host/host18 vport's scsi_host + fc_hosts: + /sys/class/fc_host/host17 physical port's fc_host + /sys/class/fc_host/host18 vport's fc_host + fc_vports: + /sys/class/fc_vports/vport-17:0-0 the vport's fc_vport + fc_rports: + /sys/class/fc_remote_ports/rport-17:0-0 rport on the physical port + /sys/class/fc_remote_ports/rport-18:0-0 rport on the vport + + +Vport Attributes: +------------------------------- + + The new fc_vport class object has the following attributes + + node_name: Read_Only + The WWNN of the vport + + port_name: Read_Only + The WWPN of the vport + + roles: Read_Only + Indicates the FC4 roles enabled on the vport. + + symbolic_name: Read_Write + A string, appended to the driver's symbolic port name string, which + is registered with the switch to identify the vport. For example, + a hypervisor could set this string to "Xen Domain 2 VM 5 Vport 2", + and this set of identifiers can be seen on switch management screens + to identify the port. + + vport_delete: Write_Only + When written with a "1", will tear down the vport. + + vport_disable: Write_Only + When written with a "1", will transition the vport to a disabled. + state. The vport will still be instantiated with the Linux kernel, + but it will not be active on the FC link. + When written with a "0", will enable the vport. 
+ + vport_last_state: Read_Only + Indicates the previous state of the vport. See the section below on + "Vport States". + + vport_state: Read_Only + Indicates the state of the vport. See the section below on + "Vport States". + + vport_type: Read_Only + Reflects the FC mechanism used to create the virtual port. + Only NPIV is supported currently. + + + For the fc_host class object, the following attributes are added for vports: + + max_npiv_vports: Read_Only + Indicates the maximum number of NPIV-based vports that the + driver/adapter can support on the fc_host. + + npiv_vports_inuse: Read_Only + Indicates how many NPIV-based vports have been instantiated on the + fc_host. + + vport_create: Write_Only + A "simple" create interface to instantiate a vport on an fc_host. + A "<WWPN>:<WWNN>" string is written to the attribute. The transport + then instantiates the vport object and calls the LLDD to create the + vport with the role of FCP_Initiator. Each WWN is specified as 16 + hex characters and may *not* contain any prefixes (e.g. 0x, x, etc). + + vport_delete: Write_Only + A "simple" delete interface to teardown a vport. A "<WWPN>:<WWNN>" + string is written to the attribute. The transport will locate the + vport on the fc_host with the same WWNs and tear it down. Each WWN + is specified as 16 hex characters and may *not* contain any prefixes + (e.g. 0x, x, etc). + + +Vport States: +------------------------------- + + Vport instantiation consists of two parts: + - Creation with the kernel and LLDD. This means all transport and + driver data structures are built up, and device objects created. + This is equivalent to a driver "attach" on an adapter, which is + independent of the adapter's link state. + - Instantiation of the vport on the FC link via ELS traffic, etc. + This is equivalent to a "link up" and successfull link initialization. + Futher information can be found in the interfaces section below for + Vport Creation. 
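The "<WWPN>:<WWNN>" string accepted by the vport_create and vport_delete attributes above (16 hex characters per WWN, no prefixes) can be checked along these lines. This is a user-space sketch under stated assumptions, not the transport's own parser: the function names are hypothetical, and tolerating one trailing newline (as written by echo) is an assumption, not something the text specifies.

```c
#include <ctype.h>
#include <stdint.h>
#include <string.h>

/* Parse exactly 16 hex digits (no "0x" or "x" prefix). Returns 0 on success. */
static int toy_parse_wwn(const char *s, uint64_t *out)
{
	uint64_t v = 0;
	int i;

	for (i = 0; i < 16; i++) {
		unsigned char c = (unsigned char)s[i];

		if (!isxdigit(c))
			return -1;	/* also stops safely at an early NUL */
		v = (v << 4) |
		    (uint64_t)(isdigit(c) ? c - '0' : tolower(c) - 'a' + 10);
	}
	*out = v;
	return 0;
}

/*
 * Parse "<WWPN>:<WWNN>". One trailing newline is tolerated (assumption).
 * Returns 0 on success, -1 on any format error.
 */
static int toy_parse_vport_ids(const char *buf, uint64_t *wwpn, uint64_t *wwnn)
{
	size_t len = strlen(buf);

	if (len > 0 && buf[len - 1] == '\n')
		len--;
	if (len != 33 || buf[16] != ':')
		return -1;
	if (toy_parse_wwn(buf, wwpn) || toy_parse_wwn(buf + 17, wwnn))
		return -1;
	return 0;
}
```

A string such as "10000000c9123456:20000000c9123456" (illustrative values only) passes this check, while anything carrying a "0x" prefix or a separator other than ':' is rejected.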
+ + Once a vport has been instantiated with the kernel/LLDD, a vport state + can be reported via the sysfs attribute. The following states exist: + + FC_VPORT_UNKNOWN - Unknown + A temporary state, typically set only while the vport is being + instantiated with the kernel and LLDD. + + FC_VPORT_ACTIVE - Active + The vport has been successfully created on the FC link. + It is fully functional. + + FC_VPORT_DISABLED - Disabled + The vport is instantiated, but "disabled". The vport is not instantiated + on the FC link. This is equivalent to a physical port with the + link "down". + + FC_VPORT_LINKDOWN - Linkdown + The vport is not operational as the physical link is not operational. + + FC_VPORT_INITIALIZING - Initializing + The vport is in the process of instantiating on the FC link. + The LLDD will set this state just prior to starting the ELS traffic + to create the vport. This state will persist until the vport is + successfully created (state becomes FC_VPORT_ACTIVE) or it fails + (state is one of the values below). As this state is transitory, + it will not be preserved in the "vport_last_state". + + FC_VPORT_NO_FABRIC_SUPP - No Fabric Support + The vport is not operational. One of the following conditions was + encountered: + - The FC topology is not Point-to-Point + - The FC port is not connected to an F_Port + - The F_Port has indicated that NPIV is not supported. + + FC_VPORT_NO_FABRIC_RSCS - No Fabric Resources + The vport is not operational. The Fabric failed FDISC with a status + indicating that it does not have sufficient resources to complete + the operation. + + FC_VPORT_FABRIC_LOGOUT - Fabric Logout + The vport is not operational. The Fabric has LOGO'd the N_Port_ID + associated with the vport. + + FC_VPORT_FABRIC_REJ_WWN - Fabric Rejected WWN + The vport is not operational. The Fabric failed FDISC with a status + indicating that the WWN's are not valid. + + FC_VPORT_FAILED - VPort Failed + The vport is not operational. 
This is a catchall for all other + error conditions. + + + The following state table indicates the different state transitions: + + State Event New State + -------------------------------------------------------------------- + n/a Initialization Unknown + Unknown: Link Down Linkdown + Link Up & Loop No Fabric Support + Link Up & no Fabric No Fabric Support + Link Up & FLOGI response No Fabric Support + indicates no NPIV support + Link Up & FDISC being sent Initializing + Disable request Disable + Linkdown: Link Up Unknown + Initializing: FDISC ACC Active + FDISC LS_RJT w/ no resources No Fabric Resources + FDISC LS_RJT w/ invalid Fabric Rejected WWN + pname or invalid nport_id + FDISC LS_RJT failed for Vport Failed + other reasons + Link Down Linkdown + Disable request Disable + Disable: Enable request Unknown + Active: LOGO received from fabric Fabric Logout + Link Down Linkdown + Disable request Disable + Fabric Logout: Link still up Unknown + + The following 4 error states all have the same transitions: + No Fabric Support: + No Fabric Resources: + Fabric Rejected WWN: + Vport Failed: + Disable request Disable + Link goes down Linkdown + + +Transport <-> LLDD Interfaces : +------------------------------- + +Vport support by LLDD: + + The LLDD indicates support for vports by supplying a vport_create() + function in the transport template. The presense of this function will + cause the creation of the new attributes on the fc_host. As part of + the physical port completing its initialization relative to the + transport, it should set the max_npiv_vports attribute to indicate the + maximum number of vports the driver and/or adapter supports. + + +Vport Creation: + + The LLDD vport_create() syntax is: + + int vport_create(struct fc_vport *vport, bool disable) + + where: + vport: Is the newly allocated vport object + disable: If "true", the vport is to be created in a disabled stated. + If "false", the vport is to be enabled upon creation. 
+ + When a request is made to create a new vport (via sgio/netlink, or the + vport_create fc_host attribute), the transport will validate that the LLDD + can support another vport (e.g. max_npiv_vports > npiv_vports_inuse). + If not, the create request will be failed. If space remains, the transport + will increment the vport count, create the vport object, and then call the + LLDD's vport_create() function with the newly allocated vport object. + + As mentioned above, vport creation is divided into two parts: + - Creation with the kernel and LLDD. This means all transport and + driver data structures are built up, and device objects created. + This is equivalent to a driver "attach" on an adapter, which is + independent of the adapter's link state. + - Instantiation of the vport on the FC link via ELS traffic, etc. + This is equivalent to a "link up" and successfull link initialization. + + The LLDD's vport_create() function will not synchronously wait for both + parts to be fully completed before returning. It must validate that the + infrastructure exists to support NPIV, and complete the first part of + vport creation (data structure build up) before returning. We do not + hinge vport_create() on the link-side operation mainly because: + - The link may be down. It is not a failure if it is. It simply + means the vport is in an inoperable state until the link comes up. + This is consistent with the link bouncing post vport creation. + - The vport may be created in a disabled state. + - This is consistent with a model where: the vport equates to a + FC adapter. The vport_create is synonymous with driver attachment + to the adapter, which is independent of link state. + + Note: special error codes have been defined to delineate infrastructure + failure cases for quicker resolution. 
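The transport-side check described above (refuse the request when no vport slot remains, otherwise bump the count and call the LLDD) might look roughly like the following. This is a sketch, not the code in scsi_transport_fc.c: struct toy_fc_host, toy_vport_setup() and toy_lldd_ok() are hypothetical, and the errno values and the rollback of the count on LLDD failure are assumptions the text does not confirm.

```c
#include <errno.h>

/* Hypothetical stand-in for the fc_host counters named in the text. */
struct toy_fc_host {
	unsigned int max_npiv_vports;
	unsigned int npiv_vports_inuse;
	/* LLDD hook; returns 0 on success. */
	int (*vport_create)(void *vport, int disable);
};

/* Trivial LLDD stub, present only so the sketch can be exercised. */
static int toy_lldd_ok(void *vport, int disable)
{
	(void)vport;
	(void)disable;
	return 0;
}

static int toy_vport_setup(struct toy_fc_host *fh, void *vport, int disable)
{
	int ret;

	if (!fh->vport_create)
		return -ENOSYS;	/* assumed code: LLDD has no vport support */
	if (fh->npiv_vports_inuse >= fh->max_npiv_vports)
		return -ENOSPC;	/* assumed code: no room for another vport */

	fh->npiv_vports_inuse++;	/* reserve the slot, then call the LLDD */
	ret = fh->vport_create(vport, disable);
	if (ret)
		fh->npiv_vports_inuse--;	/* assumed rollback on failure */
	return ret;
}
```

With max_npiv_vports set to 1, the first request succeeds and the second is refused before the LLDD is ever called.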
+ + The expected behavior for the LLDD's vport_create() function is: + - Validate Infrastructure: + - If the driver or adapter cannot support another vport, whether + due to improper firmware, (a lie about) max_npiv, or a lack of + some other resource - return VPCERR_UNSUPPORTED. + - If the driver validates the WWN's against those already active on + the adapter and detects an overlap - return VPCERR_BAD_WWN. + - If the driver detects the topology is loop, non-fabric, or the + FLOGI did not support NPIV - return VPCERR_NO_FABRIC_SUPP. + - Allocate data structures. If errors are encountered, such as out + of memory conditions, return the respective negative Exxx error code. + - If the role is FCP Initiator, the LLDD is to : + - Call scsi_host_alloc() to allocate a scsi_host for the vport. + - Call scsi_add_host(new_shost, &vport->dev) to start the scsi_host + and bind it as a child of the vport device. + - Initialize the fc_host attribute values. + - Kick off further vport state transitions based on the disable flag and + link state - and return success (zero). + + LLDD Implementers Notes: + - It is suggested that there be different fc_function_templates for + the physical port and the virtual port. The physical port's template + would have the vport_create, vport_delete, and vport_disable functions, + while the vports would not. + - It is suggested that there be different scsi_host_templates + for the physical port and virtual port. Likely, there are driver + attributes, embedded into the scsi_host_template, that are applicable + for the physical port only (link speed, topology setting, etc). This + ensures that the attributes are applicable to the respective scsi_host. + + +Vport Disable/Enable: + + The LLDD vport_disable() syntax is: + + int vport_disable(struct fc_vport *vport, bool disable) + + where: + vport: Is the vport to be enabled or disabled + disable: If "true", the vport is to be disabled. + If "false", the vport is to be enabled. 
+ + When a request is made to change the disabled state on a vport, the + transport will validate the request against the existing vport state. + If the request is to disable and the vport is already disabled, the + request will fail. Similarly, if the request is to enable, and the + vport is not in a disabled state, the request will fail. If the request + is valid for the vport state, the transport will call the LLDD to + change the vport's state. + + Within the LLDD, if a vport is disabled, it remains instantiated with + the kernel and LLDD, but it is not active or visible on the FC link in + any way. (see Vport Creation and the 2 part instantiation discussion). + The vport will remain in this state until it is deleted or re-enabled. + When enabling a vport, the LLDD reinstantiates the vport on the FC + link - essentially restarting the LLDD statemachine (see Vport States + above). + + +Vport Deletion: + + The LLDD vport_delete() syntax is: + + int vport_delete(struct fc_vport *vport) + + where: + vport: Is vport to delete + + When a request is made to delete a vport (via sgio/netlink, or via the + fc_host or fc_vport vport_delete attributes), the transport will call + the LLDD to terminate the vport on the FC link, and teardown all other + datastructures and references. If the LLDD completes successfully, + the transport will teardown the vport objects and complete the vport + removal. If the LLDD delete request fails, the vport object will remain, + but will be in an indeterminate state. + + Within the LLDD, the normal code paths for a scsi_host teardown should + be followed. E.g. If the vport has a FCP Initiator role, the LLDD + will call fc_remove_host() for the vports scsi_host, followed by + scsi_remove_host() and scsi_host_put() for the vports scsi_host. + + +Other: + fc_host port_type attribute: + There is a new fc_host port_type value - FC_PORTTYPE_NPIV. This value + must be set on all vport-based fc_hosts. 
Normally, on a physical port, + the port_type attribute would be set to NPORT, NLPORT, etc based on the + topology type and existence of the fabric. As this is not applicable to + a vport, it makes more sense to report the FC mechanism used to create + the vport. + + Driver unload: + FC drivers are required to call fc_remove_host() prior to calling + scsi_remove_host(). This allows the fc_host to tear down all remote + ports prior the scsi_host being torn down. The fc_remove_host() call + was updated to remove all vports for the fc_host as well. + + +Credits +======= +The following people have contributed to this document: + + + + + + +James Smart +james.smart@emulex.com + diff --git a/Documentation/sound/alsa/ALSA-Configuration.txt b/Documentation/sound/alsa/ALSA-Configuration.txt index 355ff0a2bb7c..241e26c4ff92 100644 --- a/Documentation/sound/alsa/ALSA-Configuration.txt +++ b/Documentation/sound/alsa/ALSA-Configuration.txt @@ -467,7 +467,12 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. above explicitly. The power-management is supported. - + + Module snd-cs5530 + _________________ + + Module for Cyrix/NatSemi Geode 5530 chip. + Module snd-cs5535audio ---------------------- @@ -759,6 +764,7 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. model - force the model name position_fix - Fix DMA pointer (0 = auto, 1 = none, 2 = POSBUF, 3 = FIFO size) + probe_mask - Bitmask to probe codecs (default = -1, meaning all slots) single_cmd - Use single immediate commands to communicate with codecs (for debugging only) enable_msi - Enable Message Signaled Interrupt (MSI) (default = off) @@ -803,6 +809,8 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. 
hp-3013 HP machines (3013-variant) fujitsu Fujitsu S7020 acer Acer TravelMate + will Will laptops (PB V7900) + replacer Replacer 672V basic fixed pin assignment (old default model) auto auto-config reading BIOS (default) @@ -811,16 +819,31 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. hp-bpc HP xw4400/6400/8400/9400 laptops hp-bpc-d7000 HP BPC D7000 benq Benq ED8 + benq-t31 Benq T31 hippo Hippo (ATI) with jack detection, Sony UX-90s hippo_1 Hippo (Benq) with jack detection + sony-assamd Sony ASSAMD basic fixed pin assignment w/o SPDIF auto auto-config reading BIOS (default) + ALC268 + 3stack 3-stack model + auto auto-config reading BIOS (default) + + ALC662 + 3stack-dig 3-stack (2-channel) with SPDIF + 3stack-6ch 3-stack (6-channel) + 3stack-6ch-dig 3-stack (6-channel) with SPDIF + 6stack-dig 6-stack with SPDIF + lenovo-101e Lenovo laptop + auto auto-config reading BIOS (default) + ALC882/885 3stack-dig 3-jack with SPDIF I/O 6stack-dig 6-jack digital with SPDIF I/O arima Arima W820Di1 macpro MacPro support + imac24 iMac 24'' with jack detection w2jc ASUS W2JC auto auto-config reading BIOS (default) @@ -832,9 +855,15 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. 6stack-dig-demo 6-jack digital for Intel demo board acer Acer laptops (Travelmate 3012WTMi, Aspire 5600, etc) medion Medion Laptops + medion-md2 Medion MD2 targa-dig Targa/MSI targa-2ch-dig Targs/MSI with 2-channel laptop-eapd 3-jack with SPDIF I/O and EAPD (Clevo M540JE, M550JE) + lenovo-101e Lenovo 101E + lenovo-nb0763 Lenovo NB0763 + lenovo-ms7195-dig Lenovo MS7195 + 6stack-hp HP machines with 6stack (Nettle boards) + 3stack-hp HP machines with 3stack (Lucknow, Samba boards) auto auto-config reading BIOS (default) ALC861/660 @@ -853,7 +882,9 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. 
3stack-dig 3-jack with SPDIF OUT 6stack-dig 6-jack with SPDIF OUT 3stack-660 3-jack (for ALC660VD) + 3stack-660-digout 3-jack with SPDIF OUT (for ALC660VD) lenovo Lenovo 3000 C200 + dallas Dallas laptops auto auto-config reading BIOS (default) CMI9880 @@ -864,12 +895,26 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. allout 5-jack in back, 2-jack in front, SPDIF out auto auto-config reading BIOS (default) + AD1882 + 3stack 3-stack mode (default) + 6stack 6-stack mode + + AD1884 + N/A + AD1981 basic 3-jack (default) hp HP nx6320 thinkpad Lenovo Thinkpad T60/X60/Z60 toshiba Toshiba U205 + AD1983 + N/A + + AD1984 + basic default configuration + thinkpad Lenovo Thinkpad T61/X61 + AD1986A 6stack 6-jack, separate surrounds (default) 3stack 3-stack, shared surrounds @@ -907,11 +952,18 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. ref Reference board 3stack D945 3stack 5stack D945 5stack + SPDIF - macmini Intel Mac Mini - macbook Intel Mac Book - macbook-pro-v1 Intel Mac Book Pro 1st generation - macbook-pro Intel Mac Book Pro 2nd generation - imac-intel Intel iMac + dell Dell XPS M1210 + intel-mac-v1 Intel Mac Type 1 + intel-mac-v2 Intel Mac Type 2 + intel-mac-v3 Intel Mac Type 3 + intel-mac-v4 Intel Mac Type 4 + intel-mac-v5 Intel Mac Type 5 + macmini Intel Mac Mini (equivalent with type 3) + macbook Intel Mac Book (eq. type 5) + macbook-pro-v1 Intel Mac Book Pro 1st generation (eq. type 3) + macbook-pro Intel Mac Book Pro 2nd generation (eq. type 3) + imac-intel Intel iMac (eq. type 2) + imac-intel-20 Intel iMac (newer version) (eq. type 3) STAC9202/9250/9251 ref Reference board, base config @@ -956,6 +1008,17 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. from the irq. Remember this is a last resort, and should be avoided as much as possible... 
+ MORE NOTES ON "azx_get_response timeout" PROBLEMS: + On some hardwares, you may need to add a proper probe_mask option + to avoid the "azx_get_response timeout" problem above, instead. + This occurs when the access to non-existing or non-working codec slot + (likely a modem one) causes a stall of the communication via HD-audio + bus. You can see which codec slots are probed by enabling + CONFIG_SND_DEBUG_DETECT, or simply from the file name of the codec + proc files. Then limit the slots to probe by probe_mask option. + For example, probe_mask=1 means to probe only the first slot, and + probe_mask=4 means only the third slot. + The power-management is supported. Module snd-hdsp diff --git a/Documentation/sound/alsa/Audiophile-Usb.txt b/Documentation/sound/alsa/Audiophile-Usb.txt index e40cce83327c..2ad5e6306c44 100644 --- a/Documentation/sound/alsa/Audiophile-Usb.txt +++ b/Documentation/sound/alsa/Audiophile-Usb.txt @@ -1,4 +1,4 @@ - Guide to using M-Audio Audiophile USB with ALSA and Jack v1.3 + Guide to using M-Audio Audiophile USB with ALSA and Jack v1.5 ======================================================== Thibault Le Meur <Thibault.LeMeur@supelec.fr> @@ -6,8 +6,19 @@ This document is a guide to using the M-Audio Audiophile USB (tm) device with ALSA and JACK. +History +======= +* v1.4 - Thibault Le Meur (2007-07-11) + - Added Low Endianness nature of 16bits-modes + found by Hakan Lennestal <Hakan.Lennestal@brfsodrahamn.se> + - Modifying document structure +* v1.5 - Thibault Le Meur (2007-07-12) + - Added AC3/DTS passthru info + + 1 - Audiophile USB Specs and correct usage ========================================== + This part is a reminder of important facts about the functions and limitations of the device. 
@@ -25,18 +36,18 @@ The device has 4 audio interfaces, and 2 MIDI ports: The internal DAC/ADC has the following characteristics: * sample depth of 16 or 24 bits * sample rate from 8kHz to 96kHz -* Two ports can't use different sample depths at the same time. Moreover, the -Audiophile USB documentation gives the following Warning: "Please exit any -audio application running before switching between bit depths" +* Two interfaces can't use different sample depths at the same time. +Moreover, the Audiophile USB documentation gives the following Warning: +"Please exit any audio application running before switching between bit depths" Due to the USB 1.1 bandwidth limitation, a limited number of interfaces can be activated at the same time depending on the audio mode selected: - * 16-bit/48kHz ==> 4 channels in/4 channels out + * 16-bit/48kHz ==> 4 channels in + 4 channels out - Ai+Ao+Di+Do - * 24-bit/48kHz ==> 4 channels in/2 channels out, - or 2 channels in/4 channels out + * 24-bit/48kHz ==> 4 channels in + 2 channels out, + or 2 channels in + 4 channels out - Ai+Ao+Do or Ai+Di+Ao or Ai+Di+Do or Di+Ao+Do - * 24-bit/96kHz ==> 2 channels in, or 2 channels out (half duplex only) + * 24-bit/96kHz ==> 2 channels in _or_ 2 channels out (half duplex only) - Ai or Ao or Di or Do Important facts about the Digital interface: @@ -52,44 +63,56 @@ source is connected synchronization error (for instance sound played at an odd sample rate) -2 - Audiophile USB support in ALSA -================================== +2 - Audiophile USB MIDI support in ALSA +======================================= -2.1 - MIDI ports ----------------- -The Audiophile USB MIDI ports will be automatically supported once the +The Audiophile USB MIDI ports will be automatically supported once the following modules have been loaded: * snd-usb-audio * snd-seq-midi No additional setting is required. 
-2.2 - Audio ports ------------------ + +3 - Audiophile USB Audio support in ALSA +======================================== Audio functions of the Audiophile USB device are handled by the snd-usb-audio module. This module can work in a default mode (without any device-specific parameter), or in an "advanced" mode with the device-specific parameter called "device_setup". -2.2.1 - Default Alsa driver mode - -The default behavior of the snd-usb-audio driver is to parse the device -capabilities at startup and enable all functions inside the device (including -all ports at any supported sample rates and sample depths). This approach -has the advantage to let the driver easily switch from sample rates/depths -automatically according to the need of the application claiming the device. - -In this case the Audiophile ports are mapped to alsa pcm devices in the -following way (I suppose the device's index is 1): +3.1 - Default Alsa driver mode +------------------------------ + +The default behavior of the snd-usb-audio driver is to list the device +capabilities at startup and activate the required mode when required +by the applications: for instance if the user is recording in a +24bit-depth-mode and immediately after wants to switch to a 16bit-depth mode, +the snd-usb-audio module will reconfigure the device on the fly. + +This approach has the advantage to let the driver automatically switch from sample +rates/depths automatically according to the user's needs. However, those who +are using the device under windows know that this is not how the device is meant to +work: under windows applications must be closed before using the m-audio control +panel to switch the device working mode. Thus as we'll see in next section, this +Default Alsa driver mode can lead to device misconfigurations. + +Let's get back to the Default Alsa driver mode for now. 
In this case the +Audiophile interfaces are mapped to alsa pcm devices in the following +way (I suppose the device's index is 1): * hw:1,0 is Ao in playback and Di in capture * hw:1,1 is Do in playback and Ai in capture * hw:1,2 is Do in AC3/DTS passthrough mode -You must note as well that the device uses Big Endian byte encoding so that -supported audio format are S16_BE for 16-bit depth modes and S24_3BE for -24-bits depth mode. One exception is the hw:1,2 port which is Little Endian -compliant and thus uses S16_LE. +In this mode, the device uses Big Endian byte-encoding so that +supported audio format are S16_BE for 16-bit depth modes and S24_3BE for +24-bits depth mode. + +One exception is the hw:1,2 port which was reported to be Little Endian +compliant (supposedly supporting S16_LE) but processes in fact only S16_BE streams. +This has been fixed in kernel 2.6.23 and above and now the hw:1,2 interface +is reported to be big endian in this default driver mode. Examples: * playing a S24_3BE encoded raw file to the Ao port @@ -98,22 +121,26 @@ Examples: % arecord -D hw:1,1 -c2 -t raw -r48000 -fS24_3BE test.raw * playing a S16_BE encoded raw file to the Do port % aplay -D hw:1,1 -c2 -t raw -r48000 -fS16_BE test.raw + * playing an ac3 sample file to the Do port + % aplay -D hw:1,2 --channels=6 ac3_S16_BE_encoded_file.raw -If you're happy with the default Alsa driver setup and don't experience any +If you're happy with the default Alsa driver mode and don't experience any issue with this mode, then you can skip the following chapter. -2.2.2 - Advanced module setup +3.2 - Advanced module setup +--------------------------- Due to the hardware constraints described above, the device initialization made by the Alsa driver in default mode may result in a corrupted state of the device. For instance, a particularly annoying issue is that the sound captured -from the Ai port sounds distorted (as if boosted with an excessive high volume -gain). 
+from the Ai interface sounds distorted (as if boosted with an excessive high +volume gain). For people having this problem, the snd-usb-audio module has a new module -parameter called "device_setup". +parameter called "device_setup" (this parameter was introduced in kernel +release 2.6.17) -2.2.2.1 - Initializing the working mode of the Audiophile USB +3.2.1 - Initializing the working mode of the Audiophile USB As far as the Audiophile USB device is concerned, this value let the user specify: @@ -121,33 +148,57 @@ specify: * the sample rate * whether the Di port is used or not -Here is a list of supported device_setup values for this device: - * device_setup=0x00 (or omitted) - - Alsa driver default mode - - maintains backward compatibility with setups that do not use this - parameter by not introducing any change - - results sometimes in corrupted sound as described earlier +When initialized with "device_setup=0x00", the snd-usb-audio module has +the same behaviour as when the parameter is omitted (see paragraph "Default +Alsa driver mode" above) + +Others modes are described in the following subsections. + +3.2.1.1 - 16-bit modes + +The two supported modes are: + * device_setup=0x01 - 16bits 48kHz mode with Di disabled - Ai,Ao,Do can be used at the same time - hw:1,0 is not available in capture mode - hw:1,2 is not available + * device_setup=0x11 - 16bits 48kHz mode with Di enabled - Ai,Ao,Di,Do can be used at the same time - hw:1,0 is available in capture mode - hw:1,2 is not available + +In this modes the device operates only at 16bits-modes. Before kernel 2.6.23, +the devices where reported to be Big-Endian when in fact they were Little-Endian +so that playing a file was a matter of using: + % aplay -D hw:1,1 -c2 -t raw -r48000 -fS16_BE test_S16_LE.raw +where "test_S16_LE.raw" was in fact a little-endian sample file. 
+ +Thanks to Hakan Lennestal (who discovered the Little-Endiannes of the device in +these modes) a fix has been committed (expected in kernel 2.6.23) and +Alsa now reports Little-Endian interfaces. Thus playing a file now is as simple as +using: + % aplay -D hw:1,1 -c2 -t raw -r48000 -fS16_LE test_S16_LE.raw + +3.2.1.2 - 24-bit modes + +The three supported modes are: + * device_setup=0x09 - 24bits 48kHz mode with Di disabled - Ai,Ao,Do can be used at the same time - hw:1,0 is not available in capture mode - hw:1,2 is not available + * device_setup=0x19 - 24bits 48kHz mode with Di enabled - 3 ports from {Ai,Ao,Di,Do} can be used at the same time - hw:1,0 is available in capture mode and an active digital source must be connected to Di - hw:1,2 is not available + * device_setup=0x0D or 0x10 - 24bits 96kHz mode - Di is enabled by default for this mode but does not need to be connected @@ -155,34 +206,64 @@ Here is a list of supported device_setup values for this device: - Only 1 port from {Ai,Ao,Di,Do} can be used at the same time - hw:1,0 is available in captured mode - hw:1,2 is not available + +In these modes the device is only Big-Endian compliant (see "Default Alsa driver +mode" above for an aplay command example) + +3.2.1.3 - AC3 w/ DTS passthru mode + +Thanks to Hakan Lennestal, I now have a report saying that this mode works. 
+ * device_setup=0x03 - 16bits 48kHz mode with only the Do port enabled - - AC3 with DTS passthru (not tested) + - AC3 with DTS passthru - Caution with this setup the Do port is mapped to the pcm device hw:1,0 -2.2.2.2 - Setting and switching configurations with the device_setup parameter +The command line used to playback the AC3/DTS encoded .wav-files in this mode: + % aplay -D hw:1,0 --channels=6 ac3_S16_LE_encoded_file.raw + +3.2.2 - How to use the device_setup parameter +---------------------------------------------- The parameter can be given: + * By manually probing the device (as root): # modprobe -r snd-usb-audio # modprobe snd-usb-audio index=1 device_setup=0x09 + * Or while configuring the modules options in your modules configuration file - For Fedora distributions, edit the /etc/modprobe.conf file: alias snd-card-1 snd-usb-audio options snd-usb-audio index=1 device_setup=0x09 -IMPORTANT NOTE WHEN SWITCHING CONFIGURATION: -------------------------------------------- - * You may need to _first_ initialize the module with the correct device_setup - parameter and _only_after_ turn on the Audiophile USB device - * This is especially true when switching the sample depth: +CAUTION when initializaing the device +------------------------------------- + + * Correct initialization on the device requires that device_setup is given to + the module BEFORE the device is turned on. So, if you use the "manual probing" + method described above, take care to power-on the device AFTER this initialization. + + * Failing to respect this will lead in a misconfiguration of the device. In this case + turn off the device, unproble the snd-usb-audio module, then probe it again with + correct device_setup parameter and then (and only then) turn on the device again. 
+ + * If you've correctly initialized the device in a valid mode and then want to switch + to another mode (possibly with another sample-depth), please use also the following + procedure: - first turn off the device - de-register the snd-usb-audio module (modprobe -r) - change the device_setup parameter by changing the device_setup option in /etc/modprobe.conf - turn on the device + * A workaround for this last issue has been applied to kernel 2.6.23, but it may not + be enough to ensure the 'stability' of the device initialization. -2.2.2.3 - Audiophile USB's device_setup structure +3.2.3 - Technical details for hackers +------------------------------------- +This section is for hackers, wanting to understand details about the device +internals and how Alsa supports it. + +3.2.3.1 - Audiophile USB's device_setup structure If you want to understand the device_setup magic numbers for the Audiophile USB, you need some very basic understanding of binary computation. However, @@ -228,12 +309,12 @@ Caution: - choosing b2 will prepare all interfaces for 24bits/96kHz but you'll only be able to use one at the same time -2.2.3 - USB implementation details for this device +3.2.3.2 - USB implementation details for this device You may safely skip this section if you're not interested in driver -development. +hacking. -This section describes some internal aspects of the device and summarize the +This section describes some internal aspects of the device and summarizes the data I got by usb-snooping the windows and Linux drivers. The M-Audio Audiophile USB has 7 USB Interfaces: @@ -293,43 +374,45 @@ parse_audio_endpoints function uses a quirk called "audiophile_skip_setting_quirk" in order to prevent AltSettings not corresponding to device_setup from being registered in the driver. -3 - Audiophile USB and Jack support +4 - Audiophile USB and Jack support =================================== This section deals with support of the Audiophile USB device in Jack. 
-The main issue regarding this support is that the device is Big Endian -compliant. -3.1 - Using the plug alsa plugin --------------------------------- +There are 2 main potential issues when using Jackd with the device: +* support for Big-Endian devices in 24-bit modes +* support for 4-in / 4-out channels + +4.1 - Direct support in Jackd +----------------------------- -Jack doesn't directly support big endian devices. Thus, one way to have support -for this device with Alsa is to use the Alsa "plug" converter. +Jack supports big endian devices only in recent versions (thanks to +Andreas Steinmetz for his first big-endian patch). I can't remember +extacly when this support was released into jackd, let's just say that +with jackd version 0.103.0 it's almost ok (just a small bug is affecting +16bits Big-Endian devices, but since you've read carefully the above +paragraphs, you're now using kernel >= 2.6.23 and your 16bits devices +are now Little Endians ;-) ). + +You can run jackd with the following command for playback with Ao and +record with Ai: + % jackd -R -dalsa -Phw:1,0 -r48000 -p128 -n2 -D -Chw:1,1 + +4.2 - Using Alsa plughw +----------------------- +If you don't have a recent Jackd installed, you can downgrade to using +the Alsa "plug" converter. For instance here is one way to run Jack with 2 playback channels on Ao and 2 capture channels from Ai: % jackd -R -dalsa -dplughw:1 -r48000 -p256 -n2 -D -Cplughw:1,1 - However you may see the following warning message: "You appear to be using the ALSA software "plug" layer, probably a result of using the "default" ALSA device. This is less efficient than it could be. Consider using a hardware device instead rather than using the plug layer." -3.2 - Patching alsa to use direct pcm device --------------------------------------------- -A patch for Jack by Andreas Steinmetz adds support for Big Endian devices. -However it has not been included in the CVS tree. 
- -You can find it at the following URL: -http://sourceforge.net/tracker/index.php?func=detail&aid=1289682&group_id=39687& -atid=425939 - -After having applied the patch you can run jackd with the following command -line: - % jackd -R -dalsa -Phw:1,0 -r48000 -p128 -n2 -D -Chw:1,1 - -3.2 - Getting 2 input and/or output interfaces in Jack +4.3 - Getting 2 input and/or output interfaces in Jack ------------------------------------------------------ As you can see, starting the Jack server this way will only enable 1 stereo @@ -339,6 +422,7 @@ This is due to the following restrictions: * Jack can only open one capture device and one playback device at a time * The Audiophile USB is seen as 2 (or three) Alsa devices: hw:1,0, hw:1,1 (and optionally hw:1,2) + If you want to get Ai+Di and/or Ao+Do support with Jack, you would need to combine the Alsa devices into one logical "complex" device. @@ -348,13 +432,11 @@ It is related to another device (ice1712) but can be adapted to suit the Audiophile USB. Enabling multiple Audiophile USB interfaces for Jackd will certainly require: -* patching Jack with the previously mentioned "Big Endian" patch -* patching Jackd with the MMAP_COMPLEX patch (see the ice1712 page) -* patching the alsa-lib/src/pcm/pcm_multi.c file (see the ice1712 page) +* Making sure your Jackd version has the MMAP_COMPLEX patch (see the ice1712 page) +* (maybe) patching the alsa-lib/src/pcm/pcm_multi.c file (see the ice1712 page) * define a multi device (combination of hw:1,0 and hw:1,1) in your .asoundrc file * start jackd with this device -I had no success in testing this for now, but this may be due to my OS -configuration. If you have any success with this kind of setup, please -drop me an email. +I had no success in testing this for now, if you have any success with this kind +of setup, please drop me an email. 
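Reading aid for the Audiophile-Usb.txt hunks above (not part of the patch): the device_setup values listed in sections 3.2.1.1 to 3.2.1.3 can be gathered into a small lookup table. The following is a minimal Python sketch under stated assumptions. The names AUDIOPHILE_MODES and modprobe_options are hypothetical helpers invented for illustration; only the hexadecimal values, their descriptions and the format of the "options" line are taken from the text.

```python
# Hypothetical helper, NOT part of the kernel patch or of ALSA.
# The values and descriptions are copied from the device_setup modes
# listed in the patched Audiophile-Usb.txt.
AUDIOPHILE_MODES = {
    0x00: "Alsa driver default mode (same as omitting device_setup)",
    0x01: "16bits 48kHz mode with Di disabled",
    0x11: "16bits 48kHz mode with Di enabled",
    0x09: "24bits 48kHz mode with Di disabled",
    0x19: "24bits 48kHz mode with Di enabled",
    0x0D: "24bits 96kHz mode",
    0x03: "16bits 48kHz mode with only the Do port enabled (AC3 with DTS passthru)",
}


def modprobe_options(device_setup, index=1):
    """Build the 'options' line for the modules configuration file.

    Raises ValueError for values the document does not list, so a
    mistyped mode is caught before it reaches the module.
    """
    if device_setup not in AUDIOPHILE_MODES:
        raise ValueError("device_setup 0x%02X is not listed in the document" % device_setup)
    return "options snd-usb-audio index=%d device_setup=0x%02X" % (index, device_setup)


if __name__ == "__main__":
    # Print every listed mode next to the line that would select it.
    for value, description in AUDIOPHILE_MODES.items():
        print("%-50s # %s" % (modprobe_options(value), description))
```

The generated line only selects the mode. As the CAUTION section of the patched text says, device_setup has to reach the module before the device is turned on, so the power-off, "modprobe -r", reconfigure, power-on sequence still applies when switching modes.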
diff --git a/Documentation/sound/alsa/OSS-Emulation.txt b/Documentation/sound/alsa/OSS-Emulation.txt index ec2a02541d5b..bfa0c9aacb4b 100644 --- a/Documentation/sound/alsa/OSS-Emulation.txt +++ b/Documentation/sound/alsa/OSS-Emulation.txt @@ -278,6 +278,21 @@ current mixer configuration by reading and writing the whole file image. +Duplex Streams +============== + +Note that when attempting to use a single device file for playback and +capture, the OSS API provides no way to set the format, sample rate or +number of channels different in each direction. Thus + io_handle = open("device", O_RDWR) +will only function correctly if the values are the same in each direction. + +To use different values in the two directions, use both + input_handle = open("device", O_RDONLY) + output_handle = open("device", O_WRONLY) +and set the values for the corresponding handle. + + Unsupported Features ==================== diff --git a/Documentation/sound/oss/AD1816 b/Documentation/sound/oss/AD1816 deleted file mode 100644 index 14bd8f25d523..000000000000 --- a/Documentation/sound/oss/AD1816 +++ /dev/null @@ -1,84 +0,0 @@ -Documentation for the AD1816(A) sound driver -============================================ - -Installation: -------------- - -To get your AD1816(A) based sound card work, you'll have to enable support for -experimental code ("Prompt for development and/or incomplete code/drivers") -and isapnp ("Plug and Play support", "ISA Plug and Play support"). Enable -"Sound card support", "OSS modules support" and "Support for AD1816(A) based -cards (EXPERIMENTAL)" in the sound configuration menu, too. Now build, install -and reboot the new kernel as usual. 
- -Features: ---------- - -List of features supported by this driver: -- full-duplex support -- supported audio formats: unsigned 8bit, signed 16bit little endian, - signed 16bit big endian, -law, A-law -- supported channels: mono and stereo -- supported recording sources: Master, CD, Line, Line1, Line2, Mic -- supports phat 3d stereo circuit (Line 3) - - -Supported cards: ----------------- - -The following cards are known to work with this driver: -- Terratec Base 1 -- Terratec Base 64 -- HP Kayak -- Acer FX-3D -- SY-1816 -- Highscreen Sound-Boostar 32 Wave 3D -- Highscreen Sound-Boostar 16 -- AVM Apex Pro card -- (Aztech SC-16 3D) -- (Newcom SC-16 3D) -- (Terratec EWS64S) - -Cards listed in brackets are not supported reliable. If you have such a card -you should add the extra parameter: - options=1 -when loading the ad1816 module via modprobe. - - -Troubleshooting: ----------------- - -First of all you should check, if the driver has been loaded -properly. - -If loading of the driver succeeds, but playback/capture fails, check -if you used the correct values for irq, dma and dma2 when loading the module. -If one of them is wrong you usually get the following error message: - -Nov 6 17:06:13 tek01 kernel: Sound: DMA (output) timed out - IRQ/DRQ config error? - -If playback/capture is too fast or to slow, you should have a look at -the clock chip of your sound card. The AD1816 was designed for a 33MHz -oscillator, however most sound card manufacturer use slightly -different oscillators as they are cheaper than 33MHz oscillators. If -you have such a card you have to adjust the ad1816_clockfreq parameter -above. For example: For a card using a 32.875MHz oscillator use -ad1816_clockfreq=32875 instead of ad1816_clockfreq=33000. - - -Updates, bugfixes and bugreports: --------------------------------- - -As the driver is still experimental and under development, you should -watch out for updates. 
Updates of the driver are available on the -Internet from one of my home pages: - http://www.student.informatik.tu-darmstadt.de/~tek/projects/linux.html -or: - http://www.tu-darmstadt.de/~tek01/projects/linux.html - -Bugreports, bugfixes and related questions should be sent via E-Mail to: - tek@rbg.informatik.tu-darmstadt.de - -Thorsten Knabe <tek@rbg.informatik.tu-darmstadt.de> -Christoph Hellwig <hch@infradead.org> - Last modified: 2000/09/20 diff --git a/Documentation/sound/oss/NM256 b/Documentation/sound/oss/NM256 deleted file mode 100644 index b503217488b3..000000000000 --- a/Documentation/sound/oss/NM256 +++ /dev/null @@ -1,280 +0,0 @@ -======================================================= -Documentation for the NeoMagic 256AV/256ZX sound driver -======================================================= - -You're looking at version 1.1 of the driver. (Woohoo!) It has been -successfully tested against the following laptop models: - - Sony Z505S/Z505SX/Z505DX/Z505RX - Sony F150, F160, F180, F250, F270, F280, PCG-F26 - Dell Latitude CPi, CPt (various submodels) - -There are a few caveats, which is why you should read the entirety of -this document first. - -This driver was developed without any support or assistance from -NeoMagic. There is no warranty, expressed, implied, or otherwise. It -is free software in the public domain; feel free to use it, sell it, -give it to your best friends, even claim that you wrote it (but why?!) -but don't go whining to me, NeoMagic, Sony, Dell, or anyone else -when it blows up your computer. - -Version 1.1 contains a change to try and detect non-AC97 versions of -the hardware, and not install itself appropriately. It should also -reinitialize the hardware on an APM resume event, assuming that APM -was configured into your kernel. - -============ -Installation -============ - -Enable the sound drivers, the OSS sound drivers, and then the NM256 -driver. 
The NM256 driver *must* be configured as a module (it won't -give you any other choice). - -Next, do the usual "make modules" and "make modules_install". -Finally, insmod the soundcore, sound and nm256 modules. - -When the nm256 driver module is loaded, you should see a couple of -confirmation messages in the kernel logfile indicating that it found -the device (the device does *not* use any I/O ports or DMA channels). -Now try playing a wav file, futz with the CD-ROM if you have one, etc. - -The NM256 is entirely a PCI-based device, and all the necessary -information is automatically obtained from the card. It can only be -configured as a module in a vain attempt to prevent people from -hurting themselves. It works correctly if it shares an IRQ with -another device (it normally shares IRQ 9 with the builtin eepro100 -ethernet on the Sony Z505 laptops). - -It does not run the card in any sort of compatibility mode. It will -not work on laptops that have the SB16-compatible, AD1848-compatible -or CS4232-compatible codec/mixer; you will want to use the appropriate -compatible OSS driver with these chipsets. I cannot provide any -assistance with machines using the SB16, AD1848 or CS4232 compatible -versions. (The driver now attempts to detect the mixer version, and -will refuse to load if it believes the hardware is not -AC97-compatible.) - -The sound support is very basic, but it does include simultaneous -playback and record capability. The mixer support is also quite -simple, although this is in keeping with the rather limited -functionality of the chipset. - -There is no hardware synthesizer available, as the Losedows OPL-3 and -MIDI support is done via hardware emulation. - -Only three recording devices are available on the Sony: the -microphone, the CD-ROM input, and the volume device (which corresponds -to the stereo output). (Other devices may be available on other -models of laptops.) 
The Z505 series does not have a builtin CD-ROM, -so of course the CD-ROM input doesn't work. It does work on laptops -with a builtin CD-ROM drive. - -The mixer device does not appear to have any tone controls, at least -on the Z505 series. The mixer module checks for tone controls in the -AC97 mixer, and will enable them if they are available. - -============== -Known problems -============== - - * There are known problems with PCMCIA cards and the eepro100 ethernet - driver on the Z505S/Z505SX/Z505DX. Keep reading. - - * There are also potential problems with using a virtual X display, and - also problems loading the module after the X server has been started. - Keep reading. - - * The volume control isn't anywhere near linear. Sorry. This will be - fixed eventually, when I get sufficiently annoyed with it. (I doubt - it will ever be fixed now, since I've never gotten sufficiently - annoyed with it and nobody else seems to care.) - - * There are reports that the CD-ROM volume is very low. Since I do not - have a CD-ROM equipped laptop, I cannot test this (it's kinda hard to - do remotely). - - * Only 8 fixed-rate speeds are supported. This is mainly a chipset - limitation. It may be possible to support other speeds in the future. - - * There is no support for the telephone mixer/codec. There is support - for a phonein/phoneout device in the mixer driver; whether or not - it does anything is anyone's guess. (Reports on this would be - appreciated. You'll have to figure out how to get the phone to - go off-hook before it'll work, tho.) - - * This driver was not written with any cooperation or support from - NeoMagic. If you have any questions about this, see their website - for their official stance on supporting open source drivers. - -============ -Video memory -============ - -The NeoMagic sound engine uses a portion of the display memory to hold -the sound buffer. (Crazy, eh?) 
The NeoMagic video BIOS sets up a -special pointer at the top of video RAM to indicate where the top of -the audio buffer should be placed. - -At the present time XFree86 is apparently not aware of this. It will -thus write over either the pointer or the sound buffer with abandon. -(Accelerated-X seems to do a better job here.) - -This implies a few things: - - * Sometimes the NM256 driver has to guess at where the buffer - should be placed, especially if the module is loaded after the - X server is started. It's usually correct, but it will consistently - fail on the Sony F250. - - * Virtual screens greater than 1024x768x16 under XFree86 are - problematic on laptops with only 2.5MB of screen RAM. This - includes all of the 256AV-equipped laptops. (Virtual displays - may or may not work on the 256ZX, which has at least 4MB of - video RAM.) - -If you start having problems with random noise being output either -constantly (this is the usual symptom on the F250), or when windows -are moved around (this is the usual symptom when using a virtual -screen), the best fix is to - - * Don't use a virtual frame buffer. - * Make sure you load the NM256 module before the X server is - started. - -On the F250, it is possible to force the driver to load properly even -after the XFree86 server is started by doing: - - insmod nm256 buffertop=0x25a800 - -This forces the audio buffers to the correct offset in screen RAM. - -One user has reported a similar problem on the Sony F270, although -others apparently aren't seeing any problems. His suggested command -is - - insmod nm256 buffertop=0x272800 - -================= -Official WWW site -================= - -The official site for the NM256 driver is: - - http://www.uglx.org/sony.html - -You should always be able to get the latest version of the driver there, -and the driver will be supported for the foreseeable future. 
- -============== -Z505RX and IDE -============== - -There appears to be a problem with the IDE chipset on the Z505RX; one -of the symptoms is that sound playback periodically hangs (when the -disk is accessed). The user reporting the problem also reported that -enabling all of the IDE chipset workarounds in the kernel solved the -problem, tho obviously only one of them should be needed--if someone -can give me more details I would appreciate it. - -============================== -Z505S/Z505SX on-board Ethernet -============================== - -If you're using the on-board Ethernet Pro/100 ethernet support on the Z505 -series, I strongly encourage you to download the latest eepro100 driver from -Donald Becker's site: - - ftp://cesdis.gsfc.nasa.gov/pub/linux/drivers/test/eepro100.c - -There was a reported problem on the Z505SX that if the ethernet -interface is disabled and reenabled while the sound driver is loaded, -the machine would lock up. I have included a workaround that is -working satisfactorily. However, you may occasionally see a message -about "Releasing interrupts, over 1000 bad interrupts" which indicates -that the workaround is doing its job. - -================================== -PCMCIA and the Z505S/Z505SX/Z505DX -================================== - -There is also a known problem with the Sony Z505S and Z505SX hanging -if a PCMCIA card is inserted while the ethernet driver is loaded, or -in some cases if the laptop is suspended. This is caused by tons of -spurious IRQ 9s, probably generated from the PCMCIA or ACPI bridges. - -There is currently no fix for the problem that works in every case. -The only known workarounds are to disable the ethernet interface -before inserting or removing a PCMCIA card, or with some cards -disabling the PCMCIA card before ejecting it will also help the -problem with the laptop hanging when the card is ejected. 
- -One user has reported that setting the tcic's cs_irq to some value -other than 9 (like 11) fixed the problem. This doesn't work on my -Z505S, however--changing the value causes the cardmgr to stop seeing -card insertions and removals, cards don't seem to work correctly, and -I still get hangs if a card is inserted when the kernel is booted. - -Using the latest ethernet driver and pcmcia package allows me to -insert an Adaptec 1480A SlimScsi card without the laptop hanging, -although I still have to shut down the card before ejecting or -powering down the laptop. However, similar experiments with a DE-660 -ethernet card still result in hangs when the card is inserted. I am -beginning to think that the interrupts are CardBus-related, since the -Adaptec card is a CardBus card, and the DE-660 is not; however, I -don't have any other CardBus cards to test with. - -====== -Thanks -====== - -First, I want to thank everyone (except NeoMagic of course) for their -generous support and encouragement. I'd like to list everyone's name -here that replied during the development phase, but the list is -amazingly long. - -I will be rather unfair and single out a few people, however: - - Justin Maurer, for being the first random net.person to try it, - and for letting me login to his Z505SX to get it working there - - Edi Weitz for trying out several different versions, and giving - me a lot of useful feedback - - Greg Rumple for letting me login remotely to get the driver - functional on the 256ZX, for his assistance on tracking - down all sorts of random stuff, and for trying out Accel-X - - Zach Brown, for the initial AC97 mixer interface design - - Jeff Garzik, for various helpful suggestions on the AC97 - interface - - "Mr. 
Bumpy" for feedback on the Z505RX - - Bill Nottingham, for generous assistance in getting the mixer ID - code working - -================= -Previous versions -================= - -Versions prior to 0.3 (aka `noname') had problems with weird artifacts -in the output and failed to set the recording rate properly. These -problems have long since been fixed. - -Versions prior to 0.5 had problems with clicks in the output when -anything other than 16-bit stereo sound was being played, and also had -periodic clicks when recording. - -Version 0.7 first incorporated support for the NM256ZX chipset, which -is found on some Dell Latitude laptops (the CPt, and apparently -some CPi models as well). It also included the generic AC97 -mixer module. - -Version 0.75 renamed all the functions and files with slightly more -generic names. - -Note that previous versions of this document claimed that recording was -8-bit only; it actually has been working for 16-bits all along. diff --git a/Documentation/sound/oss/OPL3-SA2 b/Documentation/sound/oss/OPL3-SA2 deleted file mode 100644 index d8b6d2bbada6..000000000000 --- a/Documentation/sound/oss/OPL3-SA2 +++ /dev/null @@ -1,210 +0,0 @@ -Documentation for the OPL3-SA2, SA3, and SAx driver (opl3sa2.o) ---------------------------------------------------------------- - -Scott Murray, scott@spiteful.org -January 7, 2001 - -NOTE: All trade-marked terms mentioned below are properties of their - respective owners. - - -Supported Devices ------------------ - -This driver is for PnP soundcards based on the following Yamaha audio -controller chipsets: - -YMF711 aka OPL3-SA2 -YMF715 and YMF719 aka OPL3-SA3 - -Up until recently (December 2000), I'd thought the 719 to be a -different chipset, the OPL3-SAx. After an email exhange with -Yamaha, however, it turns out that the 719 is just a re-badged -715, and the chipsets are identical. The chipset detection code -has been updated to reflect this. 
- -Anyways, all of these chipsets implement the following devices: - -OPL3 FM synthesizer -Soundblaster Pro -Microsoft/Windows Sound System -MPU401 MIDI interface - -Note that this driver uses the MSS device, and to my knowledge these -chipsets enforce an either/or situation with the Soundblaster Pro -device and the MSS device. Since the MSS device has better -capabilities, I have implemented the driver to use it. - - -Mixer Channels --------------- - -Older versions of this driver (pre-December 2000) had two mixers, -an OPL3-SA2 or SA3 mixer and a MSS mixer. The OPL3-SA[23] mixer -device contained a superset of mixer channels consisting of its own -channels and all of the MSS mixer channels. To simplify the driver -considerably, and to partition functionality better, the OPL3-SA[23] -mixer device now contains has its own specific mixer channels. They -are: - -Volume - Hardware master volume control -Bass - SA3 only, now supports left and right channels -Treble - SA3 only, now supports left and right channels -Microphone - Hardware microphone input volume control -Digital1 - Yamaha 3D enhancement "Wide" mixer - -All other mixer channels (e.g. "PCM", "CD", etc.) now have to be -controlled via the "MS Sound System (CS4231)" mixer. To facilitate -this, the mixer device creation order has been switched so that -the MSS mixer is created first. This allows accessing the majority -of the useful mixer channels even via single mixer-aware tools -such as "aumix". - - -Plug 'n Play ------------- - -In previous kernels (2.2.x), some configuration was required to -get the driver to talk to the card. Being the new millennium and -all, the 2.4.x kernels now support auto-configuration if ISA PnP -support is configured in. Theoretically, the driver even supports -having more than one card in this case. 
- -With the addition of PnP support to the driver, two new parameters -have been added to control it: - -isapnp - set to 0 to disable ISA PnP card detection - -multiple - set to 0 to disable multiple PnP card detection - - -Optional Parameters -------------------- - -Recent (December 2000) additions to the driver (based on a patch -provided by Peter Englmaier) are two new parameters: - -ymode - Set Yamaha 3D enhancement mode: - 0 = Desktop/Normal 5-12 cm speakers - 1 = Notebook PC (1) 3 cm speakers - 2 = Notebook PC (2) 1.5 cm speakers - 3 = Hi-Fi 16-38 cm speakers - -loopback - Set A/D input source. Useful for echo cancellation: - 0 = Mic Right channel (default) - 1 = Mono output loopback - -The ymode parameter has been tested and does work. The loopback -parameter, however, is untested. Any feedback on its usefulness -would be appreciated. - - -Manual Configuration --------------------- - -If for some reason you decide not to compile ISA PnP support into -your kernel, or disabled the driver's usage of it by setting the -isapnp parameter as discussed above, then you will need to do some -manual configuration. There are two ways of doing this. The most -common is to use the isapnptools package to initialize the card, and -use the kernel module form of the sound subsystem and sound drivers. -Alternatively, some BIOS's allow manual configuration of installed -PnP devices in a BIOS menu, which should allow using the non-modular -sound drivers, i.e. built into the kernel. - -I personally use isapnp and modules, and do not have access to a PnP -BIOS machine to test. If you have such a beast, configuring the -driver to be built into the kernel should just work (thanks to work -done by David Luyer <luyer@ucs.uwa.edu.au>). You will still need -to specify settings, which can be done by adding: - -opl3sa2=<io>,<irq>,<dma>,<dma2>,<mssio>,<mpuio> - -to the kernel command line. 
For example: - -opl3sa2=0x370,5,0,1,0x530,0x330 - -If you are instead using the isapnp tools (as most people have been -before Linux 2.4.x), follow the directions in their documentation to -produce a configuration file. Here is the relevant excerpt I used to -use for my SA3 card from my isapnp.conf: - -(CONFIGURE YMH0800/-1 (LD 0 - -# NOTE: IO 0 is for the unused SoundBlaster part of the chipset. -(IO 0 (BASE 0x0220)) -(IO 1 (BASE 0x0530)) -(IO 2 (BASE 0x0388)) -(IO 3 (BASE 0x0330)) -(IO 4 (BASE 0x0370)) -(INT 0 (IRQ 5 (MODE +E))) -(DMA 0 (CHANNEL 0)) -(DMA 1 (CHANNEL 1)) - -Here, note that: - -Port Acceptable Range Purpose ----- ---------------- ------- -IO 0 0x0220 - 0x0280 SB base address, unused. -IO 1 0x0530 - 0x0F48 MSS base address -IO 2 0x0388 - 0x03F8 OPL3 base address -IO 3 0x0300 - 0x0334 MPU base address -IO 4 0x0100 - 0x0FFE card's own base address for its control I/O ports - -The IRQ and DMA values can be any that are considered acceptable for a -MSS. Assuming you've got isapnp all happy, then you should be able to -do something like the following (which matches up with the isapnp -configuration above): - -modprobe mpu401 -modprobe ad1848 -modprobe opl3sa2 io=0x370 mss_io=0x530 mpu_io=0x330 irq=5 dma=0 dma2=1 -modprobe opl3 io=0x388 - -See the section "Automatic Module Loading" below for how to set up -/etc/modprobe.conf to automate this. - -An important thing to remember that the opl3sa2 module's io argument is -for it's own control port, which handles the card's master mixer for -volume (on all cards), and bass and treble (on SA3 cards). - - -Troubleshooting ---------------- - -If all goes well and you see no error messages, you should be able to -start using the sound capabilities of your system. 
If you get an -error message while trying to insert the opl3sa2 module, then make -sure that the values of the various arguments match what you specified -in your isapnp configuration file, and that there is no conflict with -another device for an I/O port or interrupt. Checking the contents of -/proc/ioports and /proc/interrupts can be useful to see if you're -butting heads with another device. - -If you still cannot get the module to load, look at the contents of -your system log file, usually /var/log/messages. If you see the -message "opl3sa2: Unknown Yamaha audio controller version", then you -have a different chipset version than I've encountered so far. Look -for all messages in the log file that start with "opl3sa2: " and see -if they provide any clues. If you do not see the chipset version -message, and none of the other messages present in the system log are -helpful, email me some details and I'll try my best to help. - - -Automatic Module Loading ------------------------- - -Lastly, if you're using modules and want to set up automatic module -loading with kmod, the kernel module loader, here is the section I -currently use in my modprobe.conf file: - -# Sound -alias sound-slot-0 opl3sa2 -options opl3sa2 io=0x370 mss_io=0x530 mpu_io=0x330 irq=7 dma=0 dma2=3 -options opl3 io=0x388 - -That's all it currently takes to get an OPL3-SA3 card working on my -system. Once again, if you have any other problems, email me at the -address listed above. - -Scott diff --git a/Documentation/sound/oss/VIA-chipset b/Documentation/sound/oss/VIA-chipset deleted file mode 100644 index 37865234e54d..000000000000 --- a/Documentation/sound/oss/VIA-chipset +++ /dev/null @@ -1,43 +0,0 @@ -Running sound cards on VIA chipsets - -o There are problems with VIA chipsets and sound cards that appear to - lock the hardware solidly. 
Test programs under DOS have verified the - problem exists on at least some (but apparently not all) VIA boards - -o VIA have so far failed to bother to answer support mail on the subject - so if you are a VIA engineer feeling aggrieved as you read this - document go chase your own people. If there is a workaround please - let us know so we can implement it. - - -Certain patterns of ISA DMA access used for most PC sound cards cause the -VIA chipsets to lock up. From the collected reports this appears to cover a -wide range of boards. Some also lock up with sound cards under Win* as well. - -Linux implements a workaround providing your chipset is PCI and you compiled -with PCI Quirks enabled. If so you will see a message - "Activating ISA DMA bug workarounds" - -during booting. If you have a VIA PCI chipset that hangs when you use the -sound and is not generating this message even with PCI quirks enabled -please report the information to the linux-kernel list (see REPORTING-BUGS). - -If you are one of the tiny number of unfortunates with a 486 ISA/VLB VIA -chipset board you need to do the following to build a special kernel for -your board - - edit linux/include/asm-i386/dma.h - -change - -#define isa_dma_bridge_buggy (0) - -to - -#define isa_dma_bridge_buggy (1) - -and rebuild a kernel without PCI quirk support. - - -Other than this particular glitch the VIA [M]VP* chipsets appear to work -perfectly with Linux. diff --git a/Documentation/sound/oss/cs46xx b/Documentation/sound/oss/cs46xx deleted file mode 100644 index b54432709863..000000000000 --- a/Documentation/sound/oss/cs46xx +++ /dev/null @@ -1,138 +0,0 @@ - -Documentation for the Cirrus Logic/Crystal SoundFusion cs46xx/cs4280 audio -controller chips (2001/05/11) - -The cs46xx audio driver supports the DSP line of Cirrus controllers. -Specifically, the cs4610, cs4612, cs4614, cs4622, cs4624, cs4630 and the cs4280 -products. This driver uses the generic ac97_codec driver for AC97 codec -support. 
- - -Features: - -Full Duplex Playback/Capture supported from 8k-48k. -16Bit Signed LE & 8Bit Unsigned, with Mono or Stereo supported. - -APM/PM - 2.2.x PM is enabled and functional. APM can also -be enabled for 2.4.x by modifying the CS46XX_ACPI_SUPPORT macro -definition. - -DMA playback buffer size is configurable from 16k (defaultorder=2) up to 2Meg -(defaultorder=11). DMA capture buffer size is fixed at a single 4k page as -two 2k fragments. - -MMAP seems to work well with QuakeIII, and test XMMS plugin. - -Myth2 works, but the polling logic is not fully correct, but is functional. - -The 2.4.4-ac6 gameport code in the cs461x joystick driver has been tested -with a Microsoft Sidewinder joystick (cs461x.o and sidewinder.o). This -audio driver must be loaded prior to the joystick driver to enable the -DSP task image supporting the joystick device. - - -Limitations: - -SPDIF is currently not supported. - -Primary codec support only. No secondary codec support is implemented. - - - -NOTES: - -Hercules Game Theatre XP - the EGPIO2 pin controls the external Amp, -and has been tested. -Module parameter hercules_egpio_disable set to 1, will force a 0 to EGPIODR -to disable the external amplifier. - -VTB Santa Cruz - the GPIO7/GPIO8 on the Secondary Codec control -the external amplifier for the "back" speakers, since we do not -support the secondary codec then this external amp is not -turned on. The primary codec external amplifier is supported but -note that the AC97 EAPD bit is inverted logic (amp_voyetra()). - -DMA buffer size - there are issues with many of the Linux applications -concerning the optimal buffer size. Several applications request a -certain fragment size and number and then do not verify that the driver -has the ability to support the requested configuration. -SNDCTL_DSP_SETFRAGMENT ioctl is used to request a fragment size and -number of fragments. Some applications exit if an error is returned -on this particular ioctl. 
Therefore, in alignment with the other OSS audio -drivers, no error is returned when a SETFRAGs IOCTL is received, but the -values passed from the app are not used in any buffer calculation -(ossfragshift/ossmaxfrags are not used). -Use the "defaultorder=N" module parameter to change the buffer size if -you have an application that requires a specific number of fragments -or a specific buffer size (see below). - -Debug Interface ---------------- -There is an ioctl debug interface to allow runtime modification of the -debug print levels. This debug interface code can be disabled from the -compilation process with commenting the following define: -#define CSDEBUG_INTERFACE 1 -There is also a debug print methodolgy to select printf statements from -different areas of the driver. A debug print level is also used to allow -additional printfs to be active. Comment out the following line in the -driver to disable compilation of the CS_DBGOUT print statements: -#define CSDEBUG 1 - -Please see the definitions for cs_debuglevel and cs_debugmask for additional -information on the debug levels and sections. - -There is also a csdbg executable to allow runtime manipulation of these -parameters. for a copy email: twoller@crystal.cirrus.com - - - -MODULE_PARMS definitions ------------------------- -module_param(defaultorder, ulong, 0); -defaultorder=N -where N is a value from 1 to 12 -The buffer order determines the size of the dma buffer for the driver. -under Linux, a smaller buffer allows more responsiveness from many of the -applications (e.g. games). A larger buffer allows some of the apps (esound) -to not underrun the dma buffer as easily. As default, use 32k (order=3) -rather than 64k as some of the games work more responsively. 
-(2^N) * PAGE_SIZE = allocated buffer size - -module_param(cs_debuglevel, ulong, 0644); -module_param(cs_debugmask, ulong, 0644); -cs_debuglevel=N -cs_debugmask=0xMMMMMMMM -where N is a value from 0 (no debug printfs), to 9 (maximum) -0xMMMMMMMM is a debug mask corresponding to the CS_xxx bits (see driver source). - -module_param(hercules_egpio_disable, ulong, 0); -hercules_egpio_disable=N -where N is a 0 (enable egpio), or a 1 (disable egpio support) - -module_param(initdelay, ulong, 0); -initdelay=N -This value is used to determine the millescond delay during the initialization -code prior to powering up the PLL. On laptops this value can be used to -assist with errors on resume, mostly with IBM laptops. Basically, if the -system is booted under battery power then the mdelay()/udelay() functions fail to -properly delay the required time. Also, if the system is booted under AC power -and then the power removed, the mdelay()/udelay() functions will not delay properly. - -module_param(powerdown, ulong, 0); -powerdown=N -where N is 0 (disable any powerdown of the internal blocks) or 1 (enable powerdown) - - -module_param(external_amp, bool, 0); -external_amp=1 -if N is set to 1, then force enabling the EAPD support in the primary AC97 codec. -override the detection logic and force the external amp bit in the AC97 0x26 register -to be reset (0). EAPD should be 0 for powerup, and 1 for powerdown. The VTB Santa Cruz -card has inverted logic, so there is a special function for these cards. - -module_param(thinkpad, bool, 0); -thinkpad=1 -if N is set to 1, then force enabling the clkrun functionality. -Currently, when the part is being used, then clkrun is disabled for the entire system, -but re-enabled when the driver is released or there is no outstanding open count. 
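The buffer-size rule quoted in the removed cs46xx text, (2^N) * PAGE_SIZE, can be checked with a short calculation. This is only a sketch: it assumes a 4 KiB page (consistent with the "single 4k page" mentioned for the capture buffer, but not guaranteed on every platform), and the function name is made up for illustration.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def dma_buffer_bytes(defaultorder, page_size=PAGE_SIZE):
    """Return (2^N) * PAGE_SIZE for a given defaultorder=N."""
    if not 1 <= defaultorder <= 12:
        raise ValueError("defaultorder is documented as a value from 1 to 12")
    return (1 << defaultorder) * page_size


print(dma_buffer_bytes(2))  # 16384, the 16k figure quoted for defaultorder=2
print(dma_buffer_bytes(3))  # 32768, the 32k default quoted for order=3
```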
-
diff --git a/Documentation/spi/spi-lm70llp b/Documentation/spi/spi-lm70llp
new file mode 100644
index 000000000000..154bd02220b9
--- /dev/null
+++ b/Documentation/spi/spi-lm70llp
@@ -0,0 +1,69 @@
+spi_lm70llp : LM70-LLP parport-to-SPI adapter
+==============================================
+
+Supported board/chip:
+  * National Semiconductor LM70 LLP evaluation board
+    Datasheet: http://www.national.com/pf/LM/LM70.html
+
+Author:
+        Kaiwan N Billimoria <kaiwan@designergraphix.com>
+
+Description
+-----------
+This driver provides glue code connecting a National Semiconductor LM70 LLP
+temperature sensor evaluation board to the kernel's SPI core subsystem.
+
+In effect, this driver turns the parallel port interface on the eval board
+into a SPI bus with a single device, which will be driven by the generic
+LM70 driver (drivers/hwmon/lm70.c).
+
+The hardware interfacing on the LM70 LLP eval board is as follows:
+
+   Parallel                 LM70 LLP
+     Port      Direction   JP2 Header
+  -----------  ---------  ----------------
+     D0    2       -          -
+     D1    3      -->        V+    5
+     D2    4      -->        V+    5
+     D3    5      -->        V+    5
+     D4    6      -->        V+    5
+     D5    7      -->        nCS   8
+     D6    8      -->        SCLK  3
+     D7    9      -->        SI/O  5
+    GND   25       -         GND   7
+  Select  13      <--        SI/O  1
+  -----------  ---------  ----------------
+
+Note that since the LM70 uses a "3-wire" variant of SPI, the SI/SO pin
+is connected to both pin D7 (as Master Out) and Select (as Master In)
+using an arrangement that lets either the parport or the LM70 pull the
+pin low. This can't be shared with true SPI devices, but other 3-wire
+devices might share the same SI/SO pin.
+
+The bitbanger routine in this driver (lm70_txrx) is called back from
+the bound "hwmon/lm70" protocol driver through its sysfs hook, using a
+spi_write_then_read() call. It performs Mode 0 (SPI/Microwire) bitbanging.
+The lm70 driver then interprets the resulting digital temperature value
+and exports it through sysfs.
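As a rough illustration of what Mode 0 bitbanging involves, the sketch below shifts one word MSB-first, presenting each output bit while the clock is low and sampling the input after the rising edge. It is not the driver's lm70_txrx code: the function name, the callback names, and the loopback class are hypothetical stand-ins for the parport pin accessors.

```python
def txrx_mode0(word, nbits, set_mosi, set_sclk, get_miso):
    """Exchange nbits of word, MSB first, in SPI mode 0 (CPOL=0, CPHA=0)."""
    received = 0
    for shift in range(nbits - 1, -1, -1):
        set_mosi((word >> shift) & 1)  # drive the output bit while SCLK is low
        set_sclk(1)                    # rising edge: data is latched
        received = (received << 1) | get_miso()
        set_sclk(0)                    # return to the idle-low clock state
    return received


class Loopback:
    """Hypothetical test double: MISO simply mirrors MOSI."""

    def __init__(self):
        self.mosi = 0
        self.sclk = 0

    def set_mosi(self, bit):
        self.mosi = bit

    def set_sclk(self, level):
        self.sclk = level

    def get_miso(self):
        return self.mosi


wire = Loopback()
print(hex(txrx_mode0(0xA5, 8, wire.set_mosi, wire.set_sclk, wire.get_miso)))  # 0xa5
```

On the real board the input callback would also have to account for how the SI/O line reaches the parport, which the loopback above does not model.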
+ +A "gotcha": National Semiconductor's LM70 LLP eval board circuit schematic +shows that the SI/O line from the LM70 chip is connected to the base of a +transistor Q1 (and also a pullup, and a zener diode to D7); while the +collector is tied to VCC. + +Interpreting this circuit, when the LM70 SI/O line is High (or tristate +and not grounded by the host via D7), the transistor conducts and switches +the collector to zero, which is reflected on pin 13 of the DB25 parport +connector. When SI/O is Low (driven by the LM70 or the host) on the other +hand, the transistor is cut off and the voltage tied to it's collector is +reflected on pin 13 as a High level. + +So: the getmiso inline routine in this driver takes this fact into account, +inverting the value read at pin 13. + + +Thanks to +--------- +o David Brownell for mentoring the SPI-side driver development. +o Dr.Craig Hollabaugh for the (early) "manual" bitbanging driver version. +o Nadir Billimoria for help interpreting the circuit schematic. diff --git a/Documentation/spinlocks.txt b/Documentation/spinlocks.txt index a661d684768e..471e75389778 100644 --- a/Documentation/spinlocks.txt +++ b/Documentation/spinlocks.txt @@ -1,7 +1,12 @@ -UPDATE March 21 2005 Amit Gud <gud@eth.net> +SPIN_LOCK_UNLOCKED and RW_LOCK_UNLOCKED defeat lockdep state tracking and +are hence deprecated. -Macros SPIN_LOCK_UNLOCKED and RW_LOCK_UNLOCKED are deprecated and will be -removed soon. So for any new code dynamic initialization should be used: +Please use DEFINE_SPINLOCK()/DEFINE_RWLOCK() or +__SPIN_LOCK_UNLOCKED()/__RW_LOCK_UNLOCKED() as appropriate for static +initialization. + +Dynamic initialization, when necessary, may be performed as +demonstrated below. spinlock_t xxx_lock; rwlock_t xxx_rw_lock; @@ -15,12 +20,9 @@ removed soon. 
So for any new code dynamic initialization should be used: module_init(xxx_init); -Reasons for deprecation - - it hurts automatic lock validators - - it becomes intrusive for the realtime preemption patches - -Following discussion is still valid, however, with the dynamic initialization -of spinlocks instead of static. +The following discussion is still valid, however, with the dynamic +initialization of spinlocks or with DEFINE_SPINLOCK, etc., used +instead of SPIN_LOCK_UNLOCKED. ----------------------- diff --git a/Documentation/sysctl/ctl_unnumbered.txt b/Documentation/sysctl/ctl_unnumbered.txt new file mode 100644 index 000000000000..23003a8ea3e7 --- /dev/null +++ b/Documentation/sysctl/ctl_unnumbered.txt @@ -0,0 +1,22 @@ + +Except for a few extremely rare exceptions user space applications do not use +the binary sysctl interface. Instead everyone uses /proc/sys/... with +readable ascii names. + +Recently the kernel has started supporting setting the binary sysctl value to +CTL_UNNUMBERED so we no longer need to assign a binary sysctl path to allow +sysctls to show up in /proc/sys. + +Assigning binary sysctl numbers is an endless source of conflicts in sysctl.h, +breaking of the user space ABI (because of those conflicts), and maintenance +problems. A complete pass through all of the sysctl users revealed multiple +instances where the sysctl binary interface was broken and had gone undetected +for years. + +So please do not add new binary sysctl numbers. They are unneeded and +problematic. + +If you really need a new binary sysctl number please first merge your sysctl +into the kernel and then as a separate patch allocate a binary sysctl number. 
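For userspace, using the ascii names simply means reading files below /proc/sys. A minimal sketch, assuming the usual convention (as used by sysctl(8)) that dots in a sysctl name map to path separators; the helper names are illustrative only and names whose components themselves contain dots are not handled:

```python
import posixpath

PROC_SYS = "/proc/sys"


def sysctl_path(name, root=PROC_SYS):
    """Map a dotted sysctl name such as 'vm.panic_on_oom' to its /proc/sys path."""
    return posixpath.join(root, *name.split("."))


def read_sysctl(name, root=PROC_SYS):
    """Read a sysctl value as text through the /proc/sys interface."""
    with open(sysctl_path(name, root)) as f:
        return f.read().strip()


print(sysctl_path("vm.panic_on_oom"))  # /proc/sys/vm/panic_on_oom
```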
+ +(ebiederm@xmission.com, June 2007) diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt index 1d192565e182..a0ccc5b60260 100644 --- a/Documentation/sysctl/vm.txt +++ b/Documentation/sysctl/vm.txt @@ -31,12 +31,15 @@ Currently, these files are in /proc/sys/vm: - min_unmapped_ratio - min_slab_ratio - panic_on_oom +- mmap_min_address +- numa_zonelist_order ============================================================== dirty_ratio, dirty_background_ratio, dirty_expire_centisecs, dirty_writeback_centisecs, vfs_cache_pressure, laptop_mode, -block_dump, swap_token_timeout, drop-caches: +block_dump, swap_token_timeout, drop-caches, +hugepages_treat_as_movable: See Documentation/filesystems/proc.txt @@ -216,3 +219,61 @@ above-mentioned. The default value is 0. 1 and 2 are for failover of clustering. Please select either according to your policy of failover. + +============================================================== + +mmap_min_addr + +This file indicates the amount of address space which a user process will +be restricted from mmaping. Since kernel null dereference bugs could +accidentally operate based on the information in the first couple of pages +of memory userspace processes should not be allowed to write to them. By +default this value is set to 0 and no protections will be enforced by the +security module. Setting this value to something like 64k will allow the +vast majority of applications to work correctly and provide defense in depth +against future potential kernel bugs. + +============================================================== + +numa_zonelist_order + +This sysctl is only for NUMA. +'where the memory is allocated from' is controlled by zonelists. +(This documentation ignores ZONE_HIGHMEM/ZONE_DMA32 for simple explanation. + you may be able to read ZONE_DMA as ZONE_DMA32...) + +In non-NUMA case, a zonelist for GFP_KERNEL is ordered as following. 
+ZONE_NORMAL -> ZONE_DMA
+This means that a memory allocation request for GFP_KERNEL will
+get memory from ZONE_DMA only when ZONE_NORMAL is not available.
+
+In the NUMA case, there are two possible orderings.
+Assume a 2-node NUMA system; below is the zonelist for Node(0)'s GFP_KERNEL
+
+(A) Node(0) ZONE_NORMAL -> Node(0) ZONE_DMA -> Node(1) ZONE_NORMAL
+(B) Node(0) ZONE_NORMAL -> Node(1) ZONE_NORMAL -> Node(0) ZONE_DMA.
+
+Type(A) offers the best locality for processes on Node(0), but ZONE_DMA
+will be used before ZONE_NORMAL exhaustion. This increases the possibility of
+out-of-memory(OOM) of ZONE_DMA because ZONE_DMA tends to be small.
+
+Type(B) cannot offer the best locality but is more robust against OOM of
+the DMA zone.
+
+Type(A) is called "Node" order. Type (B) is "Zone" order.
+
+"Node order" orders the zonelists by node, then by zone within each node.
+Specify "[Nn]ode" for node order.
+
+"Zone Order" orders the zonelists by zone type, then by node within each
+zone. Specify "[Zz]one" for zone order.
+
+Specify "[Dd]efault" to request automatic configuration. Autoconfiguration
+will select "node" order in the following cases:
+(1) if the DMA zone does not exist or
+(2) if the DMA zone comprises greater than 50% of the available memory or
+(3) if any node's DMA zone comprises greater than 60% of its local memory and
+    the amount of local memory is big enough.
+
+Otherwise, "zone" order will be selected. Default order is recommended unless
+this is causing problems for your system/application.
diff --git a/Documentation/sysfs-rules.txt b/Documentation/sysfs-rules.txt
new file mode 100644
index 000000000000..42861bb0bc9b
--- /dev/null
+++ b/Documentation/sysfs-rules.txt
@@ -0,0 +1,166 @@
+Rules on how to access information in the Linux kernel sysfs
+
+The kernel exported sysfs exports internal kernel implementation-details
+and depends on internal kernel structures and layout.
It is agreed upon
+by the kernel developers that the Linux kernel does not provide a stable
+internal API. As sysfs is a direct export of kernel internal
+structures, the sysfs interface cannot provide a stable interface either;
+it may always change along with internal kernel changes.
+
+To minimize the risk of breaking users of sysfs, which are in most cases
+low-level userspace applications, with a new kernel release, the users
+of sysfs must follow some rules to use an as abstract-as-possible way to
+access this filesystem. The current udev and HAL programs already
+implement this and users are encouraged to plug, if possible, into the
+abstractions these programs provide instead of accessing sysfs
+directly.
+
+But if you really do want or need to access sysfs directly, please follow
+these rules and then your programs should work with future
+versions of the sysfs interface.
+
+- Do not use libsysfs
+  It makes assumptions about sysfs which are not true. Its API does not
+  offer any abstraction, it exposes all the kernel driver-core
+  implementation details in its own API. Therefore it is not better than
+  reading directories and opening the files yourself.
+  Also, it is not actively maintained, in the sense of reflecting the
+  current kernel-development. The goal of providing a stable interface
+  to sysfs has failed; it causes more problems than it solves. It
+  violates many of the rules in this document.
+
+- sysfs is always at /sys
+  Parsing /proc/mounts is a waste of time. Other mount points are a
+  system configuration bug you should not try to solve. For test cases,
+  possibly support a SYSFS_PATH environment variable to override the
+  application's behavior, but never try to search for sysfs. Never try
+  to mount it, if you are not an early boot script.
+
+- devices are only "devices"
+  There is no such thing as class-, bus-, physical devices,
+  interfaces, and such that you can rely on in userspace. Everything is
+  just simply a "device".
Class-, bus-, physical, ... types are just
+  kernel implementation details, which should not be expected by
+  applications that look for devices in sysfs.
+
+  The properties of a device are:
+    o devpath (/devices/pci0000:00/0000:00:1d.1/usb2/2-2/2-2:1.0)
+      - identical to the DEVPATH value in the event sent from the kernel
+        at device creation and removal
+      - the unique key to the device at that point in time
+      - the kernel's path to the device-directory without the leading
+        /sys, and always starting with a slash
+      - all elements of a devpath must be real directories. Symlinks
+        pointing to /sys/devices must always be resolved to their real
+        target, and the target path must be used to access the device.
+        That way the devpath to the device matches the devpath of the
+        kernel used at event time.
+      - using or exposing symlink values as elements in a devpath string
+        is a bug in the application
+
+    o kernel name (sda, tty, 0000:00:1f.2, ...)
+      - a directory name, identical to the last element of the devpath
+      - applications need to handle spaces and characters like '!' in
+        the name
+
+    o subsystem (block, tty, pci, ...)
+      - simple string, never a path or a link
+      - retrieved by reading the "subsystem"-link and using only the
+        last element of the target path
+
+    o driver (tg3, ata_piix, uhci_hcd)
+      - a simple string, which may contain spaces, never a path or a
+        link
+      - it is retrieved by reading the "driver"-link and using only the
+        last element of the target path
+      - devices which do not have a "driver"-link just do not have a
+        driver; copying the driver value in a child device context is a
+        bug in the application
+
+    o attributes
+      - the files in the device directory or files below subdirectories
+        of the same device directory
+      - accessing attributes reached by a symlink pointing to another device,
+        like the "device"-link, is a bug in the application
+
+  Everything else is just a kernel driver-core implementation detail
+  that should not be assumed to be stable across kernel releases.
+
+- Properties of parent devices never belong in a child device.
+  Always look at the parent devices themselves for determining device
+  context properties. If the device 'eth0' or 'sda' does not have a
+  "driver"-link, then this device does not have a driver. Its value is empty.
+  Never copy any property of the parent-device into a child-device. Parent
+  device-properties may change dynamically without any notice to the
+  child device.
+
+- Hierarchy in a single device-tree
+  There is only one valid place in sysfs where hierarchy can be examined
+  and this is below: /sys/devices.
+  It is planned that all device directories will end up in the tree
+  below this directory.
+
+- Classification by subsystem
+  There are currently three places for classification of devices:
+  /sys/block, /sys/class and /sys/bus. It is planned that these will
+  not contain any device-directories themselves, but only flat lists of
+  symlinks pointing to the unified /sys/devices tree.
+  All three places have completely different rules on how to access
+  device information.
It is planned to merge all three
+  classification-directories into one place at /sys/subsystem,
+  following the layout of the bus-directories. All buses and
+  classes, including the converted block-subsystem, will show up
+  there.
+  The devices belonging to a subsystem will create a symlink in the
+  "devices" directory at /sys/subsystem/<name>/devices.
+
+  If /sys/subsystem exists, /sys/bus, /sys/class and /sys/block can be
+  ignored. If it does not exist, you always have to scan all three
+  places, as the kernel is free to move a subsystem from one place to
+  the other, as long as the devices are still reachable by the same
+  subsystem name.
+
+  Assuming /sys/class/<subsystem> and /sys/bus/<subsystem>, or
+  /sys/block and /sys/class/block are not interchangeable, is a bug in
+  the application.
+
+- Block
+  The converted block-subsystem at /sys/class/block, or
+  /sys/subsystem/block will contain the links for disks and partitions
+  at the same level, never in a hierarchy. Assuming the block-subsystem to
+  contain only disks and not partition-devices in the same flat list is
+  a bug in the application.
+
+- "device"-link and <subsystem>:<kernel name>-links
+  Never depend on the "device"-link. The "device"-link is a workaround
+  for the old layout, where class-devices are not created in
+  /sys/devices/ like the bus-devices. If the link-resolving of a
+  device-directory does not end in /sys/devices/, you can use the
+  "device"-link to find the parent devices in /sys/devices/. That is the
+  single valid use of the "device"-link; it must never appear in any
+  path as an element. Assuming the existence of the "device"-link for
+  a device in /sys/devices/ is a bug in the application.
+  Accessing /sys/class/net/eth0/device is a bug in the application.
+
+  Never depend on the class-specific links back to the /sys/class
+  directory. These links are also a workaround for the design mistake
+  that class-devices are not created in /sys/devices.
+ If a device + directory does not contain directories for child devices, these links + may be used to find the child devices in /sys/class. That is the single + valid use of these links; they must never appear in any path as an + element. Assuming the existence of these links for devices which are + real child device directories in the /sys/devices tree, is a bug in + the application. + + It is planned to remove all these links when all class-device + directories live in /sys/devices. + +- Position of devices along device chain can change. + Never depend on a specific parent device position in the devpath, + or the chain of parent devices. The kernel is free to insert devices into + the chain. You must always request the parent device you are looking for + by its subsystem value. You need to walk up the chain until you find + the device that matches the expected subsystem. Depending on a specific + position of a parent device, or exposing relative paths, using "../" to + access the chain of parents, is a bug in the application. + diff --git a/Documentation/thinkpad-acpi.txt b/Documentation/thinkpad-acpi.txt index 2d4803359a04..9e6b94face4b 100644 --- a/Documentation/thinkpad-acpi.txt +++ b/Documentation/thinkpad-acpi.txt @@ -138,7 +138,7 @@ Hot keys -------- procfs: /proc/acpi/ibm/hotkey -sysfs device attribute: hotkey/* +sysfs device attribute: hotkey_* Without this driver, only the Fn-F4 key (sleep button) generates an ACPI event. With the driver loaded, the hotkey feature enabled and the @@ -196,10 +196,7 @@ The following commands can be written to the /proc/acpi/ibm/hotkey file: sysfs notes: - The hot keys attributes are in a hotkey/ subdirectory off the - thinkpad device. - - bios_enabled: + hotkey_bios_enabled: Returns the status of the hot keys feature when thinkpad-acpi was loaded. Upon module unload, the hot key feature status will be restored to this value.
@@ -207,19 +204,19 @@ sysfs notes: 0: hot keys were disabled 1: hot keys were enabled - bios_mask: + hotkey_bios_mask: Returns the hot keys mask when thinkpad-acpi was loaded. Upon module unload, the hot keys mask will be restored to this value. - enable: + hotkey_enable: Enables/disables the hot keys feature, and reports current status of the hot keys feature. 0: disables the hot keys feature / feature disabled 1: enables the hot keys feature / feature enabled - mask: + hotkey_mask: bit mask to enable ACPI event generation for each hot key (see above). Returns the current status of the hot keys mask, and allows one to modify it. @@ -229,7 +226,7 @@ Bluetooth --------- procfs: /proc/acpi/ibm/bluetooth -sysfs device attribute: bluetooth/enable +sysfs device attribute: bluetooth_enable This feature shows the presence and current state of a ThinkPad Bluetooth device in the internal ThinkPad CDC slot. @@ -244,7 +241,7 @@ If Bluetooth is installed, the following commands can be used: Sysfs notes: If the Bluetooth CDC card is installed, it can be enabled / - disabled through the "bluetooth/enable" thinkpad-acpi device + disabled through the "bluetooth_enable" thinkpad-acpi device attribute, and its current status can also be queried. enable: @@ -252,7 +249,7 @@ Sysfs notes: 1: enables Bluetooth / Bluetooth is enabled. Note: this interface will be probably be superseeded by the - generic rfkill class. + generic rfkill class, so it is NOT to be considered stable yet. Video output control -- /proc/acpi/ibm/video -------------------------------------------- @@ -898,7 +895,7 @@ EXPERIMENTAL: WAN ----------------- procfs: /proc/acpi/ibm/wan -sysfs device attribute: wwan/enable +sysfs device attribute: wwan_enable This feature is marked EXPERIMENTAL because the implementation directly accesses hardware registers and may not work as expected. 
USE @@ -921,7 +918,7 @@ If the W-WAN card is installed, the following commands can be used: Sysfs notes: If the W-WAN card is installed, it can be enabled / - disabled through the "wwan/enable" thinkpad-acpi device + disabled through the "wwan_enable" thinkpad-acpi device attribute, and its current status can also be queried. enable: @@ -929,7 +926,7 @@ Sysfs notes: 1: enables WWAN card / WWAN card is enabled. Note: this interface will be probably be superseeded by the - generic rfkill class. + generic rfkill class, so it is NOT to be considered stable yet. Multiple Commands, Module Parameters ------------------------------------ diff --git a/Documentation/time_interpolators.txt b/Documentation/time_interpolators.txt deleted file mode 100644 index e3b60854fbc2..000000000000 --- a/Documentation/time_interpolators.txt +++ /dev/null @@ -1,41 +0,0 @@ -Time Interpolators ------------------- - -Time interpolators are a base of time calculation between timer ticks and -allow an accurate determination of time down to the accuracy of the time -source in nanoseconds. - -The architecture specific code typically provides gettimeofday and -settimeofday under Linux. The time interpolator provides both if an arch -defines CONFIG_TIME_INTERPOLATION. The arch still must set up timer tick -operations and call the necessary functions to advance the clock. - -With the time interpolator a standardized interface exists for time -interpolation between ticks. The provided logic is highly scalable -and has been tested in SMP situations of up to 512 CPUs. - -If CONFIG_TIME_INTERPOLATION is defined then the architecture specific code -(or the device drivers - like HPET) may register time interpolators. 
-These are typically defined in the following way: - -static struct time_interpolator my_interpolator { - .frequency = MY_FREQUENCY, - .source = TIME_SOURCE_MMIO32, - .shift = 8, /* scaling for higher accuracy */ - .drift = -1, /* Unknown drift */ - .jitter = 0 /* time source is stable */ -}; - -void time_init(void) -{ - .... - /* Initialization of the timer *. - my_interpolator.address = &my_timer; - register_time_interpolator(&my_interpolator); - .... -} - -For more details see include/linux/timex.h and kernel/timer.c. - -Christoph Lameter <christoph@lameter.com>, October 31, 2004 - diff --git a/Documentation/usb/dma.txt b/Documentation/usb/dma.txt index 62844aeba69c..e8b50b7de9d9 100644 --- a/Documentation/usb/dma.txt +++ b/Documentation/usb/dma.txt @@ -32,12 +32,15 @@ ELIMINATING COPIES It's good to avoid making CPUs copy data needlessly. The costs can add up, and effects like cache-trashing can impose subtle penalties. -- When you're allocating a buffer for DMA purposes anyway, use the buffer - primitives. Think of them as kmalloc and kfree that give you the right - kind of addresses to store in urb->transfer_buffer and urb->transfer_dma, - while guaranteeing that no hidden copies through DMA "bounce" buffers will - slow things down. You'd also set URB_NO_TRANSFER_DMA_MAP in - urb->transfer_flags: +- If you're doing lots of small data transfers from the same buffer all + the time, that can really burn up resources on systems which use an + IOMMU to manage the DMA mappings. It can cost MUCH more to set up and + tear down the IOMMU mappings with each request than perform the I/O! + + For those specific cases, USB has primitives to allocate less expensive + memory. They work like kmalloc and kfree versions that give you the right + kind of addresses to store in urb->transfer_buffer and urb->transfer_dma. 
+ You'd also set URB_NO_TRANSFER_DMA_MAP in urb->transfer_flags: void *usb_buffer_alloc (struct usb_device *dev, size_t size, int mem_flags, dma_addr_t *dma); @@ -45,6 +48,10 @@ and effects like cache-trashing can impose subtle penalties. void usb_buffer_free (struct usb_device *dev, size_t size, void *addr, dma_addr_t dma); + Most drivers should *NOT* be using these primitives; they don't need + to use this type of memory ("dma-coherent"), and memory returned from + kmalloc() will work just fine. + For control transfers you can use the buffer primitives or not for each of the transfer buffer and setup buffer independently. Set the flag bits URB_NO_TRANSFER_DMA_MAP and URB_NO_SETUP_DMA_MAP to indicate which @@ -54,29 +61,39 @@ and effects like cache-trashing can impose subtle penalties. The memory buffer returned is "dma-coherent"; sometimes you might need to force a consistent memory access ordering by using memory barriers. It's not using a streaming DMA mapping, so it's good for small transfers on - systems where the I/O would otherwise tie up an IOMMU mapping. (See + systems where the I/O would otherwise thrash an IOMMU mapping. (See Documentation/DMA-mapping.txt for definitions of "coherent" and "streaming" DMA mappings.) Asking for 1/Nth of a page (as well as asking for N pages) is reasonably space-efficient. + On most systems the memory returned will be uncached, because the + semantics of dma-coherent memory require either bypassing CPU caches + or using cache hardware with bus-snooping support. While x86 hardware + has such bus-snooping, many other systems use software to flush cache + lines to prevent DMA conflicts. + - Devices on some EHCI controllers could handle DMA to/from high memory. 
- Driver probe() routines can notice this using a generic DMA call, then - tell higher level code (network, scsi, etc) about it like this: - if (dma_supported (&intf->dev, 0xffffffffffffffffULL)) - net->features |= NETIF_F_HIGHDMA; + Unfortunately, the current Linux DMA infrastructure doesn't have a sane + way to expose these capabilities ... and in any case, HIGHMEM is mostly a + design wart specific to x86_32. So your best bet is to ensure you never + pass a highmem buffer into a USB driver. That's easy; it's the default + behavior. Just don't override it; e.g. with NETIF_F_HIGHDMA. - That can eliminate dma bounce buffering of requests that originate (or - terminate) in high memory, in cases where the buffers aren't allocated - with usb_buffer_alloc() but instead are dma-mapped. + This may force your callers to do some bounce buffering, copying from + high memory to "normal" DMA memory. If you can come up with a good way + to fix this issue (for x86_32 machines with over 1 GByte of memory), + feel free to submit patches. WORKING WITH EXISTING BUFFERS Existing buffers aren't usable for DMA without first being mapped into the -DMA address space of the device. +DMA address space of the device. However, most buffers passed to your +driver can safely be used with such DMA mapping. (See the first section +of DMA-mapping.txt, titled "What memory is DMA-able?") - When you're using scatterlists, you can map everything at once. On some systems, this kicks in an IOMMU and turns the scatterlists into single @@ -114,3 +131,8 @@ DMA address space of the device. The calls manage urb->transfer_dma for you, and set URB_NO_TRANSFER_DMA_MAP so that usbcore won't map or unmap the buffer. The same goes for urb->setup_dma and URB_NO_SETUP_DMA_MAP for control requests. + +Note that several of those interfaces are currently commented out, since +they don't have current users. See the source code. 
Other than the dmasync +calls (where the underlying DMA primitives have changed), most of them can +easily be commented back in if you want to use them. diff --git a/Documentation/usb/persist.txt b/Documentation/usb/persist.txt new file mode 100644 index 000000000000..df54d645cbb5 --- /dev/null +++ b/Documentation/usb/persist.txt @@ -0,0 +1,156 @@ + USB device persistence during system suspend + + Alan Stern <stern@rowland.harvard.edu> + + September 2, 2006 (Updated May 29, 2007) + + + What is the problem? + +According to the USB specification, when a USB bus is suspended the +bus must continue to supply suspend current (around 1-5 mA). This +is so that devices can maintain their internal state and hubs can +detect connect-change events (devices being plugged in or unplugged). +The technical term is "power session". + +If a USB device's power session is interrupted then the system is +required to behave as though the device has been unplugged. It's a +conservative approach; in the absence of suspend current the computer +has no way to know what has actually happened. Perhaps the same +device is still attached or perhaps it was removed and a different +device plugged into the port. The system must assume the worst. + +By default, Linux behaves according to the spec. If a USB host +controller loses power during a system suspend, then when the system +wakes up all the devices attached to that controller are treated as +though they had disconnected. This is always safe and it is the +"officially correct" thing to do. + +For many sorts of devices this behavior doesn't matter in the least. +If the kernel wants to believe that your USB keyboard was unplugged +while the system was asleep and a new keyboard was plugged in when the +system woke up, who cares? It'll still work the same when you type on +it. + +Unfortunately problems _can_ arise, particularly with mass-storage +devices. 
The effect is exactly the same as if the device really had +been unplugged while the system was suspended. If you had a mounted +filesystem on the device, you're out of luck -- everything in that +filesystem is now inaccessible. This is especially annoying if your +root filesystem was located on the device, since your system will +instantly crash. + +Loss of power isn't the only mechanism to worry about. Anything that +interrupts a power session will have the same effect. For example, +even though suspend current may have been maintained while the system +was asleep, on many systems during the initial stages of wakeup the +firmware (i.e., the BIOS) resets the motherboard's USB host +controllers. Result: all the power sessions are destroyed and again +it's as though you had unplugged all the USB devices. Yes, it's +entirely the BIOS's fault, but that doesn't do _you_ any good unless +you can convince the BIOS supplier to fix the problem (lots of luck!). + +On many systems the USB host controllers will get reset after a +suspend-to-RAM. On almost all systems, no suspend current is +available during hibernation (also known as swsusp or suspend-to-disk). +You can check the kernel log after resuming to see if either of these +has happened; look for lines saying "root hub lost power or was reset". + +In practice, people are forced to unmount any filesystems on a USB +device before suspending. If the root filesystem is on a USB device, +the system can't be suspended at all. (All right, it _can_ be +suspended -- but it will crash as soon as it wakes up, which isn't +much better.) + + + What is the solution? + +Setting CONFIG_USB_PERSIST will cause the kernel to work around these +issues. It enables a mode in which the core USB device data +structures are allowed to persist across a power-session disruption. +It works like this. 
If the kernel sees that a USB host controller is +not in the expected state during resume (i.e., if the controller was +reset or otherwise had lost power) then it applies a persistence check +to each of the USB devices below that controller for which the +"persist" attribute is set. It doesn't try to resume the device; that +can't work once the power session is gone. Instead it issues a USB +port reset and then re-enumerates the device. (This is exactly the +same thing that happens whenever a USB device is reset.) If the +re-enumeration shows that the device now attached to that port has the +same descriptors as before, including the Vendor and Product IDs, then +the kernel continues to use the same device structure. In effect, the +kernel treats the device as though it had merely been reset instead of +unplugged. + +If no device is now attached to the port, or if the descriptors are +different from what the kernel remembers, then the treatment is what +you would expect. The kernel destroys the old device structure and +behaves as though the old device had been unplugged and a new device +plugged in, just as it would without the CONFIG_USB_PERSIST option. + +The end result is that the USB device remains available and usable. +Filesystem mounts and memory mappings are unaffected, and the world is +now a good and happy place. + +Note that even when CONFIG_USB_PERSIST is set, the "persist" feature +will be applied only to those devices for which it is enabled. You +can enable the feature by doing (as root): + + echo 1 >/sys/bus/usb/devices/.../power/persist + +where the "..." should be filled in with the device's ID. Disable +the feature by writing 0 instead of 1. For hubs the feature is +automatically and permanently enabled, so you only have to worry about +setting it for devices where it really matters. + + + Is this the best solution? + +Perhaps not.
Arguably, keeping track of mounted filesystems and +memory mappings across device disconnects should be handled by a +centralized Logical Volume Manager. Such a solution would allow you +to plug in a USB flash device, create a persistent volume associated +with it, unplug the flash device, plug it back in later, and still +have the same persistent volume associated with the device. As such +it would be more far-reaching than CONFIG_USB_PERSIST. + +On the other hand, writing a persistent volume manager would be a big +job and using it would require significant input from the user. This +solution is much quicker and easier -- and it exists now, a giant +point in its favor! + +Furthermore, the USB_PERSIST option applies to _all_ USB devices, not +just mass-storage devices. It might turn out to be equally useful for +other device types, such as network interfaces. + + + WARNING: Using CONFIG_USB_PERSIST can be dangerous!! + +When recovering an interrupted power session the kernel does its best +to make sure the USB device hasn't been changed; that is, the same +device is still plugged into the port as before. But the checks +aren't guaranteed to be 100% accurate. + +If you replace one USB device with another of the same type (same +manufacturer, same IDs, and so on) there's an excellent chance the +kernel won't detect the change. Serial numbers and other strings are +not compared. In many cases it wouldn't help if they were, because +manufacturers frequently omit serial numbers entirely in their +devices. + +Furthermore it's quite possible to leave a USB device exactly the same +while changing its media. If you replace the flash memory card in a +USB card reader while the system is asleep, the kernel will have no +way to know you did it. The kernel will assume that nothing has +happened and will continue to use the partition tables, inodes, and +memory mappings for the old card. 
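The limitation described in this warning can be illustrated with a small C sketch. It is not the kernel's code: the struct, its fields and the function name are hypothetical, and the comparison is reduced to the Vendor and Product IDs mentioned in the text. It only shows why a check that ignores serial numbers cannot tell two devices of the same type apart.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical, trimmed-down record of what a device reported when it
 * was enumerated. Real descriptors carry many more fields.
 */
struct dev_snapshot {
	uint16_t id_vendor;
	uint16_t id_product;
	char serial[32];	/* present only to show that it is ignored */
};

/*
 * Illustrative persistence check (an assumption, not the kernel's
 * implementation): the IDs are compared, the serial string is not.
 */
static bool looks_like_same_device(const struct dev_snapshot *before,
				   const struct dev_snapshot *after)
{
	return before->id_vendor == after->id_vendor &&
	       before->id_product == after->id_product;
}
```

Two snapshots that differ only in their serial string pass this check, which is exactly the case where a swapped device of the same model goes unnoticed.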
+ +If the kernel gets fooled in this way, it's almost certain to cause +data corruption and to crash your system. You'll have no one to blame +but yourself. + +YOU HAVE BEEN WARNED! USE AT YOUR OWN RISK! + +That having been said, most of the time there shouldn't be any trouble +at all. The "persist" feature can be extremely useful. Make the most +of it. diff --git a/Documentation/video4linux/CARDLIST.bttv b/Documentation/video4linux/CARDLIST.bttv index b60639130a51..177159c5f4c4 100644 --- a/Documentation/video4linux/CARDLIST.bttv +++ b/Documentation/video4linux/CARDLIST.bttv @@ -66,7 +66,7 @@ 65 -> Lifeview FlyVideo 2000S LR90 66 -> Terratec TValueRadio [153b:1135,153b:ff3b] 67 -> IODATA GV-BCTV4/PCI [10fc:4050] - 68 -> 3Dfx VoodooTV FM (Euro), VoodooTV 200 (USA) [121a:3000,10b4:2637] + 68 -> 3Dfx VoodooTV FM (Euro) [10b4:2637] 69 -> Active Imaging AIMMS 70 -> Prolink Pixelview PV-BT878P+ (Rev.4C,8E) 71 -> Lifeview FlyVideo 98EZ (capture only) LR51 [1851:1851] @@ -145,3 +145,5 @@ 144 -> MagicTV 145 -> SSAI Security Video Interface [4149:5353] 146 -> SSAI Ultrasound Video Interface [414a:5353] +147 -> VoodooTV 200 (USA) [121a:3000] +148 -> DViCO FusionHDTV 2 [dbc0:d200] diff --git a/Documentation/video4linux/CARDLIST.cx88 b/Documentation/video4linux/CARDLIST.cx88 index 60f838beb9c8..82ac8250e978 100644 --- a/Documentation/video4linux/CARDLIST.cx88 +++ b/Documentation/video4linux/CARDLIST.cx88 @@ -55,3 +55,4 @@ 54 -> Norwood Micro TV Tuner 55 -> Shenzhen Tungsten Ages Tech TE-DTV-250 / Swann OEM [c180:c980] 56 -> Hauppauge WinTV-HVR1300 DVB-T/Hybrid MPEG Encoder [0070:9600,0070:9601,0070:9602] + 57 -> ADS Tech Instant Video PCI [1421:0390] diff --git a/Documentation/video4linux/CARDLIST.saa7134 b/Documentation/video4linux/CARDLIST.saa7134 index 712e8c8333cc..3f8aeab50a10 100644 --- a/Documentation/video4linux/CARDLIST.saa7134 +++ b/Documentation/video4linux/CARDLIST.saa7134 @@ -114,3 +114,4 @@ 113 -> Elitegroup ECS TVP3XP FM1246 Tuner Card (PAL,FM) [1019:4cb6] 114 
-> KWorld DVB-T 210 [17de:7250] 115 -> Sabrent PCMCIA TV-PCB05 [0919:2003] +116 -> 10MOONS TM300 TV Card [1131:2304] diff --git a/Documentation/video4linux/CARDLIST.tuner b/Documentation/video4linux/CARDLIST.tuner index 44134f04b82a..a88c02d23805 100644 --- a/Documentation/video4linux/CARDLIST.tuner +++ b/Documentation/video4linux/CARDLIST.tuner @@ -40,7 +40,7 @@ tuner=38 - Philips PAL/SECAM multi (FM1216ME MK3) tuner=39 - LG NTSC (newer TAPC series) tuner=40 - HITACHI V7-J180AT tuner=41 - Philips PAL_MK (FI1216 MK) -tuner=42 - Philips 1236D ATSC/NTSC dual in +tuner=42 - Philips FCV1236D ATSC/NTSC dual in tuner=43 - Philips NTSC MK3 (FM1236MK3 or FM1236/F) tuner=44 - Philips 4 in 1 (ATI TV Wonder Pro/Conexant) tuner=45 - Microtune 4049 FM5 @@ -72,3 +72,4 @@ tuner=70 - Samsung TCPN 2121P30A tuner=71 - Xceive xc3028 tuner=72 - Thomson FE6600 tuner=73 - Samsung TCPG 6121P30A +tuner=75 - Philips TEA5761 FM Radio diff --git a/Documentation/video4linux/sn9c102.txt b/Documentation/video4linux/sn9c102.txt index 279717c96f63..1ffad19ce891 100644 --- a/Documentation/video4linux/sn9c102.txt +++ b/Documentation/video4linux/sn9c102.txt @@ -436,7 +436,7 @@ HV7131D Hynix Semiconductor | Yes No No No HV7131R Hynix Semiconductor | No Yes Yes Yes MI-0343 Micron Technology | Yes No No No MI-0360 Micron Technology | No Yes Yes Yes -OV7630 OmniVision Technologies | Yes Yes No No +OV7630 OmniVision Technologies | Yes Yes Yes Yes OV7660 OmniVision Technologies | No No Yes Yes PAS106B PixArt Imaging | Yes No No No PAS202B PixArt Imaging | Yes Yes No No @@ -583,6 +583,7 @@ order): - Bertrik Sikken, who reverse-engineered and documented the Huffman compression algorithm used in the SN9C101, SN9C102 and SN9C103 controllers and implemented the first decoder; +- Ronny Standke for the donation of a webcam; - Mizuno Takafumi for the donation of a webcam; - an "anonymous" donator (who didn't want his name to be revealed) for the donation of a webcam. 
diff --git a/Documentation/video4linux/zr364xx.txt b/Documentation/video4linux/zr364xx.txt index c76992d0ff4d..4d9a0c33f2fd 100644 --- a/Documentation/video4linux/zr364xx.txt +++ b/Documentation/video4linux/zr364xx.txt @@ -62,4 +62,4 @@ Vendor Product Distributor Model 0x0784 0x0040 Traveler Slimline X5 0x06d6 0x0034 Trust Powerc@m 750 0x0a17 0x0062 Pentax Optio 50L - +0x06d6 0x003b Trust Powerc@m 970Z diff --git a/Documentation/vm/hugetlbpage.txt b/Documentation/vm/hugetlbpage.txt index 687104bfd09a..51ccc48aa763 100644 --- a/Documentation/vm/hugetlbpage.txt +++ b/Documentation/vm/hugetlbpage.txt @@ -77,8 +77,9 @@ If the user applications are going to request hugepages using mmap system call, then it is required that system administrator mount a file system of type hugetlbfs: - mount none /mnt/huge -t hugetlbfs <uid=value> <gid=value> <mode=value> - <size=value> <nr_inodes=value> + mount -t hugetlbfs \ + -o uid=<value>,gid=<value>,mode=<value>,size=<value>,nr_inodes=<value> \ + none /mnt/huge This command mounts a (pseudo) filesystem of type hugetlbfs on the directory /mnt/huge. Any files created on /mnt/huge uses hugepages. The uid and gid @@ -88,11 +89,10 @@ mode of root of file system to value & 0777. This value is given in octal. By default the value 0755 is picked. The size option sets the maximum value of memory (huge pages) allowed for that filesystem (/mnt/huge). The size is rounded down to HPAGE_SIZE. The option nr_inodes sets the maximum number of -inodes that /mnt/huge can use. If the size or nr_inodes options are not +inodes that /mnt/huge can use. If the size or nr_inodes option is not provided on command line then no limits are set. For size and nr_inodes options, you can use [G|g]/[M|m]/[K|k] to represent giga/mega/kilo. For -example, size=2K has the same meaning as size=2048. An example is given at -the end of this document. +example, size=2K has the same meaning as size=2048. 
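The suffix and rounding rules for the size and nr_inodes options described above can be sketched in C. This is a minimal illustration, not the kernel's parser: parse_suffixed() and round_down_hpage() are hypothetical helper names, overflow is not checked, and the huge page size is passed in by the caller because it depends on the architecture.

```c
#include <ctype.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a decimal number with an optional
 * K/M/G suffix (either case), so that "2K" yields 2048.
 * Returns 0 and stores the value in *out on success, -1 on
 * malformed input. Overflow is not checked in this sketch.
 */
static int parse_suffixed(const char *s, unsigned long long *out)
{
	char *end;
	unsigned long long val;

	if (!isdigit((unsigned char)*s))
		return -1;
	val = strtoull(s, &end, 10);
	switch (*end) {
	case 'G': case 'g':
		val <<= 10;
		/* fall through */
	case 'M': case 'm':
		val <<= 10;
		/* fall through */
	case 'K': case 'k':
		val <<= 10;
		end++;
		break;
	default:
		break;
	}
	if (*end != '\0')
		return -1;	/* trailing garbage after the number or suffix */
	*out = val;
	return 0;
}

/* Round a byte count down to a multiple of the huge page size. */
static unsigned long long round_down_hpage(unsigned long long size,
					   unsigned long long hpage_size)
{
	return size - (size % hpage_size);
}
```

For instance, parse_suffixed("2K", &v) stores 2048 in v, and with an assumed 2 MB huge page size a request of 5 MB would be rounded down to 4 MB.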
read and write system calls are not supported on files that reside on hugetlb file systems. diff --git a/Documentation/vm/slub.txt b/Documentation/vm/slub.txt index 1523320abd87..d17f324db9f5 100644 --- a/Documentation/vm/slub.txt +++ b/Documentation/vm/slub.txt @@ -41,6 +41,8 @@ Possible debug options are P Poisoning (object and padding) U User tracking (free and alloc) T Trace (please only use on single slabs) + - Switch all debugging off (useful if the kernel is + configured with CONFIG_SLUB_DEBUG_ON) F.e. in order to boot just with sanity checks and red zoning one would specify: @@ -125,13 +127,20 @@ SLUB Debug output Here is a sample of slub debug output: -*** SLUB kmalloc-8: Redzone Active@0xc90f6d20 slab 0xc528c530 offset=3360 flags=0x400000c3 inuse=61 freelist=0xc90f6d58 - Bytes b4 0xc90f6d10: 00 00 00 00 00 00 00 00 5a 5a 5a 5a 5a 5a 5a 5a ........ZZZZZZZZ - Object 0xc90f6d20: 31 30 31 39 2e 30 30 35 1019.005 - Redzone 0xc90f6d28: 00 cc cc cc . -FreePointer 0xc90f6d2c -> 0xc90f6d58 -Last alloc: get_modalias+0x61/0xf5 jiffies_ago=53 cpu=1 pid=554 -Filler 0xc90f6d50: 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZ +==================================================================== +BUG kmalloc-8: Redzone overwritten +-------------------------------------------------------------------- + +INFO: 0xc90f6d28-0xc90f6d2b. First byte 0x00 instead of 0xcc +INFO: Slab 0xc528c530 flags=0x400000c3 inuse=61 fp=0xc90f6d58 +INFO: Object 0xc90f6d20 @offset=3360 fp=0xc90f6d58 +INFO: Allocated in get_modalias+0x61/0xf5 age=53 cpu=1 pid=554 + +Bytes b4 0xc90f6d10: 00 00 00 00 00 00 00 00 5a 5a 5a 5a 5a 5a 5a 5a ........ZZZZZZZZ + Object 0xc90f6d20: 31 30 31 39 2e 30 30 35 1019.005 + Redzone 0xc90f6d28: 00 cc cc cc . 
+ Padding 0xc90f6d50: 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZ + [<c010523d>] dump_trace+0x63/0x1eb [<c01053df>] show_trace_log_lvl+0x1a/0x2f [<c010601d>] show_trace+0x12/0x14 @@ -153,74 +162,108 @@ Filler 0xc90f6d50: 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZ [<c0104112>] sysenter_past_esp+0x5f/0x99 [<b7f7b410>] 0xb7f7b410 ======================= -@@@ SLUB kmalloc-8: Restoring redzone (0xcc) from 0xc90f6d28-0xc90f6d2b +FIX kmalloc-8: Restoring Redzone 0xc90f6d28-0xc90f6d2b=0xcc +If SLUB encounters a corrupted object (full detection requires the kernel +to be booted with slub_debug) then the following output will be dumped +into the syslog: -If SLUB encounters a corrupted object then it will perform the following -actions: - -1. Isolation and report of the issue +1. Description of the problem encountered This will be a message in the system log starting with -*** SLUB <slab cache affected>: <What went wrong>@<object address> -offset=<offset of object into slab> flags=<slabflags> -inuse=<objects in use in this slab> freelist=<first free object in slab> +=============================================== +BUG <slab cache affected>: <What went wrong> +----------------------------------------------- -2. Report on how the problem was dealt with in order to ensure the continued -operation of the system. +INFO: <corruption start>-<corruption_end> <more info> +INFO: Slab <address> <slab information> +INFO: Object <address> <object information> +INFO: Allocated in <kernel function> age=<jiffies since alloc> cpu=<allocated by + cpu> pid=<pid of the process> +INFO: Freed in <kernel function> age=<jiffies since free> cpu=<freed by cpu> + pid=<pid of the process> -These are messages in the system log beginning with - -@@@ SLUB <slab cache affected>: <corrective action taken> +(Object allocation / free information is only available if SLAB_STORE_USER is +set for the slab. slub_debug sets that option) +2. The object contents if an object was involved. 
-In the above sample SLUB found that the Redzone of an active object has -been overwritten. Here a string of 8 characters was written into a slab that -has the length of 8 characters. However, a 8 character string needs a -terminating 0. That zero has overwritten the first byte of the Redzone field. -After reporting the details of the issue encountered the @@@ SLUB message -tell us that SLUB has restored the redzone to its proper value and then -system operations continue. - -Various types of lines can follow the @@@ SLUB line: +Various types of lines can follow the BUG SLUB line: Bytes b4 <address> : <bytes> - Show a few bytes before the object where the problem was detected. + Shows a few bytes before the object where the problem was detected. Can be useful if the corruption does not stop with the start of the object. Object <address> : <bytes> The bytes of the object. If the object is inactive then the bytes - typically contain poisoning values. Any non-poison value shows a + typically contain poison values. Any non-poison value shows a corruption by a write after free. Redzone <address> : <bytes> - The redzone following the object. The redzone is used to detect + The Redzone following the object. The Redzone is used to detect writes after the object. All bytes should always have the same value. If there is any deviation then it is due to a write after the object boundary. -Freepointer - The pointer to the next free object in the slab. May become - corrupted if overwriting continues after the red zone. - -Last alloc: -Last free: - Shows the address from which the object was allocated/freed last. - We note the pid, the time and the CPU that did so. This is usually - the most useful information to figure out where things went wrong. - Here get_modalias() did an kmalloc(8) instead of a kmalloc(9). + (Redzone information is only available if SLAB_RED_ZONE is set. 
+ slub_debug sets that option) + -Filler <address> : <bytes> +Padding <address> : <bytes> Unused data to fill up the space in order to get the next object properly aligned. In the debug case we make sure that there are - at least 4 bytes of filler. This allow for the detection of writes + at least 4 bytes of padding. This allows the detection of writes before the object. -Following the filler will be a stackdump. That stackdump describes the -location where the error was detected. The cause of the corruption is more -likely to be found by looking at the information about the last alloc / free. +3. A stackdump + +The stackdump describes the location where the error was detected. The cause +of the corruption may more likely be found by looking at the function that +allocated or freed the object. + +4. Report on how the problem was dealt with in order to ensure the continued +operation of the system. + +These are messages in the system log beginning with + +FIX <slab cache affected>: <corrective action taken> + +In the above sample SLUB found that the Redzone of an active object has +been overwritten. Here a string of 8 characters was written into a slab that +has the length of 8 characters. However, an 8 character string needs a +terminating 0. That zero has overwritten the first byte of the Redzone field. +After reporting the details of the issue encountered the FIX SLUB message +tells us that SLUB has restored the Redzone to its proper value and then +system operations continue. + +Emergency operations: +--------------------- + +Minimal debugging (sanity checks alone) can be enabled by booting with + + slub_debug=F + +This will generally be enough to enable the resiliency features of slub +which will keep the system running even if a bad kernel component keeps +corrupting objects. This may be important for production systems.
+Performance will be impacted by the sanity checks and there will be a +continual stream of error messages to the syslog but no additional memory +will be used (unlike full debugging). + +No guarantees. The kernel component still needs to be fixed. Performance +may be optimized further by locating the slab that experiences corruption +and enabling debugging only for that cache + +I.e. + + slub_debug=F,dentry + +If the corruption occurs by writing after the end of the object then it +may be advisable to enable a Redzone to avoid corrupting the beginning +of other objects. + + slub_debug=FZ,dentry -Christoph Lameter, <clameter@sgi.com>, May 23, 2007 +Christoph Lameter, <clameter@sgi.com>, May 30, 2007 diff --git a/Documentation/volatile-considered-harmful.txt b/Documentation/volatile-considered-harmful.txt new file mode 100644 index 000000000000..10c2e411cca8 --- /dev/null +++ b/Documentation/volatile-considered-harmful.txt @@ -0,0 +1,119 @@ +Why the "volatile" type class should not be used +------------------------------------------------ + +C programmers have often taken volatile to mean that the variable could be +changed outside of the current thread of execution; as a result, they are +sometimes tempted to use it in kernel code when shared data structures are +being used. In other words, they have been known to treat volatile types +as a sort of easy atomic variable, which they are not. The use of volatile in +kernel code is almost never correct; this document describes why. + +The key point to understand with regard to volatile is that its purpose is +to suppress optimization, which is almost never what one really wants to +do. In the kernel, one must protect shared data structures against +unwanted concurrent access, which is very much a different task. The +process of protecting against unwanted concurrency will also avoid almost +all optimization-related problems in a more efficient way. 
+ +Like volatile, the kernel primitives which make concurrent access to data +safe (spinlocks, mutexes, memory barriers, etc.) are designed to prevent +unwanted optimization. If they are being used properly, there will be no +need to use volatile as well. If volatile is still necessary, there is +almost certainly a bug in the code somewhere. In properly-written kernel +code, volatile can only serve to slow things down. + +Consider a typical block of kernel code: + + spin_lock(&the_lock); + do_something_on(&shared_data); + do_something_else_with(&shared_data); + spin_unlock(&the_lock); + +If all the code follows the locking rules, the value of shared_data cannot +change unexpectedly while the_lock is held. Any other code which might +want to play with that data will be waiting on the lock. The spinlock +primitives act as memory barriers - they are explicitly written to do so - +meaning that data accesses will not be optimized across them. So the +compiler might think it knows what will be in shared_data, but the +spin_lock() call, since it acts as a memory barrier, will force it to +forget anything it knows. There will be no optimization problems with +accesses to that data. + +If shared_data were declared volatile, the locking would still be +necessary. But the compiler would also be prevented from optimizing access +to shared_data _within_ the critical section, when we know that nobody else +can be working with it. While the lock is held, shared_data is not +volatile. When dealing with shared data, proper locking makes volatile +unnecessary - and potentially harmful. + +The volatile storage class was originally meant for memory-mapped I/O +registers. Within the kernel, register accesses, too, should be protected +by locks, but one also does not want the compiler "optimizing" register +accesses within a critical section. 
But, within the kernel, I/O memory +accesses are always done through accessor functions; accessing I/O memory +directly through pointers is frowned upon and does not work on all +architectures. Those accessors are written to prevent unwanted +optimization, so, once again, volatile is unnecessary. + +Another situation where one might be tempted to use volatile is +when the processor is busy-waiting on the value of a variable. The right +way to perform a busy wait is: + + while (my_variable != what_i_want) + cpu_relax(); + +The cpu_relax() call can lower CPU power consumption or yield to a +hyperthreaded twin processor; it also happens to serve as a memory barrier, +so, once again, volatile is unnecessary. Of course, busy-waiting is +generally an anti-social act to begin with. + +There are still a few rare situations where volatile makes sense in the +kernel: + + - The above-mentioned accessor functions might use volatile on + architectures where direct I/O memory access does work. Essentially, + each accessor call becomes a little critical section on its own and + ensures that the access happens as expected by the programmer. + + - Inline assembly code which changes memory, but which has no other + visible side effects, risks being deleted by GCC. Adding the volatile + keyword to asm statements will prevent this removal. + + - The jiffies variable is special in that it can have a different value + every time it is referenced, but it can be read without any special + locking. So jiffies can be volatile, but the addition of other + variables of this type is strongly frowned upon. Jiffies is considered + to be a "stupid legacy" issue (Linus's words) in this regard; fixing it + would be more trouble than it is worth. + + - Pointers to data structures in coherent memory which might be modified + by I/O devices can, sometimes, legitimately be volatile. 
A ring buffer + used by a network adapter, where that adapter changes pointers to + indicate which descriptors have been processed, is an example of this + type of situation. + +For most code, none of the above justifications for volatile apply. As a +result, the use of volatile is likely to be seen as a bug and will bring +additional scrutiny to the code. Developers who are tempted to use +volatile should take a step back and think about what they are truly trying +to accomplish. + +Patches to remove volatile variables are generally welcome - as long as +they come with a justification which shows that the concurrency issues have +been properly thought through. + + +NOTES +----- + +[1] http://lwn.net/Articles/233481/ +[2] http://lwn.net/Articles/233482/ + +CREDITS +------- + +Original impetus and research by Randy Dunlap +Written by Jonathan Corbet +Improvements via comments from Satyam Sharma, Johannes Stezenbach, Jesper + Juhl, Heikki Orsila, H. Peter Anvin, Philipp Hahn, and Stefan + Richter. diff --git a/Documentation/watchdog/pcwd-watchdog.txt b/Documentation/watchdog/pcwd-watchdog.txt index d9ee6336c1d4..4f68052395c0 100644 --- a/Documentation/watchdog/pcwd-watchdog.txt +++ b/Documentation/watchdog/pcwd-watchdog.txt @@ -1,3 +1,5 @@ +Last reviewed: 10/05/2007 + Berkshire Products PC Watchdog Card Support for ISA Cards Revision A and C Documentation and Driver by Ken Hollis <kenji@bitgate.com> @@ -14,8 +16,8 @@ The Watchdog Driver will automatically find your watchdog card, and will attach a running driver for use with that card. After the watchdog - drivers have initialized, you can then talk to the card using the PC - Watchdog program, available from http://ftp.bitgate.com/pcwd/. + drivers have initialized, you can then talk to the card using a PC + Watchdog program. I suggest putting a "watchdog -d" before the beginning of an fsck, and a "watchdog -e -t 1" immediately after the end of an fsck.
(Remember @@ -62,5 +64,3 @@ -- Ken Hollis (kenji@bitgate.com) -(This documentation may be out of date. Check - http://ftp.bitgate.com/pcwd/ for the absolute latest additions.) diff --git a/Documentation/watchdog/watchdog-api.txt b/Documentation/watchdog/watchdog-api.txt index 8d16f6f3c4ec..bb7cb1d31ec7 100644 --- a/Documentation/watchdog/watchdog-api.txt +++ b/Documentation/watchdog/watchdog-api.txt @@ -1,3 +1,6 @@ +Last reviewed: 10/05/2007 + + The Linux Watchdog driver API. Copyright 2002 Christer Weingel <wingel@nano-system.com> @@ -22,7 +25,7 @@ the system. If userspace fails (RAM error, kernel bug, whatever), the notifications cease to occur, and the hardware watchdog will reset the system (causing a reboot) after the timeout occurs. -The Linux watchdog API is a rather AD hoc construction and different +The Linux watchdog API is a rather ad-hoc construction and different drivers implement different, and sometimes incompatible, parts of it. This file is an attempt to document the existing usage and allow future driver writers to use it as a reference. @@ -46,14 +49,16 @@ some of the drivers support the configuration option "Disable watchdog shutdown on close", CONFIG_WATCHDOG_NOWAYOUT. If it is set to Y when compiling the kernel, there is no way of disabling the watchdog once it has been started. So, if the watchdog daemon crashes, the system -will reboot after the timeout has passed. +will reboot after the timeout has passed. Watchdog devices also usually +support the nowayout module parameter so that this option can be controlled +at runtime. -Some other drivers will not disable the watchdog, unless a specific -magic character 'V' has been sent /dev/watchdog just before closing -the file. If the userspace daemon closes the file without sending -this special character, the driver will assume that the daemon (and -userspace in general) died, and will stop pinging the watchdog without -disabling it first. This will then cause a reboot. 
+Drivers will not disable the watchdog unless a specific magic character 'V' +has been sent to /dev/watchdog just before closing the file. If the userspace +daemon closes the file without sending this special character, the driver +will assume that the daemon (and userspace in general) died, and will stop +pinging the watchdog without disabling it first. This will then cause a +reboot if the watchdog is not re-opened in sufficient time. The ioctl API: @@ -227,218 +232,3 @@ The following options are available: [FIXME -- better explanations] -Implementations in the current drivers in the kernel tree: - -Here I have tried to summarize what the different drivers support and -where they do strange things compared to the other drivers. - -acquirewdt.c -- Acquire Single Board Computer - - This driver has a hardcoded timeout of 1 minute - - Supports CONFIG_WATCHDOG_NOWAYOUT - - GETSUPPORT returns KEEPALIVEPING. GETSTATUS will return 1 if - the device is open, 0 if not. [FIXME -- isn't this rather - silly? To be able to use the ioctl, the device must be open - and so GETSTATUS will always return 1]. - -advantechwdt.c -- Advantech Single Board Computer - - Timeout that defaults to 60 seconds, supports SETTIMEOUT. - - Supports CONFIG_WATCHDOG_NOWAYOUT - - GETSUPPORT returns WDIOF_KEEPALIVEPING and WDIOF_SETTIMEOUT. - The GETSTATUS call returns if the device is open or not. - [FIXME -- silliness again?] - -booke_wdt.c -- PowerPC BookE Watchdog Timer - - Timeout default varies according to frequency, supports - SETTIMEOUT - - Watchdog cannot be turned off, CONFIG_WATCHDOG_NOWAYOUT - does not make sense - - GETSUPPORT returns the watchdog_info struct, and - GETSTATUS returns the supported options. GETBOOTSTATUS - returns a 1 if the last reset was caused by the - watchdog and a 0 otherwise. This watchdog cannot be - disabled once it has been started. The wdt_period kernel - parameter selects which bit of the time base changing - from 0->1 will trigger the watchdog exception.
Changing - the timeout from the ioctl calls will change the - wdt_period as defined above. Finally if you would like to - replace the default Watchdog Handler you can implement the - WatchdogHandler() function in your own code. - -eurotechwdt.c -- Eurotech CPU-1220/1410 - - The timeout can be set using the SETTIMEOUT ioctl and defaults - to 60 seconds. - - Also has a module parameter "ev", event type which controls - what should happen on a timeout, the string "int" or anything - else that causes a reboot. [FIXME -- better description] - - Supports CONFIG_WATCHDOG_NOWAYOUT - - GETSUPPORT returns CARDRESET and WDIOF_SETTIMEOUT but - GETSTATUS is not supported and GETBOOTSTATUS just returns 0. - -i810-tco.c -- Intel 810 chipset - - Also has support for a lot of other i8x0 stuff, but the - watchdog is one of the things. - - The timeout is set using the module parameter "i810_margin", - which is in steps of 0.6 seconds where 2<i810_margin<64. The - driver supports the SETTIMEOUT ioctl. - - Supports CONFIG_WATCHDOG_NOWAYOUT. - - GETSUPPORT returns WDIOF_SETTIMEOUT. The GETSTATUS call - returns some kind of timer value which ist not compatible with - the other drivers. GETBOOT status returns some kind of - hardware specific boot status. [FIXME -- describe this] - -ib700wdt.c -- IB700 Single Board Computer - - Default timeout of 30 seconds and the timeout is settable - using the SETTIMEOUT ioctl. Note that only a few timeout - values are supported. - - Supports CONFIG_WATCHDOG_NOWAYOUT - - GETSUPPORT returns WDIOF_KEEPALIVEPING and WDIOF_SETTIMEOUT. - The GETSTATUS call returns if the device is open or not. - [FIXME -- silliness again?] - -machzwd.c -- MachZ ZF-Logic - - Hardcoded timeout of 10 seconds - - Has a module parameter "action" that controls what happens - when the timeout runs out which can be 0 = RESET (default), - 1 = SMI, 2 = NMI, 3 = SCI. - - Supports CONFIG_WATCHDOG_NOWAYOUT and the magic character - 'V' close handling. 
- - GETSUPPORT returns WDIOF_KEEPALIVEPING, and the GETSTATUS call - returns if the device is open or not. [FIXME -- silliness - again?] - -mixcomwd.c -- MixCom Watchdog - - [FIXME -- I'm unable to tell what the timeout is] - - Supports CONFIG_WATCHDOG_NOWAYOUT - - GETSUPPORT returns WDIOF_KEEPALIVEPING, GETSTATUS returns if - the device is opened or not [FIXME -- I'm not really sure how - this works, there seems to be some magic connected to - CONFIG_WATCHDOG_NOWAYOUT] - -pcwd.c -- Berkshire PC Watchdog - - Hardcoded timeout of 1.5 seconds - - Supports CONFIG_WATCHDOG_NOWAYOUT - - GETSUPPORT returns WDIOF_OVERHEAT|WDIOF_CARDRESET and both - GETSTATUS and GETBOOTSTATUS return something useful. - - The SETOPTIONS call can be used to enable and disable the card - and to ask the driver to call panic if the system overheats. - -sbc60xxwdt.c -- 60xx Single Board Computer - - Hardcoded timeout of 10 seconds - - Does not support CONFIG_WATCHDOG_NOWAYOUT, but has the magic - character 'V' close handling. - - No bits set in GETSUPPORT - -scx200.c -- National SCx200 CPUs - - Not in the kernel yet. - - The timeout is set using a module parameter "margin" which - defaults to 60 seconds. The timeout can also be set using - SETTIMEOUT and read using GETTIMEOUT. - - Supports a module parameter "nowayout" that is initialized - with the value of CONFIG_WATCHDOG_NOWAYOUT. Also supports the - magic character 'V' handling. - -shwdt.c -- SuperH 3/4 processors - - [FIXME -- I'm unable to tell what the timeout is] - - Supports CONFIG_WATCHDOG_NOWAYOUT - - GETSUPPORT returns WDIOF_KEEPALIVEPING, and the GETSTATUS call - returns if the device is open or not. [FIXME -- silliness - again?] - -softdog.c -- Software watchdog - - The timeout is set with the module parameter "soft_margin" - which defaults to 60 seconds, the timeout is also settable - using the SETTIMEOUT ioctl. 
- - Supports CONFIG_WATCHDOG_NOWAYOUT - - WDIOF_SETTIMEOUT bit set in GETSUPPORT - -w83877f_wdt.c -- W83877F Computer - - Hardcoded timeout of 30 seconds - - Does not support CONFIG_WATCHDOG_NOWAYOUT, but has the magic - character 'V' close handling. - - No bits set in GETSUPPORT - -w83627hf_wdt.c -- w83627hf watchdog - - Timeout that defaults to 60 seconds, supports SETTIMEOUT. - - Supports CONFIG_WATCHDOG_NOWAYOUT - - GETSUPPORT returns WDIOF_KEEPALIVEPING and WDIOF_SETTIMEOUT. - The GETSTATUS call returns if the device is open or not. - -wdt.c -- ICS WDT500/501 ISA and -wdt_pci.c -- ICS WDT500/501 PCI - - Default timeout of 60 seconds. The timeout is also settable - using the SETTIMEOUT ioctl. - - Supports CONFIG_WATCHDOG_NOWAYOUT - - GETSUPPORT returns with bits set depending on the actual - card. The WDT501 supports a lot of external monitoring, the - WDT500 much less. - -wdt285.c -- Footbridge watchdog - - The timeout is set with the module parameter "soft_margin" - which defaults to 60 seconds. The timeout is also settable - using the SETTIMEOUT ioctl. - - Does not support CONFIG_WATCHDOG_NOWAYOUT - - WDIOF_SETTIMEOUT bit set in GETSUPPORT - -wdt977.c -- Netwinder W83977AF chip - - Hardcoded timeout of 3 minutes - - Supports CONFIG_WATCHDOG_NOWAYOUT - - Does not support any ioctls at all. 
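The keepalive and magic-close behaviour described earlier in this file can be sketched as follows. This is only an illustration under stated assumptions: watchdog_session() is a hypothetical helper, not part of any driver or library; the device path is passed in by the caller so the code can be tried against an ordinary file; and a real daemon would sleep between writes for less than the driver's timeout.

```c
/*
 * Sketch of a watchdog client: ping by writing, then optionally send the
 * magic character 'V' before closing so the driver disables the watchdog.
 * watchdog_session() is a hypothetical helper used for illustration only.
 */
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Open the watchdog at 'path', write 'pings' keepalive bytes and, if
 * 'clean_exit' is non-zero, write 'V' just before closing the file.
 * Returns the number of bytes written, or -1 on any error.
 */
int watchdog_session(const char *path, int pings, int clean_exit)
{
	int fd, i, written = 0;

	assert(path != NULL);

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;

	for (i = 0; i < pings; i++) {
		/* Any write counts as a ping and restarts the timeout. */
		if (write(fd, "\0", 1) != 1) {
			close(fd);
			return -1;
		}
		written++;
		/* A real daemon would sleep here, for less than the timeout. */
	}

	if (clean_exit) {
		/* Magic close: tell the driver this shutdown is intentional. */
		if (write(fd, "V", 1) != 1) {
			close(fd);
			return -1;
		}
		written++;
	}

	close(fd);
	return written;
}
```

Pointing this at the real /dev/watchdog starts the hardware timer, so it is safer to experiment with a scratch file first. With clean_exit set to 0 the file is closed without the magic character, which is the case in which the text says the system will eventually reboot.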
- diff --git a/Documentation/watchdog/watchdog.txt b/Documentation/watchdog/watchdog.txt deleted file mode 100644 index 4b1ff69cc19a..000000000000 --- a/Documentation/watchdog/watchdog.txt +++ /dev/null @@ -1,94 +0,0 @@ - Watchdog Timer Interfaces For The Linux Operating System - - Alan Cox <alan@lxorguk.ukuu.org.uk> - - Custom Linux Driver And Program Development - - -The following watchdog drivers are currently implemented: - - ICS WDT501-P - ICS WDT501-P (no fan tachometer) - ICS WDT500-P - Software Only - SA1100 Internal Watchdog - Berkshire Products PC Watchdog Revision A & C (by Ken Hollis) - - -All six interfaces provide /dev/watchdog, which when open must be written -to within a timeout or the machine will reboot. Each write delays the reboot -time another timeout. In the case of the software watchdog the ability to -reboot will depend on the state of the machines and interrupts. The hardware -boards physically pull the machine down off their own onboard timers and -will reboot from almost anything. - -A second temperature monitoring interface is available on the WDT501P cards -and some Berkshire cards. This provides /dev/temperature. This is the machine -internal temperature in degrees Fahrenheit. Each read returns a single byte -giving the temperature. - -The third interface logs kernel messages on additional alert events. - -Both software and hardware watchdog drivers are available in the standard -kernel. If you are using the software watchdog, you probably also want -to use "panic=60" as a boot argument as well. - -The wdt card cannot be safely probed for. Instead you need to pass -wdt=ioaddr,irq as a boot parameter - eg "wdt=0x240,11". - -The SA1100 watchdog module can be configured with the "sa1100_margin" -commandline argument which specifies timeout value in seconds. - -The i810 TCO watchdog modules can be configured with the "i810_margin" -commandline argument which specifies the counter initial value. 
The counter -is decremented every 0.6 seconds and default to 50 (30 seconds). Values can -range between 3 and 63. - -The i810 TCO watchdog driver also implements the WDIOC_GETSTATUS and -WDIOC_GETBOOTSTATUS ioctl()s. WDIOC_GETSTATUS returns the actual counter value -and WDIOC_GETBOOTSTATUS returns the value of TCO2 Status Register (see Intel's -documentation for the 82801AA and 82801AB datasheet). - -Features --------- - WDT501P WDT500P Software Berkshire i810 TCO SA1100WD -Reboot Timer X X X X X X -External Reboot X X o o o X -I/O Port Monitor o o o X o o -Temperature X o o X o o -Fan Speed X o o o o o -Power Under X o o o o o -Power Over X o o o o o -Overheat X o o o o o - -The external event interfaces on the WDT boards are not currently supported. -Minor numbers are however allocated for it. - - -Example Watchdog Driver: see Documentation/watchdog/src/watchdog-simple.c - - -Contact Information - -People keep asking about the WDT watchdog timer hardware: The phone contacts -for Industrial Computer Source are: - -Industrial Computer Source -http://www.indcompsrc.com -ICS Advent, San Diego -6260 Sequence Dr. -San Diego, CA 92121-4371 -Phone (858) 677-0877 -FAX: (858) 677-0895 -> -ICS Advent Europe, UK -Oving Road -Chichester, -West Sussex, -PO19 4ET, UK -Phone: 00.44.1243.533900 - - -and please mention Linux when enquiring. - -For full information about the PCWD cards see the pcwd-watchdog.txt document. diff --git a/Documentation/watchdog/wdt.txt b/Documentation/watchdog/wdt.txt new file mode 100644 index 000000000000..03fd756d976d --- /dev/null +++ b/Documentation/watchdog/wdt.txt @@ -0,0 +1,43 @@ +Last Reviewed: 10/05/2007 + + WDT Watchdog Timer Interfaces For The Linux Operating System + Alan Cox <alan@lxorguk.ukuu.org.uk> + + ICS WDT501-P + ICS WDT501-P (no fan tachometer) + ICS WDT500-P + +All the interfaces provide /dev/watchdog, which when open must be written +to within a timeout or the machine will reboot. 
Each write delays the reboot +time another timeout. In the case of the software watchdog the ability to +reboot will depend on the state of the machines and interrupts. The hardware +boards physically pull the machine down off their own onboard timers and +will reboot from almost anything. + +A second temperature monitoring interface is available on the WDT501P cards. +This provides /dev/temperature. This is the machine internal temperature in +degrees Fahrenheit. Each read returns a single byte giving the temperature. + +The third interface logs kernel messages on additional alert events. + +The wdt card cannot be safely probed for. Instead you need to pass +wdt=ioaddr,irq as a boot parameter - e.g. "wdt=0x240,11". + +Features +-------- + WDT501P WDT500P +Reboot Timer X X +External Reboot X X +I/O Port Monitor o o +Temperature X o +Fan Speed X o +Power Under X o +Power Over X o +Overheat X o + +The external event interfaces on the WDT boards are not currently supported. +Minor numbers are however allocated for it. + + +Example Watchdog Driver: see Documentation/watchdog/src/watchdog-simple.c + diff --git a/Documentation/x86_64/boot-options.txt b/Documentation/x86_64/boot-options.txt index 6177d881983f..945311840a10 100644 --- a/Documentation/x86_64/boot-options.txt +++ b/Documentation/x86_64/boot-options.txt @@ -14,9 +14,11 @@ Machine check mce=nobootlog Disable boot machine check logging. mce=tolerancelevel (number) - 0: always panic, 1: panic if deadlock possible, - 2: try to avoid panic, 3: never panic or exit (for testing) - default is 1 + 0: always panic on uncorrected errors, log corrected errors + 1: panic or SIGBUS on uncorrected errors, log corrected errors + 2: SIGBUS or log uncorrected errors, log corrected errors + 3: never panic or SIGBUS, log all errors (for testing only) + Default is 1 Can be also set using sysfs which is preferable.
nomce (for compatibility with i386): same as mce=off @@ -134,12 +136,6 @@ Non Executable Mappings SMP - nosmp Only use a single CPU - - maxcpus=NUMBER only use upto NUMBER CPUs - - cpumask=MASK only use cpus with bits set in mask - additional_cpus=NUM Allow NUM more CPUs for hotplug (defaults are specified by the BIOS, see Documentation/x86_64/cpu-hotplug-spec) diff --git a/Documentation/x86_64/machinecheck b/Documentation/x86_64/machinecheck index feaeaf6f6e4d..a05e58e7b159 100644 --- a/Documentation/x86_64/machinecheck +++ b/Documentation/x86_64/machinecheck @@ -49,12 +49,14 @@ tolerant Since machine check exceptions can happen any time it is sometimes risky for the kernel to kill a process because it defies normal kernel locking rules. The tolerance level configures - how hard the kernel tries to recover even at some risk of deadlock. - - 0: always panic, - 1: panic if deadlock possible, - 2: try to avoid panic, - 3: never panic or exit (for testing only) + how hard the kernel tries to recover even at some risk of + deadlock. Higher tolerant values trade potentially better uptime + with the risk of a crash or even corruption (for tolerant >= 3). + + 0: always panic on uncorrected errors, log corrected errors + 1: panic or SIGBUS on uncorrected errors, log corrected errors + 2: SIGBUS or log uncorrected errors, log corrected errors + 3: never panic or SIGBUS, log all errors (for testing only) Default: 1 diff --git a/Documentation/zh_CN/HOWTO b/Documentation/zh_CN/HOWTO new file mode 100644 index 000000000000..48fc67bfbe3d --- /dev/null +++ b/Documentation/zh_CN/HOWTO @@ -0,0 +1,536 @@ +Chinese translated version of Documentation/HOWTO + +If you have any comment or update to the content, please contact the +original document maintainer directly. However, if you have problem +communicating in English you can also ask the Chinese maintainer for +help. Contact the Chinese maintainer, if this translation is outdated +or there is problem with translation. 
+ +Maintainer: Greg Kroah-Hartman <greg@kroah.com> +Chinese maintainer: Li Yang <leoli@freescale.com> +--------------------------------------------------------------------- +Documentation/HOWTO 的中文翻译 + +如果想评论或更新本文的内容,请直接联系原文档的维护者。如果你使用英文 +交流有困难的话,也可以向中文版维护者求助。如果本翻译更新不及时或者翻 +译存在问题,请联系中文版维护者。 + +英文版维护者: Greg Kroah-Hartman <greg@kroah.com> +中文版维护者: 李阳 Li Yang <leoli@freescale.com> +中文版翻译者: 李阳 Li Yang <leoli@freescale.com> +中文版校译者: 钟宇 TripleX Chung <xxx.phy@gmail.com> + 陈琦 Maggie Chen <chenqi@beyondsoft.com> + 王聪 Wang Cong <xiyou.wangcong@gmail.com> + +以下为正文 +--------------------------------------------------------------------- + +如何参与Linux内核开发 +--------------------- + +这是一篇将如何参与Linux内核开发的相关问题一网打尽的终极秘笈。它将指导你 +成为一名Linux内核开发者,并且学会如何同Linux内核开发社区合作。它尽可能不 +包括任何关于内核编程的技术细节,但会给你指引一条获得这些知识的正确途径。 + +如果这篇文章中的任何内容不再适用,请给文末列出的文件维护者发送补丁。 + + +入门 +---- + +你想了解如何成为一名Linux内核开发者?或者老板吩咐你“给这个设备写个Linux +驱动程序”?这篇文章的目的就是教会你达成这些目标的全部诀窍,它将描述你需 +要经过的流程以及给出如何同内核社区合作的一些提示。它还将试图解释内核社区 +为何这样运作。 + +Linux内核大部分是由C语言写成的,一些体系结构相关的代码用到了汇编语言。要 +参与内核开发,你必须精通C语言。除非你想为某个架构开发底层代码,否则你并 +不需要了解(任何体系结构的)汇编语言。下面列举的书籍虽然不能替代扎实的C +语言教育和多年的开发经验,但如果需要的话,做为参考还是不错的: + - "The C Programming Language" by Kernighan and Ritchie [Prentice Hall] + 《C程序设计语言(第2版·新版)》(徐宝文 李志 译)[机械工业出版社] + - "Practical C Programming" by Steve Oualline [O'Reilly] + 《实用C语言编程(第三版)》(郭大海 译)[中国电力出版社] + - "C: A Reference Manual" by Harbison and Steele [Prentice Hall] + 《C语言参考手册(原书第5版)》(邱仲潘 等译)[机械工业出版社] + +Linux内核使用GNU C和GNU工具链开发。虽然它遵循ISO C89标准,但也用到了一些 +标准中没有定义的扩展。内核是自给自足的C环境,不依赖于标准C库的支持,所以 +并不支持C标准中的部分定义。比如long long类型的大数除法和浮点运算就不允许 +使用。有时候确实很难弄清楚内核对工具链的要求和它所使用的扩展,不幸的是目 +前还没有明确的参考资料可以解释它们。请查阅gcc信息页(使用“info gcc”命令 +显示)获得一些这方面信息。 + +请记住你是在学习怎么和已经存在的开发社区打交道。它由一群形形色色的人组成, +他们对代码、风格和过程有着很高的标准。这些标准是在长期实践中总结出来的, +适应于地理上分散的大型开发团队。它们已经被很好得整理成档,建议你在开发 +之前尽可能多的学习这些标准,而不要期望别人来适应你或者你公司的行为方式。 + + +法律问题 +-------- + +Linux内核源代码都是在GPL(通用公共许可证)的保护下发布的。要了解这种许可 +的细节请查看源代码主目录下的COPYING文件。如果你对它还有更深入问题请联系 +律师,而不要在Linux内核邮件组上提问。因为邮件组里的人并不是律师,不要期 +望他们的话有法律效力。 + 
+对于GPL的常见问题和解答,请访问以下链接: + http://www.gnu.org/licenses/gpl-faq.html + + +文档 +---- + +Linux内核代码中包含有大量的文档。这些文档对于学习如何与内核社区互动有着 +不可估量的价值。当一个新的功能被加入内核,最好把解释如何使用这个功能的文 +档也放进内核。当内核的改动导致面向用户空间的接口发生变化时,最好将相关信 +息或手册页(manpages)的补丁发到mtk-manpages@gmx.net,以向手册页(manpages) +的维护者解释这些变化。 + +以下是内核代码中需要阅读的文档: + README + 文件简要介绍了Linux内核的背景,并且描述了如何配置和编译内核。内核的 + 新用户应该从这里开始。 + + Documentation/Changes + 文件给出了用来编译和使用内核所需要的最小软件包列表。 + + Documentation/CodingStyle + 描述Linux内核的代码风格和理由。所有新代码需要遵守这篇文档中定义的规 + 范。大多数维护者只会接收符合规定的补丁,很多人也只会帮忙检查符合风格 + 的代码。 + + Documentation/SubmittingPatches + Documentation/SubmittingDrivers + 这两份文档明确描述如何创建和发送补丁,其中包括(但不仅限于): + - 邮件内容 + - 邮件格式 + - 选择收件人 + 遵守这些规定并不能保证提交成功(因为所有补丁需要通过严格的内容和风格 + 审查),但是忽视他们几乎就意味着失败。 + + 其他关于如何正确地生成补丁的优秀文档包括: + "The Perfect Patch" + http://www.zip.com.au/~akpm/linux/patches/stuff/tpp.txt + "Linux kernel patch submission format" + http://linux.yyz.us/patch-format.html + + Documentation/stable_api_nonsense.txt + 论证内核为什么特意不包括稳定的内核内部API,也就是说不包括像这样的特 + 性: + - 子系统中间层(为了兼容性?) + - 在不同操作系统间易于移植的驱动程序 + - 减缓(甚至阻止)内核代码的快速变化 + 这篇文档对于理解Linux的开发哲学至关重要。对于将开发平台从其他操作系 + 统转移到Linux的人来说也很重要。 + + Documentation/SecurityBugs + 如果你认为自己发现了Linux内核的安全性问题,请根据这篇文档中的步骤来 + 提醒其他内核开发者并帮助解决这个问题。 + + Documentation/ManagementStyle + 描述内核维护者的工作方法及其共有特点。这对于刚刚接触内核开发(或者对 + 它感到好奇)的人来说很重要,因为它解释了很多对于内核维护者独特行为的 + 普遍误解与迷惑。 + + Documentation/stable_kernel_rules.txt + 解释了稳定版内核发布的规则,以及如何将改动放入这些版本的步骤。 + + Documentation/kernel-docs.txt + 有助于内核开发的外部文档列表。如果你在内核自带的文档中没有找到你想找 + 的内容,可以查看这些文档。 + + Documentation/applying-patches.txt + 关于补丁是什么以及如何将它打在不同内核开发分支上的好介绍 + +内核还拥有大量从代码自动生成的文档。它包含内核内部API的全面介绍以及如何 +妥善处理加锁的规则。生成的文档会放在 Documentation/DocBook/目录下。在内 +核源码的主目录中使用以下不同命令将会分别生成PDF、Postscript、HTML和手册 +页等不同格式的文档: + make pdfdocs + make psdocs + make htmldocs + make mandocs + + +如何成为内核开发者 +------------------ +如果你对Linux内核开发一无所知,你应该访问“Linux内核新手”计划: + http://kernelnewbies.org +它拥有一个可以问各种最基本的内核开发问题的邮件列表(在提问之前一定要记得 +查找已往的邮件,确认是否有人已经回答过相同的问题)。它还拥有一个可以获得 +实时反馈的IRC聊天频道,以及大量对于学习Linux内核开发相当有帮助的文档。 + 
+网站简要介绍了源代码组织结构、子系统划分以及目前正在进行的项目(包括内核 +中的和单独维护的)。它还提供了一些基本的帮助信息,比如如何编译内核和打补 +丁。 + +如果你想加入内核开发社区并协助完成一些任务,却找不到从哪里开始,可以访问 +“Linux内核房管员”计划: + http://janitor.kernelnewbies.org/ +这是极佳的起点。它提供一个相对简单的任务列表,列出内核代码中需要被重新 +整理或者改正的地方。通过和负责这个计划的开发者们一同工作,你会学到将补丁 +集成进内核的基本原理。如果还没有决定下一步要做什么的话,你还可能会得到方 +向性的指点。 + +如果你已经有一些现成的代码想要放到内核中,但是需要一些帮助来使它们拥有正 +确的格式。请访问“内核导师”计划。这个计划就是用来帮助你完成这个目标的。它 +是一个邮件列表,地址如下: + http://selenic.com/mailman/listinfo/kernel-mentors + +在真正动手修改内核代码之前,理解要修改的代码如何运作是必需的。要达到这个 +目的,没什么办法比直接读代码更有效了(大多数花招都会有相应的注释),而且 +一些特制的工具还可以提供帮助。例如,“Linux代码交叉引用”项目就是一个值得 +特别推荐的帮助工具,它将源代码显示在有编目和索引的网页上。其中一个更新及 +时的内核源码库,可以通过以下地址访问: + http://sosdg.org/~coywolf/lxr/ + + +开发流程 +-------- + +目前Linux内核开发流程包括几个“主内核分支”和很多子系统相关的内核分支。这 +些分支包括: + - 2.6.x主内核源码树 + - 2.6.x.y -stable内核源码树 + - 2.6.x -git内核补丁集 + - 2.6.x -mm内核补丁集 + - 子系统相关的内核源码树和补丁集 + + +2.6.x内核主源码树 +----------------- +2.6.x内核是由Linus Torvalds(Linux的创造者)亲自维护的。你可以在 +kernel.org网站的pub/linux/kernel/v2.6/目录下找到它。它的开发遵循以下步 +骤: + - 每当一个新版本的内核被发布,为期两周的集成窗口将被打开。在这段时间里 + 维护者可以向Linus提交大段的修改,通常这些修改已经被放到-mm内核中几个 + 星期了。提交大量修改的首选方式是使用git工具(内核的代码版本管理工具 + ,更多的信息可以在http://git.or.cz/获取),不过使用普通补丁也是可以 + 的。 + - 两个星期以后-rc1版本内核发布。之后只有不包含可能影响整个内核稳定性的 + 新功能的补丁才可能被接受。请注意一个全新的驱动程序(或者文件系统)有 + 可能在-rc1后被接受是因为这样的修改完全独立,不会影响其他的代码,所以 + 没有造成内核退步的风险。在-rc1以后也可以用git向Linus提交补丁,不过所 + 有的补丁需要同时被发送到相应的公众邮件列表以征询意见。 + - 当Linus认为当前的git源码树已经达到一个合理健全的状态足以发布供人测试 + 时,一个新的-rc版本就会被发布。计划是每周都发布新的-rc版本。 + - 这个过程一直持续下去直到内核被认为达到足够稳定的状态,持续时间大概是 + 6个星期。 + +关于内核发布,值得一提的是Andrew Morton在linux-kernel邮件列表中如是说: + “没有人知道新内核何时会被发布,因为发布是根据已知bug的情况来决定 + 的,而不是根据一个事先制定好的时间表。” + + +2.6.x.y -stable(稳定版)内核源码树 +----------------------------------- +由4个数字组成的内核版本号说明此内核是-stable版本。它们包含基于2.6.x版本 +内核的相对较小且至关重要的修补,这些修补针对安全性问题或者严重的内核退步。 + +这种版本的内核适用于那些期望获得最新的稳定版内核并且不想参与测试开发版或 +者实验版的用户。 + +如果没有2.6.x.y版本内核存在,那么最新的2.6.x版本内核就相当于是当前的稳定 +版内核。 + +2.6.x.y版本由“稳定版”小组(邮件地址<stable@kernel.org>)维护,一般隔周发 +布新版本。 + +内核源码中的Documentation/stable_kernel_rules.txt文件具体描述了可被稳定 +版内核接受的修改类型以及发布的流程。 + + +2.6.x -git补丁集 +---------------- 
+Linus的内核源码树的每日快照,这个源码树是由git工具管理的(由此得名)。这 +些补丁通常每天更新以反映Linus的源码树的最新状态。它们比-rc版本的内核源码 +树更具试验性质,因为这个补丁集是全自动生成的,没有任何人来确认其是否真正 +健全。 + + +2.6.x -mm补丁集 +--------------- +这是由Andrew Morton维护的试验性内核补丁集。Andrew将所有子系统的内核源码 +和补丁拼凑到一起,并且加入了大量从linux-kernel邮件列表中采集的补丁。这个 +源码树是新功能和补丁的试炼场。当补丁在-mm补丁集里证明了其价值以后Andrew +或者相应子系统的维护者会将补丁发给Linus以便集成进主内核源码树。 + +在将所有新补丁发给Linus以集成到主内核源码树之前,我们非常鼓励先把这些补 +丁放在-mm版内核源码树中进行测试。 + +这些内核版本不适合在需要稳定运行的系统上运行,因为运行它们比运行任何其他 +内核分支都更具有风险。 + +如果你想为内核开发进程提供帮助,请尝试并使用这些内核版本,并在 +linux-kernel邮件列表中提供反馈,告诉大家你遇到了问题还是一切正常。 + +通常-mm版补丁集不光包括这些额外的试验性补丁,还包括发布时-git版主源码树 +中的改动。 + +-mm版内核没有固定的发布周期,但是通常在每两个-rc版内核发布之间都会有若干 +个-mm版内核发布(一般是1至3个)。 + + +子系统相关内核源码树和补丁集 +---------------------------- +相当一部分内核子系统开发者会公开他们自己的开发源码树,以便其他人能了解内 +核的不同领域正在发生的事情。如上所述,这些源码树会被集成到-mm版本内核中。 + +下面是目前可用的一些内核源码树的列表: + 通过git管理的源码树: + - Kbuild开发源码树, Sam Ravnborg <sam@ravnborg.org> + git.kernel.org:/pub/scm/linux/kernel/git/sam/kbuild.git + + - ACPI开发源码树, Len Brown <len.brown@intel.com> + git.kernel.org:/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6.git + + - 块设备开发源码树, Jens Axboe <axboe@suse.de> + git.kernel.org:/pub/scm/linux/kernel/git/axboe/linux-2.6-block.git + + - DRM开发源码树, Dave Airlie <airlied@linux.ie> + git.kernel.org:/pub/scm/linux/kernel/git/airlied/drm-2.6.git + + - ia64开发源码树, Tony Luck <tony.luck@intel.com> + git.kernel.org:/pub/scm/linux/kernel/git/aegl/linux-2.6.git + + - ieee1394开发源码树, Jody McIntyre <scjody@modernduck.com> + git.kernel.org:/pub/scm/linux/kernel/git/scjody/ieee1394.git + + - infiniband开发源码树, Roland Dreier <rolandd@cisco.com> + git.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband.git + + - libata开发源码树, Jeff Garzik <jgarzik@pobox.com> + git.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev.git + + - 网络驱动程序开发源码树, Jeff Garzik <jgarzik@pobox.com> + git.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6.git + + - pcmcia开发源码树, Dominik Brodowski <linux@dominikbrodowski.net> + git.kernel.org:/pub/scm/linux/kernel/git/brodo/pcmcia-2.6.git + + - SCSI开发源码树, James Bottomley 
<James.Bottomley@SteelEye.com> + git.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6.git + + 使用quilt管理的补丁集: + - USB, PCI, 驱动程序核心和I2C, Greg Kroah-Hartman <gregkh@suse.de> + kernel.org/pub/linux/kernel/people/gregkh/gregkh-2.6/ + - x86-64, 部分i386, Andi Kleen <ak@suse.de> + ftp.firstfloor.org:/pub/ak/x86_64/quilt/ + + 其他内核源码树可以在http://git.kernel.org的列表中和MAINTAINERS文件里 + 找到。 + +报告bug +------- + +bugzilla.kernel.org是Linux内核开发者们用来跟踪内核Bug的网站。我们鼓励用 +户在这个工具中报告找到的所有bug。如何使用内核bugzilla的细节请访问: + http://test.kernel.org/bugzilla/faq.html + +内核源码主目录中的REPORTING-BUGS文件里有一个很好的模板。它指导用户如何报 +告可能的内核bug以及需要提供哪些信息来帮助内核开发者们找到问题的根源。 + + +利用bug报告 +----------- + +练习内核开发技能的最好办法就是修改其他人报告的bug。你不光可以帮助内核变 +得更加稳定,还可以学会如何解决实际问题从而提高自己的技能,并且让其他开发 +者感受到你的存在。修改bug是赢得其他开发者赞誉的最好办法,因为并不是很多 +人都喜欢浪费时间去修改别人报告的bug。 + +要尝试修改已知的bug,请访问http://bugzilla.kernel.org网址。如果你想获得 +最新bug的通知,可以订阅bugme-new邮件列表(只有新的bug报告会被寄到这里) +或者订阅bugme-janitor邮件列表(所有bugzilla的变动都会被寄到这里)。 + + http://lists.osdl.org/mailman/listinfo/bugme-new + http://lists.osdl.org/mailman/listinfo/bugme-janitors + + +邮件列表 +-------- + +正如上面的文档所描述,大多数的骨干内核开发者都加入了Linux Kernel邮件列 +表。如何订阅和退订列表的细节可以在这里找到: + http://vger.kernel.org/vger-lists.html#linux-kernel +网上很多地方都有这个邮件列表的存档(archive)。可以使用搜索引擎来找到这些 +存档。比如: + http://dir.gmane.org/gmane.linux.kernel +在发信之前,我们强烈建议你先在存档中搜索你想要讨论的问题。很多已经被详细 +讨论过的问题只在邮件列表的存档中可以找到。 + +大多数内核子系统也有自己独立的邮件列表来协调各自的开发工作。从 +MAINTAINERS文件中可以找到不同话题对应的邮件列表。 + +很多邮件列表架设在kernel.org服务器上。这些列表的信息可以在这里找到: + http://vger.kernel.org/vger-lists.html + +在使用这些邮件列表时,请记住保持良好的行为习惯。下面的链接提供了与这些列 +表(或任何其它邮件列表)交流的一些简单规则,虽然内容有点滥竽充数。 + http://www.albion.com/netiquette/ + +当有很多人回复你的邮件时,邮件的抄送列表会变得很长。请不要将任何人从抄送 +列表中删除,除非你有足够的理由这么做。也不要只回复到邮件列表。请习惯于同 +一封邮件接收两次(一封来自发送者一封来自邮件列表),而不要试图通过添加一 +些奇特的邮件头来解决这个问题,人们不会喜欢的。 + +记住保留你所回复内容的上下文和源头。在你回复邮件的顶部保留“某某某说到……” +这几行。将你的评论加在被引用的段落之间而不要放在邮件的顶部。 + +如果你在邮件中附带补丁,请确认它们是可以直接阅读的纯文本(如 +Documentation/SubmittingPatches文档中所述)。内核开发者们不希望遇到附件 +或者被压缩了的补丁。只有这样才能保证他们可以直接评论你的每行代码。请确保 +你使用的邮件发送程序不会修改空格和制表符。一个防范性的测试方法是先将邮件 
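+to yourself and then try to apply the patch you receive. If that does
+not work, fix or replace your mail program until it works correctly.
+
+Above all, please show respect to the other subscribers.
+
+
+Working with the community
+--------------------------
+
+The goal of the kernel community is to provide the best possible kernel.
+So when you submit a patch for acceptance, its technical merits and
+other aspects will be reviewed. What can you expect to get back?
+	- criticism
+	- comments
+	- requests for change
+	- requests for justification
+	- silence
+
+Remember, this is a normal part of getting a patch into the kernel. You
+have to be able to take criticism and comments on your patches, evaluate
+them at a technical level, and then either rework your patches or argue
+clearly and concisely why the changes are unnecessary. If there are no
+responses to your posting, wait a few days and try again; sometimes
+things get lost in the huge volume of mail.
+
+What you should not do:
+	- expect your patch to be accepted without question
+	- become hostile
+	- ignore comments
+	- resubmit the patch without making any of the requested changes
+
+In a community that is looking for the best technical solution, there
+will always be differing opinions on how beneficial a patch is. You have
+to be cooperative and willing to change your idea to fit the kernel's
+style, or at least be willing to prove that your idea is worth it.
+Remember, being wrong is acceptable as long as you are willing to work
+toward a solution that is right.
+
+It is normal for your first patch to be answered with a pile of
+suggested changes. This does not mean your patch will not be accepted,
+and it does not mean anyone is against you. Simply correct all the
+issues raised and resend the patch.
+
+Differences between the kernel community and corporate culture
+--------------------------------------------------------------
+
+The kernel community works differently from most traditional corporate
+development teams. The following examples may help you avoid some
+problems:
+  Good things to say about your proposed changes:
+    - It solves multiple problems at once.
+    - It deletes 2000 lines of code.
+    - Here is a patch that explains what I am trying to describe.
+    - I tested it on 5 different architectures...
+    - Here is a series of small patches that...
+    - This change improves performance on typical machines...
+
+  Things you should avoid saying:
+    - We did it this way in AIX/ptx/Solaris, so it must be good...
+    - I have been doing this for 20 years, so...
+    - This is required for my company to make money.
+    - This is needed for our Enterprise product line.
+    - Here is my 1000-page design document that describes my idea.
+    - Here is a 5000-line patch that...
+    - I rewrote all of the current mess, and here it is...
+    - I have a deadline, so this patch needs to be applied right now.
+
+Another way the kernel community differs from most traditional corporate
+software teams is the lack of face-to-face interaction. One benefit of
+using email and IRC as the primary means of communication is that there
+is less discrimination based on gender or race. The Linux kernel work
+environment is more accepting of women and minorities, because to others
+each person is just an email address. The international aspect also
+helps level the playing field, because you cannot guess a person's
+gender from their name: a man may be named Li Li and a woman may be
+named Wang Gang. Most women who have worked on the Linux kernel and have
+expressed an opinion have described the experience as positive.
+
+For people who are not comfortable with English, the language barrier
+can cause problems. A good grasp of the language is needed to get ideas
+across properly on the mailing lists, so you are advised to check that
+your English is correct before sending a mail.
+
+
+Break up your changes
+---------------------
+
+The Linux kernel community does not like to receive large chunks of code
+all at once. Changes need to be properly introduced, discussed, and
+broken up into small, independent pieces. This is almost the exact
+opposite of what companies are used to doing. Your proposal should be
+made known very early in the development process, so that you can
+receive timely feedback on what you are doing. It also lets the
+community feel that you are working with them rather than using them as
+a dumping ground for your feature. In any case, do not send 50 emails to
+a mailing list at one time; your patch series should never need that
+many.
+
+The reasons for breaking things up are the following:
+
+1) Small patches are more likely to be accepted, because they do not
+   take much time or effort to verify for correctness. A 5-line patch
+   may be accepted by a maintainer after barely a glance, whereas a
+   500-line patch may take hours to review for correctness (the time
+   needed grows roughly exponentially with the size of the patch).
+
+   Small patches also make debugging much easier when something goes
+   wrong. Backing out patches one by one is far easier than dissecting
+   one very large patch after it has been applied and has broken
+   something.
+
+2) It is important not only to send small patches, but also to rework
+   and simplify (or simply reorder) the patches before submitting them.
+
+Here is an analogy from kernel developer Al Viro:
+	"Think of a teacher grading homework from a math student. The
+	teacher does not want to see the student's trials and errors on
+	the way to the correct solution. The teacher wants to see the
+	cleanest, most elegant answer. A good student knows this and
+	would never submit the intermediate work that came before the
+	final solution."
+
+	The same is true of kernel development. Maintainers and
+	reviewers do not want to see the thought process behind the
+	solution to a problem. They only want to see a simple and
+	elegant solution.
+
+It can be hard to strike a balance between presenting a first-class
+solution and working with the community to discuss unfinished work. So
+it is best to start gathering feedback that helps you improve early on,
+and also to keep your changes in small pieces, so that some of them may
+be accepted before the whole project is ready for inclusion.
+
+You must also understand that it is not acceptable to submit unfinished
+work for inclusion with the intention of fixing it up later.
+
+
+Justify your change
+-------------------
+Besides breaking up your patches, it is very important to let the Linux
+community know why they need this change. You must show that the new
+feature is needed and useful.
+
+
+Document your change
+--------------------
+
+When sending in your patches, pay special attention to what you write in
+the body of the mail. This information becomes the ChangeLog for the
+patch and is preserved for everyone to consult for all time. It should
+describe the patch completely, including:
+  - why the change is necessary
+  - the overall design of the patch
+  - implementation details
+  - testing results
+
+For more details on what this should look like, see the ChangeLog
+section of the document:
+  "The Perfect Patch"
+      http://www.zip.com.au/~akpm/linux/patches/stuff/tpp.txt
+
+
+All of these things are sometimes very hard to do. It can take years to
+get every aspect right. It is a process of continual improvement that
+requires a lot of patience and determination. As long as you do not give
+up, you can do it. Many people have done it before, and each of them
+started exactly where you are now.
+
+
+---------------
+Thanks to Paolo Ciarrocchi, who allowed the "Development Process"
+section to be based on text he had written
+(http://linux.tar.bz/articles/2.6-development_process), and to Randy
+Dunlap and Gerrit Huizenga for completing the list of things you should
+and should not say. Also thanks to Pat Mochel, Hanna Linder, Randy
+Dunlap, Kay Sievers, Vojtech Pavlik, Jan Kara, Josh Boyer, Kees Cook,
+Andrew Morton, Andi Kleen, Vadim Lobanov, Jesper Juhl, Adrian Bunk, Keri
+Harris, Frans Pop, David A. Wheeler, Junio Hamano, Michael Kerrisk, and
+Alex Shepard for their review, comments, and contributions. Without
+their help, this document would not have been possible.
+
+
+
+English maintainer: Greg Kroah-Hartman <greg@kroah.com>
diff --git a/Documentation/zh_CN/stable_api_nonsense.txt b/Documentation/zh_CN/stable_api_nonsense.txt
new file mode 100644
index 000000000000..c26a27d1ee7d
--- /dev/null
+++ b/Documentation/zh_CN/stable_api_nonsense.txt
@@ -0,0 +1,157 @@
+Chinese translated version of Documentation/stable_api_nonsense.txt
+
+If you have any comment or update to the content, please contact the
+original document maintainer directly. However, if you have problem
+communicating in English you can also ask the Chinese maintainer for help.
+Contact the Chinese maintainer, if this translation is outdated or there
+is problem with translation.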
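+
+Maintainer: Greg Kroah-Hartman <greg@kroah.com>
+Chinese maintainer: TripleX Chung <zhongyu@18mail.cn>
+---------------------------------------------------------------------
+Chinese translation of Documentation/stable_api_nonsense.txt
+
+If you want to comment on or update the content of this document, please
+contact the maintainer of the original document directly. If you have
+difficulty communicating in English, you can also ask the Chinese
+maintainer for help. If this translation is out of date or has
+translation problems, please contact the Chinese maintainer.
+
+English maintainer: Greg Kroah-Hartman <greg@kroah.com>
+Chinese maintainer: 钟宇 TripleX Chung <zhongyu@18mail.cn>
+Chinese translator: 钟宇 TripleX Chung <zhongyu@18mail.cn>
+Chinese proofreader: 李阳 Li Yang <leoli@freescale.com>
+The text follows
+---------------------------------------------------------------------
+
+This document explains why Linux has neither a binary kernel interface
+nor a stable kernel interface. The kernel interface meant here is the
+interface inside the kernel, not the interface between the kernel and
+userspace. The kernel-to-userspace interface is the system call
+interface that application programs use. System calls have hardly
+changed over time and will not change in the future. I have some old
+programs that were built on a 0.9 or earlier kernel and that still work
+fine on Linux distributions running a 2.6 kernel. Users and application
+programmers can treat this interface as stable.
+
+
+Executive Summary
+-----------------
+
+You may think you want a stable kernel interface, but you do not realize
+that this is not what you actually want. What you really want is a
+stable, working driver, and you only get that if your driver is in the
+mainline kernel source tree. Doing so brings many other benefits as
+well, and those benefits are what have made Linux the robust, stable,
+and mature operating system that led you to choose it in the first
+place.
+
+
+Intro
+-----
+
+Only the odd people who write drivers need to worry about kernel
+interfaces changing. The vast majority of users neither see the kernel
+interface nor need to care about it.
+
+First of all, I am not going to discuss the legal issues around any
+kernel driver that is not GPL-licensed. That includes drivers that are
+closed source, have hidden source, are binary-only or wrapped in source
+code, or in any other form cannot release their source under the GPL. If
+you have legal questions, please consult a lawyer. I am only a
+programmer, so I will discuss only the technical issues here (this is
+not to make light of the legal issues; they are real and need constant
+attention).
+
+Since only technical issues are covered, there are two main topics:
+binary kernel interfaces and stable kernel source interfaces. The two
+depend on each other, so let us deal with the binary interface first.
+
+
+Binary Kernel Interface
+-----------------------
+If we had a stable kernel source interface, then a stable binary
+interface would naturally follow, right? Wrong. Consider the following
+facts about the Linux kernel:
+  - Depending on the version of the C compiler used, the alignment of
+    structures within kernel data structures will differ, and so will
+    the form in which functions are emitted (whether a function is
+    inlined is up to the compiler). How the functions are emitted is not
+    that important, but the alignment inside data structures is
+    critical.
+  - Depending on the kernel configuration options chosen, many things in
+    the kernel can change:
+      - the same structure can contain different members
+      - some functions may not be implemented at all (for example, when
+        the kernel is built without SMP support, some locking functions
+        are defined as empty functions)
+      - memory used by the kernel can be aligned in different ways,
+        depending on the configuration options
+  - Linux runs on processors of many different architectures. A binary
+    driver built for one architecture cannot run correctly on another.
+
+For one specific kernel, meeting these conditions is not hard: build the
+driver module with the same C compiler and the same kernel configuration
+options. That is perfectly adequate if you want to provide a driver for
+one specific release of one specific Linux distribution. But if you want
+to ship a driver for different releases of different distributions, you
+have to build against the kernel of every distribution with every set of
+configuration options, which is a nightmare. Note also that each Linux
+distribution ships several different Linux kernels, each optimized for
+different hardware types (there are many different processors, and
+different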
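+kernel configuration options). So every time you release a driver, you
+need to provide many different versions of the kernel module.
+
+Trust me, you will slowly go insane if you try to support this kind of
+release. I learned this the hard way a long time ago...
+
+
+Stable Kernel Source Interfaces
+-------------------------------
+
+This topic becomes endless when you talk to people who do not put their
+kernel driver into the mainline kernel source tree and yet want it to
+keep working with the latest kernels.
+
+Kernel development is continuous and fast paced, and it never slows
+down. Kernel developers find bugs in current interfaces, or find a
+better way to do things, and when they do, they quickly change the
+current interfaces. Changing an interface means that function names may
+change, structures may grow or shrink, and function parameters may
+change. When an interface is changed, every place in the kernel that
+uses it is fixed up at the same time, so that everything continues to
+work.
+
+As an example, the in-kernel USB driver interface has been rewritten at
+least three times over the lifetime of the USB subsystem. These rewrites
+solved the following problems:
+  - Data streams were changed from a synchronous model to an
+    asynchronous one. This reduced the complexity of a number of drivers
+    and increased the throughput of all USB drivers, so that almost all
+    USB devices can now run at their maximum speed.
+  - The way the USB core allocates data packet memory for USB drivers
+    was changed, so that all drivers now have to pass more parameters to
+    the USB core. This fixed a number of documented deadlocks.
+
+This is in stark contrast to some closed source operating systems, which
+have had to keep maintaining their old USB interfaces. That leaves open
+the possibility that new developers accidentally use the old interfaces
+and write code in improper ways, which in turn hurts the stability of
+the operating system.
+
+In the example above, all of the developers agreed that these were
+important changes, and under those conditions they were cheap to make.
+If Linux kept a stable kernel source interface, a new interface would
+have had to be created, and the old, broken one would have had to be
+maintained forever, creating extra work for the Linux USB developers.
+Since all Linux USB driver authors do their work on their own time,
+asking them to do extra work for free and for no gain is not a
+possibility.
+
+Security issues are very important for Linux. When a security issue is
+found, it is fixed within a short time. In many cases this means
+reworking some internal kernel interfaces so that the problem is ruled
+out at the root. When an interface is reworked, all drivers that use it
+must be fixed at the same time, to make sure that the security problem
+is fixed and cannot come back in the future. If the internal kernel
+interfaces were not allowed to change, it would be impossible to fix
+this kind of security problem or to be sure that it could not happen
+again.
+
+Developers clean up the kernel interfaces all the time. If nobody is
+using an interface any more, it is deleted. This keeps the kernel as
+small as possible and makes sure that every interface that might be used
+is tested as thoroughly as it can be (an interface that nobody uses
+cannot be tested well).
+
+
+What to do
+----------
+
+So if you have written a Linux kernel driver that is not in the Linux
+source tree, what are you, as a developer, supposed to do? Releasing a
+binary driver for every kernel version of every distribution is a
+nightmare, and keeping up with an ever-changing kernel interface is hard
+work too.
+
+It is simple: get your driver into the kernel source tree. (Remember
+that we are talking about GPL-licensed drivers here. If your code does
+not fall under the GPL, good luck, you are on your own with this
+problem, you leech <insert link to leech comment from Andrew and Linus
+here>.) Once your code is in the mainline kernel source tree, if a
+kernel interface changes, your driver is fixed up directly by the person
+who made the change. This ensures that your driver always builds and
+keeps working, with almost no effort on your part.
+
+Having your driver in the kernel source tree has many benefits:
+  - The quality of the driver goes up, while the maintenance cost (to
+    the original author) goes down.
+  - Other people add new features to your driver.
+  - Other people find and fix bugs in your driver.
+  - Other people find opportunities for performance tuning in your
+    driver.
+  - When a change to an external interface requires changes to the
+    driver, other people update it for you.
+  - The driver ships automatically with every Linux distribution,
+    without you having to contact any distributor.
+
+Compared with other operating systems, Linux supports more different
+devices out of the box, and supports them on more different processor
+architectures. This proven development model must be doing something
+right :)
+
+-------------
+Thanks to Randy Dunlap, Andrew Morton, David Brownell, Hanna Linder,
+Robert Love, and Nishanth Aravamudan for their review and comments on
+early drafts of this document.
+
+English maintainer: Greg Kroah-Hartman <greg@kroah.com>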
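
The binary kernel interface section (二进制内核接口) of stable_api_nonsense.txt above argues that the same structure can have different members, and therefore a different layout, depending on the kernel configuration. The sketch below illustrates that point outside the kernel. It is not kernel code: the structure names, their fields, and the idea that an "SMP" build adds a lock member are all hypothetical, and Python's ctypes is used only because it lays out fields with the platform's native C alignment rules.

```python
import ctypes


class DeviceUP(ctypes.Structure):
    """Hypothetical driver-visible structure as built WITHOUT SMP support."""
    _fields_ = [
        ("id", ctypes.c_char),
        ("counter", ctypes.c_long),
    ]


class DeviceSMP(ctypes.Structure):
    """The same hypothetical structure as built WITH SMP support:
    the configuration option adds a lock member in the middle."""
    _fields_ = [
        ("id", ctypes.c_char),
        ("lock", ctypes.c_long),     # stand-in for a spinlock member
        ("counter", ctypes.c_long),
    ]


def layout(struct_type):
    """Return ({field name: byte offset}, total size in bytes)."""
    offsets = {name: getattr(struct_type, name).offset
               for name, _ctype in struct_type._fields_}
    return offsets, ctypes.sizeof(struct_type)


if __name__ == "__main__":
    # Print both layouts side by side; exact numbers depend on the platform.
    for struct_type in (DeviceUP, DeviceSMP):
        offsets, size = layout(struct_type)
        print(struct_type.__name__, offsets, "size =", size)
```

A binary module compiled against the first layout would read `counter` at the wrong offset if it were loaded into a kernel built with the second layout, which is the kind of mismatch the text describes.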