From f59c55d04b43bd72df8efa692dd07224fe94d1ac Mon Sep 17 00:00:00 2001 From: Huang Ying Date: Tue, 7 Dec 2010 10:22:30 +0800 Subject: ACPI, APEI, Add APEI generic error status printing support In APEI, Hardware error information reported by firmware to Linux kernel is in the data structure of APEI generic error status (struct acpi_hes_generic_status). While now printk is used by Linux kernel to report hardware error information to user space. So, this patch adds printing support for the data structure, so that the corresponding hardware error information can be reported to user space via printk. PCIe AER information printing is not implemented yet. Will refactor the original PCIe AER information printing code to avoid code duplicating. The output format is as follow: := APEI generic hardware error status severity: , section: , severity: , flags:
fru_id: fru_text: section_type:
* := recoverable | fatal | corrected | info
# := [primary][, containment warning][, reset][, threshold exceeded]\ [, resource not accessible][, latent error]
:= generic processor error | memory error | \ PCIe error | unknown,
:= | | \ | := [processor_type: , ] [processor_isa: , ] [error_type: ] [operation: , ] [flags: ] [level: ] [version_info: ] [processor_id: ] [target_address: ] [requestor_id: ] [responder_id: ] [IP: ] * := IA32/X64 | IA64 * := IA32 | IA64 | X64 # := [cache error][, TLB error][, bus error][, micro-architectural error] * := unknown or generic | data read | data write | \ instruction execution # := [restartable][, precise IP][, overflow][, corrected] := [error_status: ] [physical_address: ] [physical_address_mask: ] [node: ] [card: ] [module: ] [bank: ] [device: ] [row: ] [column: ] [bit_position: ] [requestor_id: ] [responder_id: ] [target_id: ] [error_type: , ] * := unknown | no error | single-bit ECC | multi-bit ECC | \ single-symbol chipkill ECC | multi-symbol chipkill ECC | master abort | \ target abort | parity error | watchdog timeout | invalid address | \ mirror Broken | memory sparing | scrub corrected error | \ scrub uncorrected error := [port_type: , ] [version: .] [command: , status: ] [device_id: ::. slot: secondary_bus: vendor_id: , device_id: class_code: ] [serial number: , ] [bridge: secondary_status: , control: ] * := PCIe end point | legacy PCI end point | \ unknown | unknown | root port | upstream switch port | \ downstream switch port | PCIe to PCI/PCI-X bridge | \ PCI/PCI-X to PCIe bridge | root complex integrated endpoint device | \ root complex event collector Where, [] designate corresponding content is optional All description with * has the following format: field: , Where value of should be the position of "string" in description. Otherwise, will be "unknown". All description with # has the following format: field: Where each string in corresponding to one set bit of . The bit position is the position of "string" in description. For more detailed explanation of every field, please refer to UEFI specification version 2.3 or later, section Appendix N: Common Platform Error Record. Signed-off-by: Huang Ying Signed-off-by: Len Brown --- Documentation/acpi/apei/output_format.txt | 122 ++++++++++++++++++++++++++++++ 1 file changed, 122 insertions(+) create mode 100644 Documentation/acpi/apei/output_format.txt (limited to 'Documentation') diff --git a/Documentation/acpi/apei/output_format.txt b/Documentation/acpi/apei/output_format.txt new file mode 100644 index 000000000000..9146952c612a --- /dev/null +++ b/Documentation/acpi/apei/output_format.txt @@ -0,0 +1,122 @@ + APEI output format + ~~~~~~~~~~~~~~~~~~ + +APEI uses printk as hardware error reporting interface, the output +format is as follow. + + := +APEI generic hardware error status +severity: , +section: , severity: , +flags: +
+fru_id: +fru_text: +section_type:
+
+ +* := recoverable | fatal | corrected | info + +
# := +[primary][, containment warning][, reset][, threshold exceeded]\ +[, resource not accessible][, latent error] + +
:= generic processor error | memory error | \ +PCIe error | unknown, + +
:= + | | \ + | + + := +[processor_type: , ] +[processor_isa: , ] +[error_type: +] +[operation: , ] +[flags: +] +[level: ] +[version_info: ] +[processor_id: ] +[target_address: ] +[requestor_id: ] +[responder_id: ] +[IP: ] + +* := IA32/X64 | IA64 + +* := IA32 | IA64 | X64 + +# := +[cache error][, TLB error][, bus error][, micro-architectural error] + +* := unknown or generic | data read | data write | \ +instruction execution + +# := +[restartable][, precise IP][, overflow][, corrected] + + := +[error_status: ] +[physical_address: ] +[physical_address_mask: ] +[node: ] +[card: ] +[module: ] +[bank: ] +[device: ] +[row: ] +[column: ] +[bit_position: ] +[requestor_id: ] +[responder_id: ] +[target_id: ] +[error_type: , ] + +* := +unknown | no error | single-bit ECC | multi-bit ECC | \ +single-symbol chipkill ECC | multi-symbol chipkill ECC | master abort | \ +target abort | parity error | watchdog timeout | invalid address | \ +mirror Broken | memory sparing | scrub corrected error | \ +scrub uncorrected error + + := +[port_type: , ] +[version: .] +[command: , status: ] +[device_id: ::. +slot: +secondary_bus: +vendor_id: , device_id: +class_code: ] +[serial number: , ] +[bridge: secondary_status: , control: ] + +* := PCIe end point | legacy PCI end point | \ +unknown | unknown | root port | upstream switch port | \ +downstream switch port | PCIe to PCI/PCI-X bridge | \ +PCI/PCI-X to PCIe bridge | root complex integrated endpoint device | \ +root complex event collector + +Where, [] designate corresponding content is optional + +All description with * has the following format: + +field: , + +Where value of should be the position of "string" in description. Otherwise, will be "unknown". + +All description with # has the following format: + +field: + + +Where each string in corresponding to one set bit of +. The bit position is the position of "string" in description. + +For more detailed explanation of every field, please refer to UEFI +specification version 2.3 or later, section Appendix N: Common +Platform Error Record. -- cgit v1.2.3 From 677bd810eedce61edf15452491781ff046b92edc Mon Sep 17 00:00:00 2001 From: Zhang Rui Date: Mon, 6 Dec 2010 15:04:21 +0800 Subject: ACPI video: remove output switching control Remove the ACPI video output switching control as it never works. With the patch applied, ACPI video driver still catches the video output notification, but it does nothing but raises the notification to userspace. Signed-off-by: Zhang Rui Signed-off-by: Len Brown --- Documentation/kernel-parameters.txt | 5 --- drivers/acpi/video.c | 86 ------------------------------------- drivers/acpi/video_detect.c | 57 ++---------------------- drivers/gpu/drm/Kconfig | 1 - drivers/gpu/stub/Kconfig | 1 - 5 files changed, 4 insertions(+), 146 deletions(-) (limited to 'Documentation') diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index cdd2a6e8a3b7..f91fc5a65034 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -199,11 +199,6 @@ and is between 256 and 4096 characters. It is defined in the file unusable. The "log_buf_len" parameter may be useful if you need to capture more output. - acpi_display_output= [HW,ACPI] - acpi_display_output=vendor - acpi_display_output=video - See above. - acpi_irq_balance [HW,ACPI] ACPI will balance active IRQs default in APIC mode diff --git a/drivers/acpi/video.c b/drivers/acpi/video.c index 5cd0228d2daa..81766ebec843 100644 --- a/drivers/acpi/video.c +++ b/drivers/acpi/video.c @@ -33,7 +33,6 @@ #include #include #include -#include #include #include #include @@ -172,9 +171,6 @@ struct acpi_video_device_cap { u8 _BQC:1; /* Get current brightness level */ u8 _BCQ:1; /* Some buggy BIOS uses _BCQ instead of _BQC */ u8 _DDC:1; /*Return the EDID for this device */ - u8 _DCS:1; /*Return status of output device */ - u8 _DGS:1; /*Query graphics state */ - u8 _DSS:1; /*Device state set */ }; struct acpi_video_brightness_flags { @@ -202,7 +198,6 @@ struct acpi_video_device { struct acpi_video_device_brightness *brightness; struct backlight_device *backlight; struct thermal_cooling_device *cooling_dev; - struct output_device *output_dev; }; static const char device_decode[][30] = { @@ -226,10 +221,6 @@ static int acpi_video_get_next_level(struct acpi_video_device *device, u32 level_current, u32 event); static int acpi_video_switch_brightness(struct acpi_video_device *device, int event); -static int acpi_video_device_get_state(struct acpi_video_device *device, - unsigned long long *state); -static int acpi_video_output_get(struct output_device *od); -static int acpi_video_device_set_state(struct acpi_video_device *device, int state); /*backlight device sysfs support*/ static int acpi_video_get_brightness(struct backlight_device *bd) @@ -265,30 +256,6 @@ static struct backlight_ops acpi_backlight_ops = { .update_status = acpi_video_set_brightness, }; -/*video output device sysfs support*/ -static int acpi_video_output_get(struct output_device *od) -{ - unsigned long long state; - struct acpi_video_device *vd = - (struct acpi_video_device *)dev_get_drvdata(&od->dev); - acpi_video_device_get_state(vd, &state); - return (int)state; -} - -static int acpi_video_output_set(struct output_device *od) -{ - unsigned long state = od->request_state; - struct acpi_video_device *vd= - (struct acpi_video_device *)dev_get_drvdata(&od->dev); - return acpi_video_device_set_state(vd, state); -} - -static struct output_properties acpi_output_properties = { - .set_state = acpi_video_output_set, - .get_status = acpi_video_output_get, -}; - - /* thermal cooling device callbacks */ static int video_get_max_state(struct thermal_cooling_device *cooling_dev, unsigned long *state) @@ -344,34 +311,6 @@ static struct thermal_cooling_device_ops video_cooling_ops = { Video Management -------------------------------------------------------------------------- */ -/* device */ - -static int -acpi_video_device_get_state(struct acpi_video_device *device, - unsigned long long *state) -{ - int status; - - status = acpi_evaluate_integer(device->dev->handle, "_DCS", NULL, state); - - return status; -} - -static int -acpi_video_device_set_state(struct acpi_video_device *device, int state) -{ - int status; - union acpi_object arg0 = { ACPI_TYPE_INTEGER }; - struct acpi_object_list args = { 1, &arg0 }; - unsigned long long ret; - - - arg0.integer.value = state; - status = acpi_evaluate_integer(device->dev->handle, "_DSS", &args, &ret); - - return status; -} - static int acpi_video_device_lcd_query_levels(struct acpi_video_device *device, union acpi_object **levels) @@ -831,15 +770,6 @@ static void acpi_video_device_find_cap(struct acpi_video_device *device) if (ACPI_SUCCESS(acpi_get_handle(device->dev->handle, "_DDC", &h_dummy1))) { device->cap._DDC = 1; } - if (ACPI_SUCCESS(acpi_get_handle(device->dev->handle, "_DCS", &h_dummy1))) { - device->cap._DCS = 1; - } - if (ACPI_SUCCESS(acpi_get_handle(device->dev->handle, "_DGS", &h_dummy1))) { - device->cap._DGS = 1; - } - if (ACPI_SUCCESS(acpi_get_handle(device->dev->handle, "_DSS", &h_dummy1))) { - device->cap._DSS = 1; - } if (acpi_video_backlight_support()) { struct backlight_properties props; @@ -904,21 +834,6 @@ static void acpi_video_device_find_cap(struct acpi_video_device *device) printk(KERN_ERR PREFIX "Create sysfs link\n"); } - - if (acpi_video_display_switch_support()) { - - if (device->cap._DCS && device->cap._DSS) { - static int count; - char *name; - name = kasprintf(GFP_KERNEL, "acpi_video%d", count); - if (!name) - return; - count++; - device->output_dev = video_output_register(name, - NULL, device, &acpi_output_properties); - kfree(name); - } - } } /* @@ -1452,7 +1367,6 @@ static int acpi_video_bus_put_one_device(struct acpi_video_device *device) thermal_cooling_device_unregister(device->cooling_dev); device->cooling_dev = NULL; } - video_output_unregister(device->output_dev); return 0; } diff --git a/drivers/acpi/video_detect.c b/drivers/acpi/video_detect.c index b83676126598..42d3d72dae85 100644 --- a/drivers/acpi/video_detect.c +++ b/drivers/acpi/video_detect.c @@ -17,15 +17,14 @@ * capabilities the graphics cards plugged in support. The check for general * video capabilities will be triggered by the first caller of * acpi_video_get_capabilities(NULL); which will happen when the first - * backlight (or display output) switching supporting driver calls: + * backlight switching supporting driver calls: * acpi_video_backlight_support(); * * Depending on whether ACPI graphics extensions (cmp. ACPI spec Appendix B) * are available, video.ko should be used to handle the device. * * Otherwise vendor specific drivers like thinkpad_acpi, asus_acpi, - * sony_acpi,... can take care about backlight brightness and display output - * switching. + * sony_acpi,... can take care about backlight brightness. * * If CONFIG_ACPI_VIDEO is neither set as "compiled in" (y) nor as a module (m) * this file will not be compiled, acpi_video_get_capabilities() and @@ -83,11 +82,6 @@ long acpi_is_video_device(struct acpi_device *device) if (!device) return 0; - /* Is this device able to support video switching ? */ - if (ACPI_SUCCESS(acpi_get_handle(device->handle, "_DOD", &h_dummy)) || - ACPI_SUCCESS(acpi_get_handle(device->handle, "_DOS", &h_dummy))) - video_caps |= ACPI_VIDEO_OUTPUT_SWITCHING; - /* Is this device able to retrieve a video ROM ? */ if (ACPI_SUCCESS(acpi_get_handle(device->handle, "_ROM", &h_dummy))) video_caps |= ACPI_VIDEO_ROM_AVAILABLE; @@ -161,8 +155,6 @@ long acpi_video_get_capabilities(acpi_handle graphics_handle) * * if (dmi_name_in_vendors("XY")) { * acpi_video_support |= - * ACPI_VIDEO_OUTPUT_SWITCHING_DMI_VENDOR; - * acpi_video_support |= * ACPI_VIDEO_BACKLIGHT_DMI_VENDOR; *} */ @@ -212,33 +204,8 @@ int acpi_video_backlight_support(void) EXPORT_SYMBOL(acpi_video_backlight_support); /* - * Returns true if video.ko can do display output switching. - * This does not work well/at all with binary graphics drivers - * which disable system io ranges and do it on their own. - */ -int acpi_video_display_switch_support(void) -{ - if (!acpi_video_caps_checked) - acpi_video_get_capabilities(NULL); - - if (acpi_video_support & ACPI_VIDEO_OUTPUT_SWITCHING_FORCE_VENDOR) - return 0; - else if (acpi_video_support & ACPI_VIDEO_OUTPUT_SWITCHING_FORCE_VIDEO) - return 1; - - if (acpi_video_support & ACPI_VIDEO_OUTPUT_SWITCHING_DMI_VENDOR) - return 0; - else if (acpi_video_support & ACPI_VIDEO_OUTPUT_SWITCHING_DMI_VIDEO) - return 1; - - return acpi_video_support & ACPI_VIDEO_OUTPUT_SWITCHING; -} -EXPORT_SYMBOL(acpi_video_display_switch_support); - -/* - * Use acpi_display_output=vendor/video or acpi_backlight=vendor/video - * To force that backlight or display output switching is processed by vendor - * specific acpi drivers or video.ko driver. + * Use acpi_backlight=vendor/video to force that backlight switching + * is processed by vendor specific acpi drivers or video.ko driver. */ static int __init acpi_backlight(char *str) { @@ -255,19 +222,3 @@ static int __init acpi_backlight(char *str) return 1; } __setup("acpi_backlight=", acpi_backlight); - -static int __init acpi_display_output(char *str) -{ - if (str == NULL || *str == '\0') - return 1; - else { - if (!strcmp("vendor", str)) - acpi_video_support |= - ACPI_VIDEO_OUTPUT_SWITCHING_FORCE_VENDOR; - if (!strcmp("video", str)) - acpi_video_support |= - ACPI_VIDEO_OUTPUT_SWITCHING_FORCE_VIDEO; - } - return 1; -} -__setup("acpi_display_output=", acpi_display_output); diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig index 7af443672626..64828a7db77b 100644 --- a/drivers/gpu/drm/Kconfig +++ b/drivers/gpu/drm/Kconfig @@ -107,7 +107,6 @@ config DRM_I915 select FB_CFB_IMAGEBLIT # i915 depends on ACPI_VIDEO when ACPI is enabled # but for select to work, need to select ACPI_VIDEO's dependencies, ick - select VIDEO_OUTPUT_CONTROL if ACPI select BACKLIGHT_CLASS_DEVICE if ACPI select INPUT if ACPI select ACPI_VIDEO if ACPI diff --git a/drivers/gpu/stub/Kconfig b/drivers/gpu/stub/Kconfig index 0e1edd7311ff..09aea5f1556d 100644 --- a/drivers/gpu/stub/Kconfig +++ b/drivers/gpu/stub/Kconfig @@ -3,7 +3,6 @@ config STUB_POULSBO depends on PCI # Poulsbo stub depends on ACPI_VIDEO when ACPI is enabled # but for select to work, need to select ACPI_VIDEO's dependencies, ick - select VIDEO_OUTPUT_CONTROL if ACPI select BACKLIGHT_CLASS_DEVICE if ACPI select INPUT if ACPI select ACPI_VIDEO if ACPI -- cgit v1.2.3 From 37bf501bdda1d5d6ea73ce29d4b00d291b6f3811 Mon Sep 17 00:00:00 2001 From: Zhao Yakui Date: Wed, 8 Dec 2010 10:10:17 +0800 Subject: IPMI: Add the document description of ipmi_get_smi_info Add the document description about how to use ipmi_get_smi_info. Signed-off-by: Zhao Yakui Signed-off-by: Corey Minyard Signed-off-by: Len Brown --- Documentation/IPMI.txt | 27 +++++++++++++++++++++++++++ 1 file changed, 27 insertions(+) (limited to 'Documentation') diff --git a/Documentation/IPMI.txt b/Documentation/IPMI.txt index 69dd29ed824e..b2bea15137d2 100644 --- a/Documentation/IPMI.txt +++ b/Documentation/IPMI.txt @@ -533,6 +533,33 @@ completion during sending a panic event. Other Pieces ------------ +Get the detailed info related with the IPMI device +-------------------------------------------------- + +Some users need more detailed information about a device, like where +the address came from or the raw base device for the IPMI interface. +You can use the IPMI smi_watcher to catch the IPMI interfaces as they +come or go, and to grab the information, you can use the function +ipmi_get_smi_info(), which returns the following structure: + +struct ipmi_smi_info { + enum ipmi_addr_src addr_src; + struct device *dev; + union { + struct { + void *acpi_handle; + } acpi_info; + } addr_info; +}; + +Currently special info for only for SI_ACPI address sources is +returned. Others may be added as necessary. + +Note that the dev pointer is included in the above structure, and +assuming ipmi_smi_get_info returns success, you must call put_device +on the dev pointer. + + Watchdog -------- -- cgit v1.2.3 From e63eb9375089f9d2041305d04c3f33a194e0e014 Mon Sep 17 00:00:00 2001 From: "J. Bruce Fields" Date: Sat, 30 Oct 2010 17:41:26 -0400 Subject: nfsd4: eliminate lease delete callback nfsd controls the lifetime of the lease, not the lock code, so there's no need for this callback on lease destruction. Signed-off-by: J. Bruce Fields --- Documentation/filesystems/Locking | 2 -- fs/nfsd/nfs4state.c | 18 ------------------ 2 files changed, 20 deletions(-) (limited to 'Documentation') diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking index b6426f15b4ae..075be12bc531 100644 --- a/Documentation/filesystems/Locking +++ b/Documentation/filesystems/Locking @@ -327,14 +327,12 @@ fl_release_private: yes yes prototypes: int (*fl_compare_owner)(struct file_lock *, struct file_lock *); void (*fl_notify)(struct file_lock *); /* unblock callback */ - void (*fl_release_private)(struct file_lock *); void (*fl_break)(struct file_lock *); /* break_lease callback */ locking rules: BKL may block fl_compare_owner: yes no fl_notify: yes no -fl_release_private: yes yes fl_break: yes no Currently only NFSD and NLM provide instances of this class. None of the diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c index b82e36862044..2e44ad2539ab 100644 --- a/fs/nfsd/nfs4state.c +++ b/fs/nfsd/nfs4state.c @@ -2295,23 +2295,6 @@ void nfsd_break_deleg_cb(struct file_lock *fl) nfsd4_cb_recall(dp); } -/* - * The file_lock is being reapd. - * - * Called by locks_free_lock() with lock_flocks() held. - */ -static -void nfsd_release_deleg_cb(struct file_lock *fl) -{ - struct nfs4_delegation *dp = (struct nfs4_delegation *)fl->fl_owner; - - dprintk("NFSD nfsd_release_deleg_cb: fl %p dp %p dl_count %d\n", fl,dp, atomic_read(&dp->dl_count)); - - if (!(fl->fl_flags & FL_LEASE) || !dp) - return; - dp->dl_flock = NULL; -} - /* * Called from setlease() with lock_flocks() held */ @@ -2341,7 +2324,6 @@ int nfsd_change_deleg_cb(struct file_lock **onlist, int arg) static const struct lock_manager_operations nfsd_lease_mng_ops = { .fl_break = nfsd_break_deleg_cb, - .fl_release_private = nfsd_release_deleg_cb, .fl_mylease = nfsd_same_client_deleg_cb, .fl_change = nfsd_change_deleg_cb, }; -- cgit v1.2.3 From 016134eee334d51262f10ce3261976ea40a57878 Mon Sep 17 00:00:00 2001 From: Johannes Berg Date: Thu, 6 Jan 2011 22:36:47 +0100 Subject: mac80211: add doc short section on LED triggers Just create a section to collect the LED trigger functions and add a very short description as to what drivers should do. Signed-off-by: Johannes Berg Signed-off-by: John W. Linville --- Documentation/DocBook/80211.tmpl | 21 +++++++++++++++++---- 1 file changed, 17 insertions(+), 4 deletions(-) (limited to 'Documentation') diff --git a/Documentation/DocBook/80211.tmpl b/Documentation/DocBook/80211.tmpl index 03641a08e275..8906648f962b 100644 --- a/Documentation/DocBook/80211.tmpl +++ b/Documentation/DocBook/80211.tmpl @@ -268,10 +268,6 @@ !Finclude/net/mac80211.h ieee80211_ops !Finclude/net/mac80211.h ieee80211_alloc_hw !Finclude/net/mac80211.h ieee80211_register_hw -!Finclude/net/mac80211.h ieee80211_get_tx_led_name -!Finclude/net/mac80211.h ieee80211_get_rx_led_name -!Finclude/net/mac80211.h ieee80211_get_assoc_led_name -!Finclude/net/mac80211.h ieee80211_get_radio_led_name !Finclude/net/mac80211.h ieee80211_unregister_hw !Finclude/net/mac80211.h ieee80211_free_hw @@ -382,6 +378,23 @@ + + LED support + + Mac80211 supports various ways of blinking LEDs. Wherever possible, + device LEDs should be exposed as LED class devices and hooked up to + the appropriate trigger, which will then be triggered appropriately + by mac80211. + +!Finclude/net/mac80211.h ieee80211_get_tx_led_name +!Finclude/net/mac80211.h ieee80211_get_rx_led_name +!Finclude/net/mac80211.h ieee80211_get_assoc_led_name +!Finclude/net/mac80211.h ieee80211_get_radio_led_name +!Finclude/net/mac80211.h ieee80211_tpt_blink +!Finclude/net/mac80211.h ieee80211_tpt_led_trigger_flags +!Finclude/net/mac80211.h ieee80211_create_tpt_led_trigger + + Hardware crypto acceleration !Pinclude/net/mac80211.h Hardware crypto acceleration -- cgit v1.2.3 From 8c3df3e58cde676de4df9363c00f37dbfce08e5c Mon Sep 17 00:00:00 2001 From: "J. Bruce Fields" Date: Tue, 11 Jan 2011 10:12:37 -0500 Subject: Documentation: don't remove lock manager fl_release_private I thought I'd removed the last user of this, but missed fs/locks.c:lease_release_private_callback(). Thanks to Christoph for pointing this out. Signed-off-by: J. Bruce Fields --- Documentation/filesystems/Locking | 2 ++ 1 file changed, 2 insertions(+) (limited to 'Documentation') diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking index 075be12bc531..b6426f15b4ae 100644 --- a/Documentation/filesystems/Locking +++ b/Documentation/filesystems/Locking @@ -327,12 +327,14 @@ fl_release_private: yes yes prototypes: int (*fl_compare_owner)(struct file_lock *, struct file_lock *); void (*fl_notify)(struct file_lock *); /* unblock callback */ + void (*fl_release_private)(struct file_lock *); void (*fl_break)(struct file_lock *); /* break_lease callback */ locking rules: BKL may block fl_compare_owner: yes no fl_notify: yes no +fl_release_private: yes yes fl_break: yes no Currently only NFSD and NLM provide instances of this class. None of the -- cgit v1.2.3 From ec26fba40fa92c7cc5c61d40746f499dcefc67be Mon Sep 17 00:00:00 2001 From: "J. Bruce Fields" Date: Tue, 11 Jan 2011 10:30:48 -0500 Subject: Documentation: fl_mylease no longer exists Signed-off-by: J. Bruce Fields --- Documentation/filesystems/Locking | 2 -- 1 file changed, 2 deletions(-) (limited to 'Documentation') diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking index 33fa3e5d38fd..01d7cbdcd3bd 100644 --- a/Documentation/filesystems/Locking +++ b/Documentation/filesystems/Locking @@ -340,7 +340,6 @@ prototypes: int (*fl_grant)(struct file_lock *, struct file_lock *, int); void (*fl_release_private)(struct file_lock *); void (*fl_break)(struct file_lock *); /* break_lease callback */ - int (*fl_mylease)(struct file_lock *, struct file_lock *); int (*fl_change)(struct file_lock **, int); locking rules: @@ -350,7 +349,6 @@ fl_notify: yes no fl_grant: no no fl_release_private: maybe no fl_break: yes no -fl_mylease: yes no fl_change yes no --------------------------- buffer_head ----------------------------------- -- cgit v1.2.3 From 4cb18728709683c91a5f6f8d5f337bfb498b089a Mon Sep 17 00:00:00 2001 From: "R.Durgadoss" Date: Wed, 27 Oct 2010 03:33:29 +0530 Subject: thermal: Add event notification to thermal framework This patch adds event notification support to the generic thermal sysfs framework in the kernel. The notification is in the form of a netlink event. Signed-off-by: R.Durgadoss Signed-off-by: Len Brown --- Documentation/ABI/stable/thermal-notification | 4 + Documentation/thermal/sysfs-api.txt | 12 +++ drivers/thermal/Kconfig | 1 + drivers/thermal/thermal_sys.c | 103 +++++++++++++++++++++++++- include/linux/thermal.h | 32 ++++++++ 5 files changed, 151 insertions(+), 1 deletion(-) create mode 100644 Documentation/ABI/stable/thermal-notification (limited to 'Documentation') diff --git a/Documentation/ABI/stable/thermal-notification b/Documentation/ABI/stable/thermal-notification new file mode 100644 index 000000000000..9723e8b7aeb3 --- /dev/null +++ b/Documentation/ABI/stable/thermal-notification @@ -0,0 +1,4 @@ +What: A notification mechanism for thermal related events +Description: + This interface enables notification for thermal related events. + The notification is in the form of a netlink event. diff --git a/Documentation/thermal/sysfs-api.txt b/Documentation/thermal/sysfs-api.txt index cb3d15bc1aeb..b61e46f449aa 100644 --- a/Documentation/thermal/sysfs-api.txt +++ b/Documentation/thermal/sysfs-api.txt @@ -278,3 +278,15 @@ method, the sys I/F structure will be built like this: |---name: acpitz |---temp1_input: 37000 |---temp1_crit: 100000 + +4. Event Notification + +The framework includes a simple notification mechanism, in the form of a +netlink event. Netlink socket initialization is done during the _init_ +of the framework. Drivers which intend to use the notification mechanism +just need to call generate_netlink_event() with two arguments viz +(originator, event). Typically the originator will be an integer assigned +to a thermal_zone_device when it registers itself with the framework. The +event will be one of:{THERMAL_AUX0, THERMAL_AUX1, THERMAL_CRITICAL, +THERMAL_DEV_FAULT}. Notification can be sent when the current temperature +crosses any of the configured thresholds. diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig index bf7c687519ef..f7a5dba3ca23 100644 --- a/drivers/thermal/Kconfig +++ b/drivers/thermal/Kconfig @@ -4,6 +4,7 @@ menuconfig THERMAL tristate "Generic Thermal sysfs driver" + depends on NET help Generic Thermal Sysfs driver offers a generic mechanism for thermal management. Usually it's made up of one or more thermal diff --git a/drivers/thermal/thermal_sys.c b/drivers/thermal/thermal_sys.c index 13c72c629329..760e045c93c8 100644 --- a/drivers/thermal/thermal_sys.c +++ b/drivers/thermal/thermal_sys.c @@ -32,6 +32,8 @@ #include #include #include +#include +#include MODULE_AUTHOR("Zhang Rui"); MODULE_DESCRIPTION("Generic thermal management sysfs support"); @@ -58,6 +60,22 @@ static LIST_HEAD(thermal_tz_list); static LIST_HEAD(thermal_cdev_list); static DEFINE_MUTEX(thermal_list_lock); +static unsigned int thermal_event_seqnum; + +static struct genl_family thermal_event_genl_family = { + .id = GENL_ID_GENERATE, + .name = THERMAL_GENL_FAMILY_NAME, + .version = THERMAL_GENL_VERSION, + .maxattr = THERMAL_GENL_ATTR_MAX, +}; + +static struct genl_multicast_group thermal_event_mcgrp = { + .name = THERMAL_GENL_MCAST_GROUP_NAME, +}; + +static int genetlink_init(void); +static void genetlink_exit(void); + static int get_idr(struct idr *idr, struct mutex *lock, int *id) { int err; @@ -1214,6 +1232,82 @@ void thermal_zone_device_unregister(struct thermal_zone_device *tz) EXPORT_SYMBOL(thermal_zone_device_unregister); +int generate_netlink_event(u32 orig, enum events event) +{ + struct sk_buff *skb; + struct nlattr *attr; + struct thermal_genl_event *thermal_event; + void *msg_header; + int size; + int result; + + /* allocate memory */ + size = nla_total_size(sizeof(struct thermal_genl_event)) + \ + nla_total_size(0); + + skb = genlmsg_new(size, GFP_ATOMIC); + if (!skb) + return -ENOMEM; + + /* add the genetlink message header */ + msg_header = genlmsg_put(skb, 0, thermal_event_seqnum++, + &thermal_event_genl_family, 0, + THERMAL_GENL_CMD_EVENT); + if (!msg_header) { + nlmsg_free(skb); + return -ENOMEM; + } + + /* fill the data */ + attr = nla_reserve(skb, THERMAL_GENL_ATTR_EVENT, \ + sizeof(struct thermal_genl_event)); + + if (!attr) { + nlmsg_free(skb); + return -EINVAL; + } + + thermal_event = nla_data(attr); + if (!thermal_event) { + nlmsg_free(skb); + return -EINVAL; + } + + memset(thermal_event, 0, sizeof(struct thermal_genl_event)); + + thermal_event->orig = orig; + thermal_event->event = event; + + /* send multicast genetlink message */ + result = genlmsg_end(skb, msg_header); + if (result < 0) { + nlmsg_free(skb); + return result; + } + + result = genlmsg_multicast(skb, 0, thermal_event_mcgrp.id, GFP_ATOMIC); + if (result) + printk(KERN_INFO "failed to send netlink event:%d", result); + + return result; +} +EXPORT_SYMBOL(generate_netlink_event); + +static int genetlink_init(void) +{ + int result; + + result = genl_register_family(&thermal_event_genl_family); + if (result) + return result; + + result = genl_register_mc_group(&thermal_event_genl_family, + &thermal_event_mcgrp); + if (result) + genl_unregister_family(&thermal_event_genl_family); + return result; +} + static int __init thermal_init(void) { int result = 0; @@ -1225,9 +1319,15 @@ static int __init thermal_init(void) mutex_destroy(&thermal_idr_lock); mutex_destroy(&thermal_list_lock); } + result = genetlink_init(); return result; } +static void genetlink_exit(void) +{ + genl_unregister_family(&thermal_event_genl_family); +} + static void __exit thermal_exit(void) { class_unregister(&thermal_class); @@ -1235,7 +1335,8 @@ static void __exit thermal_exit(void) idr_destroy(&thermal_cdev_idr); mutex_destroy(&thermal_idr_lock); mutex_destroy(&thermal_list_lock); + genetlink_exit(); } -subsys_initcall(thermal_init); +fs_initcall(thermal_init); module_exit(thermal_exit); diff --git a/include/linux/thermal.h b/include/linux/thermal.h index 1de8b9eb841b..784644961279 100644 --- a/include/linux/thermal.h +++ b/include/linux/thermal.h @@ -127,6 +127,37 @@ struct thermal_zone_device { struct thermal_hwmon_attr temp_crit; /* hwmon sys attr */ #endif }; +/* Adding event notification support elements */ +#define THERMAL_GENL_FAMILY_NAME "thermal_event" +#define THERMAL_GENL_VERSION 0x01 +#define THERMAL_GENL_MCAST_GROUP_NAME "thermal_mc_group" + +enum events { + THERMAL_AUX0, + THERMAL_AUX1, + THERMAL_CRITICAL, + THERMAL_DEV_FAULT, +}; + +struct thermal_genl_event { + u32 orig; + enum events event; +}; +/* attributes of thermal_genl_family */ +enum { + THERMAL_GENL_ATTR_UNSPEC, + THERMAL_GENL_ATTR_EVENT, + __THERMAL_GENL_ATTR_MAX, +}; +#define THERMAL_GENL_ATTR_MAX (__THERMAL_GENL_ATTR_MAX - 1) + +/* commands supported by the thermal_genl_family */ +enum { + THERMAL_GENL_CMD_UNSPEC, + THERMAL_GENL_CMD_EVENT, + __THERMAL_GENL_CMD_MAX, +}; +#define THERMAL_GENL_CMD_MAX (__THERMAL_GENL_CMD_MAX - 1) struct thermal_zone_device *thermal_zone_device_register(char *, int, void *, struct @@ -146,5 +177,6 @@ struct thermal_cooling_device *thermal_cooling_device_register(char *, void *, thermal_cooling_device_ops *); void thermal_cooling_device_unregister(struct thermal_cooling_device *); +extern int generate_netlink_event(u32 orig, enum events event); #endif /* __THERMAL_H__ */ -- cgit v1.2.3 From 6d855fcdd24d2491455527c4999b4d04363f1980 Mon Sep 17 00:00:00 2001 From: Zhang Rui Date: Mon, 10 Jan 2011 11:16:30 +0800 Subject: ACPI: delete CONFIG_ACPI_PROCFS_POWER and power procfs I/F in 2.6.39 sysfs I/F for ACPI power devices, including AC and Battery, has been working in upstream kenrel since 2.6.24, Sep 2007. In 2.6.37, we made the sysfs I/F always built in and this option disabled by default. Now, we plan to remove this option and the ACPI power procfs interface in 2.6.39. First, update the feature-removal-schedule to announce this change. Second, add runtime warnings in ACPI AC/Battery/SBS driver, so that users will notice this change even if "make oldconfig" is used. Signed-off-by: Zhang Rui Signed-off-by: Len Brown --- Documentation/feature-removal-schedule.txt | 11 +++++++++++ drivers/acpi/Kconfig | 2 ++ drivers/acpi/ac.c | 3 ++- drivers/acpi/battery.c | 2 ++ drivers/acpi/sbs.c | 2 ++ 5 files changed, 19 insertions(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt index 6c2f55e05f13..f281532a15ce 100644 --- a/Documentation/feature-removal-schedule.txt +++ b/Documentation/feature-removal-schedule.txt @@ -232,6 +232,17 @@ Who: Zhang Rui --------------------------- +What: CONFIG_ACPI_PROCFS_POWER +When: 2.6.39 +Why: sysfs I/F for ACPI power devices, including AC and Battery, + has been working in upstream kenrel since 2.6.24, Sep 2007. + In 2.6.37, we make the sysfs I/F always built in and this option + disabled by default. + Remove this option and the ACPI power procfs interface in 2.6.39. +Who: Zhang Rui + +--------------------------- + What: /proc/acpi/button When: August 2007 Why: /proc/acpi/button has been replaced by events to the input layer diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig index 5959077be0a4..788e88eb18ec 100644 --- a/drivers/acpi/Kconfig +++ b/drivers/acpi/Kconfig @@ -70,6 +70,8 @@ config ACPI_PROCFS_POWER /proc/acpi/ac_adapter/* (sys/class/power_supply/*) This option has no effect on /proc/acpi/ directories and functions, which do not yet exist in /sys + This option, together with the proc directories, will be + deleted in 2.6.39. Say N to delete power /proc/acpi/ directories that have moved to /sys/ diff --git a/drivers/acpi/ac.c b/drivers/acpi/ac.c index ba9afeaa23ac..f441e92481f6 100644 --- a/drivers/acpi/ac.c +++ b/drivers/acpi/ac.c @@ -185,7 +185,8 @@ static int acpi_ac_add_fs(struct acpi_device *device) { struct proc_dir_entry *entry = NULL; - + printk(KERN_WARNING PREFIX "Deprecated procfs I/F for AC is loaded," + " please retry with CONFIG_ACPI_PROCFS_POWER cleared\n"); if (!acpi_device_dir(device)) { acpi_device_dir(device) = proc_mkdir(acpi_device_bid(device), acpi_ac_dir); diff --git a/drivers/acpi/battery.c b/drivers/acpi/battery.c index 95649d373071..2a31421e0d75 100644 --- a/drivers/acpi/battery.c +++ b/drivers/acpi/battery.c @@ -868,6 +868,8 @@ static int acpi_battery_add_fs(struct acpi_device *device) struct proc_dir_entry *entry = NULL; int i; + printk(KERN_WARNING PREFIX "Deprecated procfs I/F for battery is loaded," + " please retry with CONFIG_ACPI_PROCFS_POWER cleared\n"); if (!acpi_device_dir(device)) { acpi_device_dir(device) = proc_mkdir(acpi_device_bid(device), acpi_battery_dir); diff --git a/drivers/acpi/sbs.c b/drivers/acpi/sbs.c index e5dbedb16bbf..51ae3794ec7f 100644 --- a/drivers/acpi/sbs.c +++ b/drivers/acpi/sbs.c @@ -484,6 +484,8 @@ acpi_sbs_add_fs(struct proc_dir_entry **dir, const struct file_operations *state_fops, const struct file_operations *alarm_fops, void *data) { + printk(KERN_WARNING PREFIX "Deprecated procfs I/F for SBS is loaded," + " please retry with CONFIG_ACPI_PROCFS_POWER cleared\n"); if (!*dir) { *dir = proc_mkdir(dir_name, parent_dir); if (!*dir) { -- cgit v1.2.3 From d1f9642381847e2b94caa34c3533211cf36ffcf4 Mon Sep 17 00:00:00 2001 From: Milan Broz Date: Thu, 13 Jan 2011 19:59:54 +0000 Subject: dm crypt: add multi key capability This patch adds generic multikey handling to be used in following patch for Loop-AES mode compatibility. This patch extends mapping table to optional keycount and implements generic multi-key capability. With more keys defined the string is divided into several sections and these are used for tfms. The tfm is used according to sector offset (sector 0->tfm[0], sector 1->tfm[1], sector N->tfm[N modulo keycount]) (only power of two values supported for keycount here). Because of tfms per-cpu allocation, this mode can be take a lot of memory on large smp systems. Signed-off-by: Milan Broz Signed-off-by: Alasdair G Kergon Cc: Max Vozeler --- Documentation/device-mapper/dm-crypt.txt | 7 ++- drivers/md/dm-crypt.c | 85 ++++++++++++++++++++++++-------- 2 files changed, 70 insertions(+), 22 deletions(-) (limited to 'Documentation') diff --git a/Documentation/device-mapper/dm-crypt.txt b/Documentation/device-mapper/dm-crypt.txt index 524de926290d..59293ac4a5d0 100644 --- a/Documentation/device-mapper/dm-crypt.txt +++ b/Documentation/device-mapper/dm-crypt.txt @@ -8,7 +8,7 @@ Parameters: Encryption cipher and an optional IV generation mode. - (In format cipher-chainmode-ivopts:ivmode). + (In format cipher[:keycount]-chainmode-ivopts:ivmode). Examples: des aes-cbc-essiv:sha256 @@ -20,6 +20,11 @@ Parameters: Key used for encryption. It is encoded as a hexadecimal number. You can only use key sizes that are valid for the selected cipher. + + Multi-key compatibility mode. You can define keys and + then sectors are encrypted according to their offsets (sector 0 uses key0; + sector 1 uses key1 etc.). must be a power of two. + The IV offset is a sector count that is added to the sector number before creating the IV. diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c index e0ebe685be6a..b8b9267c4dbb 100644 --- a/drivers/md/dm-crypt.c +++ b/drivers/md/dm-crypt.c @@ -101,10 +101,9 @@ enum flags { DM_CRYPT_SUSPENDED, DM_CRYPT_KEY_VALID }; */ struct crypt_cpu { struct ablkcipher_request *req; - struct crypto_ablkcipher *tfm; - /* ESSIV: struct crypto_cipher *essiv_tfm */ void *iv_private; + struct crypto_ablkcipher *tfms[0]; }; /* @@ -143,6 +142,7 @@ struct crypt_config { * per_cpu_ptr() only. */ struct crypt_cpu __percpu *cpu; + unsigned tfms_count; /* * Layout of each crypto request: @@ -161,6 +161,7 @@ struct crypt_config { unsigned long flags; unsigned int key_size; + unsigned int key_parts; u8 key[0]; }; @@ -184,7 +185,7 @@ static struct crypt_cpu *this_crypt_config(struct crypt_config *cc) */ static struct crypto_ablkcipher *any_tfm(struct crypt_config *cc) { - return __this_cpu_ptr(cc->cpu)->tfm; + return __this_cpu_ptr(cc->cpu)->tfms[0]; } /* @@ -567,11 +568,12 @@ static void crypt_alloc_req(struct crypt_config *cc, struct convert_context *ctx) { struct crypt_cpu *this_cc = this_crypt_config(cc); + unsigned key_index = ctx->sector & (cc->tfms_count - 1); if (!this_cc->req) this_cc->req = mempool_alloc(cc->req_pool, GFP_NOIO); - ablkcipher_request_set_tfm(this_cc->req, this_cc->tfm); + ablkcipher_request_set_tfm(this_cc->req, this_cc->tfms[key_index]); ablkcipher_request_set_callback(this_cc->req, CRYPTO_TFM_REQ_MAY_BACKLOG | CRYPTO_TFM_REQ_MAY_SLEEP, kcryptd_async_done, dmreq_of_req(cc, this_cc->req)); @@ -1097,15 +1099,48 @@ static void crypt_encode_key(char *hex, u8 *key, unsigned int size) } } +static void crypt_free_tfms(struct crypt_config *cc, int cpu) +{ + struct crypt_cpu *cpu_cc = per_cpu_ptr(cc->cpu, cpu); + unsigned i; + + for (i = 0; i < cc->tfms_count; i++) + if (cpu_cc->tfms[i] && !IS_ERR(cpu_cc->tfms[i])) { + crypto_free_ablkcipher(cpu_cc->tfms[i]); + cpu_cc->tfms[i] = NULL; + } +} + +static int crypt_alloc_tfms(struct crypt_config *cc, int cpu, char *ciphermode) +{ + struct crypt_cpu *cpu_cc = per_cpu_ptr(cc->cpu, cpu); + unsigned i; + int err; + + for (i = 0; i < cc->tfms_count; i++) { + cpu_cc->tfms[i] = crypto_alloc_ablkcipher(ciphermode, 0, 0); + if (IS_ERR(cpu_cc->tfms[i])) { + err = PTR_ERR(cpu_cc->tfms[i]); + crypt_free_tfms(cc, cpu); + return err; + } + } + + return 0; +} + static int crypt_setkey_allcpus(struct crypt_config *cc) { - int cpu, err = 0, r; + unsigned subkey_size = cc->key_size >> ilog2(cc->tfms_count); + int cpu, err = 0, i, r; for_each_possible_cpu(cpu) { - r = crypto_ablkcipher_setkey(per_cpu_ptr(cc->cpu, cpu)->tfm, - cc->key, cc->key_size); - if (r) - err = r; + for (i = 0; i < cc->tfms_count; i++) { + r = crypto_ablkcipher_setkey(per_cpu_ptr(cc->cpu, cpu)->tfms[i], + cc->key + (i * subkey_size), subkey_size); + if (r) + err = r; + } } return err; @@ -1158,8 +1193,7 @@ static void crypt_dtr(struct dm_target *ti) cpu_cc = per_cpu_ptr(cc->cpu, cpu); if (cpu_cc->req) mempool_free(cpu_cc->req, cc->req_pool); - if (cpu_cc->tfm) - crypto_free_ablkcipher(cpu_cc->tfm); + crypt_free_tfms(cc, cpu); } if (cc->bs) @@ -1192,8 +1226,7 @@ static int crypt_ctr_cipher(struct dm_target *ti, char *cipher_in, char *key) { struct crypt_config *cc = ti->private; - struct crypto_ablkcipher *tfm; - char *tmp, *cipher, *chainmode, *ivmode, *ivopts; + char *tmp, *cipher, *chainmode, *ivmode, *ivopts, *keycount; char *cipher_api = NULL; int cpu, ret = -EINVAL; @@ -1209,10 +1242,20 @@ static int crypt_ctr_cipher(struct dm_target *ti, /* * Legacy dm-crypt cipher specification - * cipher-mode-iv:ivopts + * cipher[:keycount]-mode-iv:ivopts */ tmp = cipher_in; - cipher = strsep(&tmp, "-"); + keycount = strsep(&tmp, "-"); + cipher = strsep(&keycount, ":"); + + if (!keycount) + cc->tfms_count = 1; + else if (sscanf(keycount, "%u", &cc->tfms_count) != 1 || + !is_power_of_2(cc->tfms_count)) { + ti->error = "Bad cipher key count specification"; + return -EINVAL; + } + cc->key_parts = cc->tfms_count; cc->cipher = kstrdup(cipher, GFP_KERNEL); if (!cc->cipher) @@ -1225,7 +1268,9 @@ static int crypt_ctr_cipher(struct dm_target *ti, if (tmp) DMWARN("Ignoring unexpected additional cipher options"); - cc->cpu = alloc_percpu(struct crypt_cpu); + cc->cpu = __alloc_percpu(sizeof(*(cc->cpu)) + + cc->tfms_count * sizeof(*(cc->cpu->tfms)), + __alignof__(struct crypt_cpu)); if (!cc->cpu) { ti->error = "Cannot allocate per cpu state"; goto bad_mem; @@ -1258,13 +1303,11 @@ static int crypt_ctr_cipher(struct dm_target *ti, /* Allocate cipher */ for_each_possible_cpu(cpu) { - tfm = crypto_alloc_ablkcipher(cipher_api, 0, 0); - if (IS_ERR(tfm)) { - ret = PTR_ERR(tfm); + ret = crypt_alloc_tfms(cc, cpu, cipher_api); + if (ret < 0) { ti->error = "Error allocating crypto tfm"; goto bad; } - per_cpu_ptr(cc->cpu, cpu)->tfm = tfm; } /* Initialize and set key */ @@ -1587,7 +1630,7 @@ static int crypt_iterate_devices(struct dm_target *ti, static struct target_type crypt_target = { .name = "crypt", - .version = {1, 9, 0}, + .version = {1, 10, 0}, .module = THIS_MODULE, .ctr = crypt_ctr, .dtr = crypt_dtr, -- cgit v1.2.3 From 9d09e663d5502c46f2d9481c04c1087e1c2da698 Mon Sep 17 00:00:00 2001 From: NeilBrown Date: Thu, 13 Jan 2011 20:00:02 +0000 Subject: dm: raid456 basic support This patch is the skeleton for the DM target that will be the bridge from DM to MD (initially RAID456 and later RAID1). It provides a way to use device-mapper interfaces to the MD RAID456 drivers. As with all device-mapper targets, the nominal public interfaces are the constructor (CTR) tables and the status outputs (both STATUSTYPE_INFO and STATUSTYPE_TABLE). The CTR table looks like the following: 1: raid \ 2: <#raid_params> \ 3: <#raid_devs> .. Line 1 contains the standard first three arguments to any device-mapper target - the start, length, and target type fields. The target type in this case is "raid". Line 2 contains the arguments that define the particular raid type/personality/level, the required arguments for that raid type, and any optional arguments. Possible raid types include: raid4, raid5_la, raid5_ls, raid5_rs, raid6_zr, raid6_nr, and raid6_nc. (again, raid1 is planned for the future.) The list of required and optional parameters is the same for all the current raid types. The required parameters are positional, while the optional parameters are given as key/value pairs. The possible parameters are as follows: Chunk size in sectors. [[no]sync] Force/Prevent RAID initialization [rebuild ] Rebuild the drive indicated by the index [daemon_sleep ] Time between bitmap daemon work to clear bits [min_recovery_rate ] Throttle RAID initialization [max_recovery_rate ] Throttle RAID initialization [max_write_behind ] See '-write-behind=' (man mdadm) [stripe_cache ] Stripe cache size for higher RAIDs Line 3 contains the list of devices that compose the array in metadata/data device pairs. If the metadata is stored separately, a '-' is given for the metadata device position. If a drive has failed or is missing at creation time, a '-' can be given for both the metadata and data drives for a given position. Examples: # RAID4 - 4 data drives, 1 parity # No metadata devices specified to hold superblock/bitmap info # Chunk size of 1MiB # (Lines separated for easy reading) 0 1960893648 raid \ raid4 1 2048 \ 5 - 8:17 - 8:33 - 8:49 - 8:65 - 8:81 # RAID4 - 4 data drives, 1 parity (no metadata devices) # Chunk size of 1MiB, force RAID initialization, # min recovery rate at 20 kiB/sec/disk 0 1960893648 raid \ raid4 4 2048 min_recovery_rate 20 sync\ 5 - 8:17 - 8:33 - 8:49 - 8:65 - 8:81 Performing a 'dmsetup table' should display the CTR table used to construct the mapping (with possible reordering of optional parameters). Performing a 'dmsetup status' will yield information on the state and health of the array. The output is as follows: 1: raid \ 2: <#devices> <1 health char for each dev> Line 1 is standard DM output. Line 2 is best shown by example: 0 1960893648 raid raid4 5 AAAAA 2/490221568 Here we can see the RAID type is raid4, there are 5 devices - all of which are 'A'live, and the array is 2/490221568 complete with recovery. Cc: linux-raid@vger.kernel.org Signed-off-by: NeilBrown Signed-off-by: Jonathan Brassow Signed-off-by: Mike Snitzer Signed-off-by: Alasdair G Kergon --- Documentation/device-mapper/dm-raid.txt | 70 ++++ drivers/md/Kconfig | 24 ++ drivers/md/Makefile | 1 + drivers/md/dm-raid.c | 697 ++++++++++++++++++++++++++++++++ 4 files changed, 792 insertions(+) create mode 100644 Documentation/device-mapper/dm-raid.txt create mode 100644 drivers/md/dm-raid.c (limited to 'Documentation') diff --git a/Documentation/device-mapper/dm-raid.txt b/Documentation/device-mapper/dm-raid.txt new file mode 100644 index 000000000000..33b6b7071ac8 --- /dev/null +++ b/Documentation/device-mapper/dm-raid.txt @@ -0,0 +1,70 @@ +Device-mapper RAID (dm-raid) is a bridge from DM to MD. It +provides a way to use device-mapper interfaces to access the MD RAID +drivers. + +As with all device-mapper targets, the nominal public interfaces are the +constructor (CTR) tables and the status outputs (both STATUSTYPE_INFO +and STATUSTYPE_TABLE). The CTR table looks like the following: + +1: raid \ +2: <#raid_params> \ +3: <#raid_devs> .. + +Line 1 contains the standard first three arguments to any device-mapper +target - the start, length, and target type fields. The target type in +this case is "raid". + +Line 2 contains the arguments that define the particular raid +type/personality/level, the required arguments for that raid type, and +any optional arguments. Possible raid types include: raid4, raid5_la, +raid5_ls, raid5_rs, raid6_zr, raid6_nr, and raid6_nc. (raid1 is +planned for the future.) The list of required and optional parameters +is the same for all the current raid types. The required parameters are +positional, while the optional parameters are given as key/value pairs. +The possible parameters are as follows: + Chunk size in sectors. + [[no]sync] Force/Prevent RAID initialization + [rebuild ] Rebuild the drive indicated by the index + [daemon_sleep ] Time between bitmap daemon work to clear bits + [min_recovery_rate ] Throttle RAID initialization + [max_recovery_rate ] Throttle RAID initialization + [max_write_behind ] See '-write-behind=' (man mdadm) + [stripe_cache ] Stripe cache size for higher RAIDs + +Line 3 contains the list of devices that compose the array in +metadata/data device pairs. If the metadata is stored separately, a '-' +is given for the metadata device position. If a drive has failed or is +missing at creation time, a '-' can be given for both the metadata and +data drives for a given position. + +NB. Currently all metadata devices must be specified as '-'. + +Examples: +# RAID4 - 4 data drives, 1 parity +# No metadata devices specified to hold superblock/bitmap info +# Chunk size of 1MiB +# (Lines separated for easy reading) +0 1960893648 raid \ + raid4 1 2048 \ + 5 - 8:17 - 8:33 - 8:49 - 8:65 - 8:81 + +# RAID4 - 4 data drives, 1 parity (no metadata devices) +# Chunk size of 1MiB, force RAID initialization, +# min recovery rate at 20 kiB/sec/disk +0 1960893648 raid \ + raid4 4 2048 min_recovery_rate 20 sync\ + 5 - 8:17 - 8:33 - 8:49 - 8:65 - 8:81 + +Performing a 'dmsetup table' should display the CTR table used to +construct the mapping (with possible reordering of optional +parameters). + +Performing a 'dmsetup status' will yield information on the state and +health of the array. The output is as follows: +1: raid \ +2: <#devices> <1 health char for each dev> + +Line 1 is standard DM output. Line 2 is best shown by example: + 0 1960893648 raid raid4 5 AAAAA 2/490221568 +Here we can see the RAID type is raid4, there are 5 devices - all of +which are 'A'live, and the array is 2/490221568 complete with recovery. diff --git a/drivers/md/Kconfig b/drivers/md/Kconfig index bf1a95e31559..98d9ec85e0eb 100644 --- a/drivers/md/Kconfig +++ b/drivers/md/Kconfig @@ -240,6 +240,30 @@ config DM_MIRROR Allow volume managers to mirror logical volumes, also needed for live data migration tools such as 'pvmove'. +config DM_RAID + tristate "RAID 4/5/6 target (EXPERIMENTAL)" + depends on BLK_DEV_DM && EXPERIMENTAL + select MD_RAID456 + select BLK_DEV_MD + ---help--- + A dm target that supports RAID4, RAID5 and RAID6 mappings + + A RAID-5 set of N drives with a capacity of C MB per drive provides + the capacity of C * (N - 1) MB, and protects against a failure + of a single drive. For a given sector (row) number, (N - 1) drives + contain data sectors, and one drive contains the parity protection. + For a RAID-4 set, the parity blocks are present on a single drive, + while a RAID-5 set distributes the parity across the drives in one + of the available parity distribution methods. + + A RAID-6 set of N drives with a capacity of C MB per drive + provides the capacity of C * (N - 2) MB, and protects + against a failure of any two drives. For a given sector + (row) number, (N - 2) drives contain data sectors, and two + drives contains two independent redundancy syndromes. Like + RAID-5, RAID-6 distributes the syndromes across the drives + in one of the available parity distribution methods. + config DM_LOG_USERSPACE tristate "Mirror userspace logging (EXPERIMENTAL)" depends on DM_MIRROR && EXPERIMENTAL && NET diff --git a/drivers/md/Makefile b/drivers/md/Makefile index 5e3aac41919d..d0138606c2e8 100644 --- a/drivers/md/Makefile +++ b/drivers/md/Makefile @@ -36,6 +36,7 @@ obj-$(CONFIG_DM_SNAPSHOT) += dm-snapshot.o obj-$(CONFIG_DM_MIRROR) += dm-mirror.o dm-log.o dm-region-hash.o obj-$(CONFIG_DM_LOG_USERSPACE) += dm-log-userspace.o obj-$(CONFIG_DM_ZERO) += dm-zero.o +obj-$(CONFIG_DM_RAID) += dm-raid.o ifeq ($(CONFIG_DM_UEVENT),y) dm-mod-objs += dm-uevent.o diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c new file mode 100644 index 000000000000..b9e1e15ef11c --- /dev/null +++ b/drivers/md/dm-raid.c @@ -0,0 +1,697 @@ +/* + * Copyright (C) 2010-2011 Neil Brown + * Copyright (C) 2010-2011 Red Hat, Inc. All rights reserved. + * + * This file is released under the GPL. + */ + +#include + +#include "md.h" +#include "raid5.h" +#include "dm.h" +#include "bitmap.h" + +#define DM_MSG_PREFIX "raid" + +/* + * If the MD doesn't support MD_SYNC_STATE_FORCED yet, then + * make it so the flag doesn't set anything. + */ +#ifndef MD_SYNC_STATE_FORCED +#define MD_SYNC_STATE_FORCED 0 +#endif + +struct raid_dev { + /* + * Two DM devices, one to hold metadata and one to hold the + * actual data/parity. The reason for this is to not confuse + * ti->len and give more flexibility in altering size and + * characteristics. + * + * While it is possible for this device to be associated + * with a different physical device than the data_dev, it + * is intended for it to be the same. + * |--------- Physical Device ---------| + * |- meta_dev -|------ data_dev ------| + */ + struct dm_dev *meta_dev; + struct dm_dev *data_dev; + struct mdk_rdev_s rdev; +}; + +/* + * Flags for rs->print_flags field. + */ +#define DMPF_DAEMON_SLEEP 0x1 +#define DMPF_MAX_WRITE_BEHIND 0x2 +#define DMPF_SYNC 0x4 +#define DMPF_NOSYNC 0x8 +#define DMPF_STRIPE_CACHE 0x10 +#define DMPF_MIN_RECOVERY_RATE 0x20 +#define DMPF_MAX_RECOVERY_RATE 0x40 + +struct raid_set { + struct dm_target *ti; + + uint64_t print_flags; + + struct mddev_s md; + struct raid_type *raid_type; + struct dm_target_callbacks callbacks; + + struct raid_dev dev[0]; +}; + +/* Supported raid types and properties. */ +static struct raid_type { + const char *name; /* RAID algorithm. */ + const char *descr; /* Descriptor text for logging. */ + const unsigned parity_devs; /* # of parity devices. */ + const unsigned minimal_devs; /* minimal # of devices in set. */ + const unsigned level; /* RAID level. */ + const unsigned algorithm; /* RAID algorithm. */ +} raid_types[] = { + {"raid4", "RAID4 (dedicated parity disk)", 1, 2, 5, ALGORITHM_PARITY_0}, + {"raid5_la", "RAID5 (left asymmetric)", 1, 2, 5, ALGORITHM_LEFT_ASYMMETRIC}, + {"raid5_ra", "RAID5 (right asymmetric)", 1, 2, 5, ALGORITHM_RIGHT_ASYMMETRIC}, + {"raid5_ls", "RAID5 (left symmetric)", 1, 2, 5, ALGORITHM_LEFT_SYMMETRIC}, + {"raid5_rs", "RAID5 (right symmetric)", 1, 2, 5, ALGORITHM_RIGHT_SYMMETRIC}, + {"raid6_zr", "RAID6 (zero restart)", 2, 4, 6, ALGORITHM_ROTATING_ZERO_RESTART}, + {"raid6_nr", "RAID6 (N restart)", 2, 4, 6, ALGORITHM_ROTATING_N_RESTART}, + {"raid6_nc", "RAID6 (N continue)", 2, 4, 6, ALGORITHM_ROTATING_N_CONTINUE} +}; + +static struct raid_type *get_raid_type(char *name) +{ + int i; + + for (i = 0; i < ARRAY_SIZE(raid_types); i++) + if (!strcmp(raid_types[i].name, name)) + return &raid_types[i]; + + return NULL; +} + +static struct raid_set *context_alloc(struct dm_target *ti, struct raid_type *raid_type, unsigned raid_devs) +{ + unsigned i; + struct raid_set *rs; + sector_t sectors_per_dev; + + if (raid_devs <= raid_type->parity_devs) { + ti->error = "Insufficient number of devices"; + return ERR_PTR(-EINVAL); + } + + sectors_per_dev = ti->len; + if (sector_div(sectors_per_dev, (raid_devs - raid_type->parity_devs))) { + ti->error = "Target length not divisible by number of data devices"; + return ERR_PTR(-EINVAL); + } + + rs = kzalloc(sizeof(*rs) + raid_devs * sizeof(rs->dev[0]), GFP_KERNEL); + if (!rs) { + ti->error = "Cannot allocate raid context"; + return ERR_PTR(-ENOMEM); + } + + mddev_init(&rs->md); + + rs->ti = ti; + rs->raid_type = raid_type; + rs->md.raid_disks = raid_devs; + rs->md.level = raid_type->level; + rs->md.new_level = rs->md.level; + rs->md.dev_sectors = sectors_per_dev; + rs->md.layout = raid_type->algorithm; + rs->md.new_layout = rs->md.layout; + rs->md.delta_disks = 0; + rs->md.recovery_cp = 0; + + for (i = 0; i < raid_devs; i++) + md_rdev_init(&rs->dev[i].rdev); + + /* + * Remaining items to be initialized by further RAID params: + * rs->md.persistent + * rs->md.external + * rs->md.chunk_sectors + * rs->md.new_chunk_sectors + */ + + return rs; +} + +static void context_free(struct raid_set *rs) +{ + int i; + + for (i = 0; i < rs->md.raid_disks; i++) + if (rs->dev[i].data_dev) + dm_put_device(rs->ti, rs->dev[i].data_dev); + + kfree(rs); +} + +/* + * For every device we have two words + * : meta device name or '-' if missing + * : data device name or '-' if missing + * + * This code parses those words. + */ +static int dev_parms(struct raid_set *rs, char **argv) +{ + int i; + int rebuild = 0; + int metadata_available = 0; + int ret = 0; + + for (i = 0; i < rs->md.raid_disks; i++, argv += 2) { + rs->dev[i].rdev.raid_disk = i; + + rs->dev[i].meta_dev = NULL; + rs->dev[i].data_dev = NULL; + + /* + * There are no offsets, since there is a separate device + * for data and metadata. + */ + rs->dev[i].rdev.data_offset = 0; + rs->dev[i].rdev.mddev = &rs->md; + + if (strcmp(argv[0], "-")) { + rs->ti->error = "Metadata devices not supported"; + return -EINVAL; + } + + if (!strcmp(argv[1], "-")) { + if (!test_bit(In_sync, &rs->dev[i].rdev.flags) && + (!rs->dev[i].rdev.recovery_offset)) { + rs->ti->error = "Drive designated for rebuild not specified"; + return -EINVAL; + } + + continue; + } + + ret = dm_get_device(rs->ti, argv[1], + dm_table_get_mode(rs->ti->table), + &rs->dev[i].data_dev); + if (ret) { + rs->ti->error = "RAID device lookup failure"; + return ret; + } + + rs->dev[i].rdev.bdev = rs->dev[i].data_dev->bdev; + list_add(&rs->dev[i].rdev.same_set, &rs->md.disks); + if (!test_bit(In_sync, &rs->dev[i].rdev.flags)) + rebuild++; + } + + if (metadata_available) { + rs->md.external = 0; + rs->md.persistent = 1; + rs->md.major_version = 2; + } else if (rebuild && !rs->md.recovery_cp) { + /* + * Without metadata, we will not be able to tell if the array + * is in-sync or not - we must assume it is not. Therefore, + * it is impossible to rebuild a drive. + * + * Even if there is metadata, the on-disk information may + * indicate that the array is not in-sync and it will then + * fail at that time. + * + * User could specify 'nosync' option if desperate. + */ + DMERR("Unable to rebuild drive while array is not in-sync"); + rs->ti->error = "RAID device lookup failure"; + return -EINVAL; + } + + return 0; +} + +/* + * Possible arguments are... + * RAID456: + * [optional_args] + * + * Optional args: + * [[no]sync] Force or prevent recovery of the entire array + * [rebuild ] Rebuild the drive indicated by the index + * [daemon_sleep ] Time between bitmap daemon work to clear bits + * [min_recovery_rate ] Throttle RAID initialization + * [max_recovery_rate ] Throttle RAID initialization + * [max_write_behind ] See '-write-behind=' (man mdadm) + * [stripe_cache ] Stripe cache size for higher RAIDs + */ +static int parse_raid_params(struct raid_set *rs, char **argv, + unsigned num_raid_params) +{ + unsigned i, rebuild_cnt = 0; + unsigned long value; + char *key; + + /* + * First, parse the in-order required arguments + */ + if ((strict_strtoul(argv[0], 10, &value) < 0) || + !is_power_of_2(value) || (value < 8)) { + rs->ti->error = "Bad chunk size"; + return -EINVAL; + } + + rs->md.new_chunk_sectors = rs->md.chunk_sectors = value; + argv++; + num_raid_params--; + + /* + * Second, parse the unordered optional arguments + */ + for (i = 0; i < rs->md.raid_disks; i++) + set_bit(In_sync, &rs->dev[i].rdev.flags); + + for (i = 0; i < num_raid_params; i++) { + if (!strcmp(argv[i], "nosync")) { + rs->md.recovery_cp = MaxSector; + rs->print_flags |= DMPF_NOSYNC; + rs->md.flags |= MD_SYNC_STATE_FORCED; + continue; + } + if (!strcmp(argv[i], "sync")) { + rs->md.recovery_cp = 0; + rs->print_flags |= DMPF_SYNC; + rs->md.flags |= MD_SYNC_STATE_FORCED; + continue; + } + + /* The rest of the optional arguments come in key/value pairs */ + if ((i + 1) >= num_raid_params) { + rs->ti->error = "Wrong number of raid parameters given"; + return -EINVAL; + } + + key = argv[i++]; + if (strict_strtoul(argv[i], 10, &value) < 0) { + rs->ti->error = "Bad numerical argument given in raid params"; + return -EINVAL; + } + + if (!strcmp(key, "rebuild")) { + if (++rebuild_cnt > rs->raid_type->parity_devs) { + rs->ti->error = "Too many rebuild drives given"; + return -EINVAL; + } + if (value > rs->md.raid_disks) { + rs->ti->error = "Invalid rebuild index given"; + return -EINVAL; + } + clear_bit(In_sync, &rs->dev[value].rdev.flags); + rs->dev[value].rdev.recovery_offset = 0; + } else if (!strcmp(key, "max_write_behind")) { + rs->print_flags |= DMPF_MAX_WRITE_BEHIND; + + /* + * In device-mapper, we specify things in sectors, but + * MD records this value in kB + */ + value /= 2; + if (value > COUNTER_MAX) { + rs->ti->error = "Max write-behind limit out of range"; + return -EINVAL; + } + rs->md.bitmap_info.max_write_behind = value; + } else if (!strcmp(key, "daemon_sleep")) { + rs->print_flags |= DMPF_DAEMON_SLEEP; + if (!value || (value > MAX_SCHEDULE_TIMEOUT)) { + rs->ti->error = "daemon sleep period out of range"; + return -EINVAL; + } + rs->md.bitmap_info.daemon_sleep = value; + } else if (!strcmp(key, "stripe_cache")) { + rs->print_flags |= DMPF_STRIPE_CACHE; + + /* + * In device-mapper, we specify things in sectors, but + * MD records this value in kB + */ + value /= 2; + + if (rs->raid_type->level < 5) { + rs->ti->error = "Inappropriate argument: stripe_cache"; + return -EINVAL; + } + if (raid5_set_cache_size(&rs->md, (int)value)) { + rs->ti->error = "Bad stripe_cache size"; + return -EINVAL; + } + } else if (!strcmp(key, "min_recovery_rate")) { + rs->print_flags |= DMPF_MIN_RECOVERY_RATE; + if (value > INT_MAX) { + rs->ti->error = "min_recovery_rate out of range"; + return -EINVAL; + } + rs->md.sync_speed_min = (int)value; + } else if (!strcmp(key, "max_recovery_rate")) { + rs->print_flags |= DMPF_MAX_RECOVERY_RATE; + if (value > INT_MAX) { + rs->ti->error = "max_recovery_rate out of range"; + return -EINVAL; + } + rs->md.sync_speed_max = (int)value; + } else { + DMERR("Unable to parse RAID parameter: %s", key); + rs->ti->error = "Unable to parse RAID parameters"; + return -EINVAL; + } + } + + /* Assume there are no metadata devices until the drives are parsed */ + rs->md.persistent = 0; + rs->md.external = 1; + + return 0; +} + +static void do_table_event(struct work_struct *ws) +{ + struct raid_set *rs = container_of(ws, struct raid_set, md.event_work); + + dm_table_event(rs->ti->table); +} + +static int raid_is_congested(struct dm_target_callbacks *cb, int bits) +{ + struct raid_set *rs = container_of(cb, struct raid_set, callbacks); + + return md_raid5_congested(&rs->md, bits); +} + +static void raid_unplug(struct dm_target_callbacks *cb) +{ + struct raid_set *rs = container_of(cb, struct raid_set, callbacks); + + md_raid5_unplug_device(rs->md.private); +} + +/* + * Construct a RAID4/5/6 mapping: + * Args: + * <#raid_params> \ + * <#raid_devs> { .. } + * + * ** metadata devices are not supported yet, use '-' instead ** + * + * varies by . See 'parse_raid_params' for + * details on possible . + */ +static int raid_ctr(struct dm_target *ti, unsigned argc, char **argv) +{ + int ret; + struct raid_type *rt; + unsigned long num_raid_params, num_raid_devs; + struct raid_set *rs = NULL; + + /* Must have at least <#raid_params> */ + if (argc < 2) { + ti->error = "Too few arguments"; + return -EINVAL; + } + + /* raid type */ + rt = get_raid_type(argv[0]); + if (!rt) { + ti->error = "Unrecognised raid_type"; + return -EINVAL; + } + argc--; + argv++; + + /* number of RAID parameters */ + if (strict_strtoul(argv[0], 10, &num_raid_params) < 0) { + ti->error = "Cannot understand number of RAID parameters"; + return -EINVAL; + } + argc--; + argv++; + + /* Skip over RAID params for now and find out # of devices */ + if (num_raid_params + 1 > argc) { + ti->error = "Arguments do not agree with counts given"; + return -EINVAL; + } + + if ((strict_strtoul(argv[num_raid_params], 10, &num_raid_devs) < 0) || + (num_raid_devs >= INT_MAX)) { + ti->error = "Cannot understand number of raid devices"; + return -EINVAL; + } + + rs = context_alloc(ti, rt, (unsigned)num_raid_devs); + if (IS_ERR(rs)) + return PTR_ERR(rs); + + ret = parse_raid_params(rs, argv, (unsigned)num_raid_params); + if (ret) + goto bad; + + ret = -EINVAL; + + argc -= num_raid_params + 1; /* +1: we already have num_raid_devs */ + argv += num_raid_params + 1; + + if (argc != (num_raid_devs * 2)) { + ti->error = "Supplied RAID devices does not match the count given"; + goto bad; + } + + ret = dev_parms(rs, argv); + if (ret) + goto bad; + + INIT_WORK(&rs->md.event_work, do_table_event); + ti->split_io = rs->md.chunk_sectors; + ti->private = rs; + + mutex_lock(&rs->md.reconfig_mutex); + ret = md_run(&rs->md); + rs->md.in_sync = 0; /* Assume already marked dirty */ + mutex_unlock(&rs->md.reconfig_mutex); + + if (ret) { + ti->error = "Fail to run raid array"; + goto bad; + } + + rs->callbacks.congested_fn = raid_is_congested; + rs->callbacks.unplug_fn = raid_unplug; + dm_table_add_target_callbacks(ti->table, &rs->callbacks); + + return 0; + +bad: + context_free(rs); + + return ret; +} + +static void raid_dtr(struct dm_target *ti) +{ + struct raid_set *rs = ti->private; + + list_del_init(&rs->callbacks.list); + md_stop(&rs->md); + context_free(rs); +} + +static int raid_map(struct dm_target *ti, struct bio *bio, union map_info *map_context) +{ + struct raid_set *rs = ti->private; + mddev_t *mddev = &rs->md; + + mddev->pers->make_request(mddev, bio); + + return DM_MAPIO_SUBMITTED; +} + +static int raid_status(struct dm_target *ti, status_type_t type, + char *result, unsigned maxlen) +{ + struct raid_set *rs = ti->private; + unsigned raid_param_cnt = 1; /* at least 1 for chunksize */ + unsigned sz = 0; + int i; + sector_t sync; + + switch (type) { + case STATUSTYPE_INFO: + DMEMIT("%s %d ", rs->raid_type->name, rs->md.raid_disks); + + for (i = 0; i < rs->md.raid_disks; i++) { + if (test_bit(Faulty, &rs->dev[i].rdev.flags)) + DMEMIT("D"); + else if (test_bit(In_sync, &rs->dev[i].rdev.flags)) + DMEMIT("A"); + else + DMEMIT("a"); + } + + if (test_bit(MD_RECOVERY_RUNNING, &rs->md.recovery)) + sync = rs->md.curr_resync_completed; + else + sync = rs->md.recovery_cp; + + if (sync > rs->md.resync_max_sectors) + sync = rs->md.resync_max_sectors; + + DMEMIT(" %llu/%llu", + (unsigned long long) sync, + (unsigned long long) rs->md.resync_max_sectors); + + break; + case STATUSTYPE_TABLE: + /* The string you would use to construct this array */ + for (i = 0; i < rs->md.raid_disks; i++) + if (rs->dev[i].data_dev && + !test_bit(In_sync, &rs->dev[i].rdev.flags)) + raid_param_cnt++; /* for rebuilds */ + + raid_param_cnt += (hweight64(rs->print_flags) * 2); + if (rs->print_flags & (DMPF_SYNC | DMPF_NOSYNC)) + raid_param_cnt--; + + DMEMIT("%s %u %u", rs->raid_type->name, + raid_param_cnt, rs->md.chunk_sectors); + + if ((rs->print_flags & DMPF_SYNC) && + (rs->md.recovery_cp == MaxSector)) + DMEMIT(" sync"); + if (rs->print_flags & DMPF_NOSYNC) + DMEMIT(" nosync"); + + for (i = 0; i < rs->md.raid_disks; i++) + if (rs->dev[i].data_dev && + !test_bit(In_sync, &rs->dev[i].rdev.flags)) + DMEMIT(" rebuild %u", i); + + if (rs->print_flags & DMPF_DAEMON_SLEEP) + DMEMIT(" daemon_sleep %lu", + rs->md.bitmap_info.daemon_sleep); + + if (rs->print_flags & DMPF_MIN_RECOVERY_RATE) + DMEMIT(" min_recovery_rate %d", rs->md.sync_speed_min); + + if (rs->print_flags & DMPF_MAX_RECOVERY_RATE) + DMEMIT(" max_recovery_rate %d", rs->md.sync_speed_max); + + if (rs->print_flags & DMPF_MAX_WRITE_BEHIND) + DMEMIT(" max_write_behind %lu", + rs->md.bitmap_info.max_write_behind); + + if (rs->print_flags & DMPF_STRIPE_CACHE) { + raid5_conf_t *conf = rs->md.private; + + /* convert from kiB to sectors */ + DMEMIT(" stripe_cache %d", + conf ? conf->max_nr_stripes * 2 : 0); + } + + DMEMIT(" %d", rs->md.raid_disks); + for (i = 0; i < rs->md.raid_disks; i++) { + DMEMIT(" -"); /* metadata device */ + + if (rs->dev[i].data_dev) + DMEMIT(" %s", rs->dev[i].data_dev->name); + else + DMEMIT(" -"); + } + } + + return 0; +} + +static int raid_iterate_devices(struct dm_target *ti, iterate_devices_callout_fn fn, void *data) +{ + struct raid_set *rs = ti->private; + unsigned i; + int ret = 0; + + for (i = 0; !ret && i < rs->md.raid_disks; i++) + if (rs->dev[i].data_dev) + ret = fn(ti, + rs->dev[i].data_dev, + 0, /* No offset on data devs */ + rs->md.dev_sectors, + data); + + return ret; +} + +static void raid_io_hints(struct dm_target *ti, struct queue_limits *limits) +{ + struct raid_set *rs = ti->private; + unsigned chunk_size = rs->md.chunk_sectors << 9; + raid5_conf_t *conf = rs->md.private; + + blk_limits_io_min(limits, chunk_size); + blk_limits_io_opt(limits, chunk_size * (conf->raid_disks - conf->max_degraded)); +} + +static void raid_presuspend(struct dm_target *ti) +{ + struct raid_set *rs = ti->private; + + md_stop_writes(&rs->md); +} + +static void raid_postsuspend(struct dm_target *ti) +{ + struct raid_set *rs = ti->private; + + mddev_suspend(&rs->md); +} + +static void raid_resume(struct dm_target *ti) +{ + struct raid_set *rs = ti->private; + + mddev_resume(&rs->md); +} + +static struct target_type raid_target = { + .name = "raid", + .version = {1, 0, 0}, + .module = THIS_MODULE, + .ctr = raid_ctr, + .dtr = raid_dtr, + .map = raid_map, + .status = raid_status, + .iterate_devices = raid_iterate_devices, + .io_hints = raid_io_hints, + .presuspend = raid_presuspend, + .postsuspend = raid_postsuspend, + .resume = raid_resume, +}; + +static int __init dm_raid_init(void) +{ + return dm_register_target(&raid_target); +} + +static void __exit dm_raid_exit(void) +{ + dm_unregister_target(&raid_target); +} + +module_init(dm_raid_init); +module_exit(dm_raid_exit); + +MODULE_DESCRIPTION(DM_NAME " raid4/5/6 target"); +MODULE_ALIAS("dm-raid4"); +MODULE_ALIAS("dm-raid5"); +MODULE_ALIAS("dm-raid6"); +MODULE_AUTHOR("Neil Brown "); +MODULE_LICENSE("GPL"); -- cgit v1.2.3 From d8a3515e2a9523f8ed56d1f4537d16338bda2bc2 Mon Sep 17 00:00:00 2001 From: Linus Torvalds Date: Thu, 13 Jan 2011 17:26:46 -0800 Subject: Revert "gpiolib: annotate gpio-intialization with __must_check" This reverts commit 0fdae42d361bbb431ca0ab0efed5126a94821177, which wasn't really supposed to go in, and causes lots of annoying warnings. Quoth Andrew: "Complete brainfart - I meant to drop that patch ages ago." Quoth Greg: "Ick, yeah, that patch isn't ok to go in as-is, all of the callers need to be fixed up first, which is what I thought we had agreed on..." Reported-by: Stephen Rothwell Acked-by: Andrew Morton Acked-by: Greg KH Signed-off-by: Linus Torvalds --- Documentation/gpio.txt | 2 +- include/asm-generic/gpio.h | 10 +++++----- include/linux/gpio.h | 6 +++--- 3 files changed, 9 insertions(+), 9 deletions(-) (limited to 'Documentation') diff --git a/Documentation/gpio.txt b/Documentation/gpio.txt index a492d92bb098..792faa3c06cf 100644 --- a/Documentation/gpio.txt +++ b/Documentation/gpio.txt @@ -135,7 +135,7 @@ setting up a platform_device using the GPIO, is mark its direction: int gpio_direction_input(unsigned gpio); int gpio_direction_output(unsigned gpio, int value); -The return value is zero for success, else a negative errno. It must +The return value is zero for success, else a negative errno. It should be checked, since the get/set calls don't have error returns and since misconfiguration is possible. You should normally issue these calls from a task context. However, for spinlock-safe GPIOs it's OK to use them diff --git a/include/asm-generic/gpio.h b/include/asm-generic/gpio.h index 6098cae2af8e..ff5c66080c8c 100644 --- a/include/asm-generic/gpio.h +++ b/include/asm-generic/gpio.h @@ -147,11 +147,11 @@ extern struct gpio_chip *gpiochip_find(void *data, /* Always use the library code for GPIO management calls, * or when sleeping may be involved. */ -extern int __must_check gpio_request(unsigned gpio, const char *label); +extern int gpio_request(unsigned gpio, const char *label); extern void gpio_free(unsigned gpio); -extern int __must_check gpio_direction_input(unsigned gpio); -extern int __must_check gpio_direction_output(unsigned gpio, int value); +extern int gpio_direction_input(unsigned gpio); +extern int gpio_direction_output(unsigned gpio, int value); extern int gpio_set_debounce(unsigned gpio, unsigned debounce); @@ -192,8 +192,8 @@ struct gpio { const char *label; }; -extern int __must_check gpio_request_one(unsigned gpio, unsigned long flags, const char *label); -extern int __must_check gpio_request_array(struct gpio *array, size_t num); +extern int gpio_request_one(unsigned gpio, unsigned long flags, const char *label); +extern int gpio_request_array(struct gpio *array, size_t num); extern void gpio_free_array(struct gpio *array, size_t num); #ifdef CONFIG_GPIO_SYSFS diff --git a/include/linux/gpio.h b/include/linux/gpio.h index f79d67f413e4..4b47ed96f131 100644 --- a/include/linux/gpio.h +++ b/include/linux/gpio.h @@ -30,7 +30,7 @@ static inline int gpio_is_valid(int number) return 0; } -static inline int __must_check gpio_request(unsigned gpio, const char *label) +static inline int gpio_request(unsigned gpio, const char *label) { return -ENOSYS; } @@ -62,12 +62,12 @@ static inline void gpio_free_array(struct gpio *array, size_t num) WARN_ON(1); } -static inline int __must_check gpio_direction_input(unsigned gpio) +static inline int gpio_direction_input(unsigned gpio) { return -ENOSYS; } -static inline int __must_check gpio_direction_output(unsigned gpio, int value) +static inline int gpio_direction_output(unsigned gpio, int value) { return -ENOSYS; } -- cgit v1.2.3 From 2d90508f638241a2e7422d884767398296ebe720 Mon Sep 17 00:00:00 2001 From: Nikanth Karthikesan Date: Thu, 13 Jan 2011 15:45:53 -0800 Subject: mm: smaps: export mlock information Currently there is no way to find whether a process has locked its pages in memory or not. And which of the memory regions are locked in memory. Add a new field "Locked" to export this information via the smaps file. Signed-off-by: Nikanth Karthikesan Acked-by: Balbir Singh Acked-by: Wu Fengguang Cc: Matt Mackall Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- Documentation/filesystems/proc.txt | 3 +++ fs/proc/task_mmu.c | 7 +++++-- 2 files changed, 8 insertions(+), 2 deletions(-) (limited to 'Documentation') diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt index 9471225212c4..ef757fca470b 100644 --- a/Documentation/filesystems/proc.txt +++ b/Documentation/filesystems/proc.txt @@ -375,6 +375,7 @@ Anonymous: 0 kB Swap: 0 kB KernelPageSize: 4 kB MMUPageSize: 4 kB +Locked: 374 kB The first of these lines shows the same information as is displayed for the mapping in /proc/PID/maps. The remaining lines show the size of the mapping @@ -670,6 +671,8 @@ varies by architecture and compile options. The following is from a > cat /proc/meminfo +The "Locked" indicates whether the mapping is locked in memory or not. + MemTotal: 16344972 kB MemFree: 13634064 kB diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c index c3755bd8dd3e..60b914860f81 100644 --- a/fs/proc/task_mmu.c +++ b/fs/proc/task_mmu.c @@ -418,7 +418,8 @@ static int show_smap(struct seq_file *m, void *v) "Anonymous: %8lu kB\n" "Swap: %8lu kB\n" "KernelPageSize: %8lu kB\n" - "MMUPageSize: %8lu kB\n", + "MMUPageSize: %8lu kB\n" + "Locked: %8lu kB\n", (vma->vm_end - vma->vm_start) >> 10, mss.resident >> 10, (unsigned long)(mss.pss >> (10 + PSS_SHIFT)), @@ -430,7 +431,9 @@ static int show_smap(struct seq_file *m, void *v) mss.anonymous >> 10, mss.swap >> 10, vma_kernel_pagesize(vma) >> 10, - vma_mmu_pagesize(vma) >> 10); + vma_mmu_pagesize(vma) >> 10, + (vma->vm_flags & VM_LOCKED) ? + (unsigned long)(mss.pss >> (10 + PSS_SHIFT)) : 0); if (m->count < m->size) /* vma is copied successfully */ m->version = (vma != get_gate_vma(task)) ? vma->vm_start : 0; -- cgit v1.2.3 From dabb16f639820267b3850d804571c70bd93d4e07 Mon Sep 17 00:00:00 2001 From: Mandeep Singh Baines Date: Thu, 13 Jan 2011 15:46:05 -0800 Subject: oom: allow a non-CAP_SYS_RESOURCE proces to oom_score_adj down We'd like to be able to oom_score_adj a process up/down as it enters/leaves the foreground. Currently, it is not possible to oom_adj down without CAP_SYS_RESOURCE. This patch allows a task to decrease its oom_score_adj back to the value that a CAP_SYS_RESOURCE thread set it to or its inherited value at fork. Assuming the thread that has forked it has oom_score_adj of 0, each process could decrease it back from 0 upon activation unless a CAP_SYS_RESOURCE thread elevated it to something higher. Alternative considered: * a setuid binary * a daemon with CAP_SYS_RESOURCE Since you don't wan't all processes to be able to reduce their oom_adj, a setuid or daemon implementation would be complex. The alternatives also have much higher overhead. This patch updated from original patch based on feedback from David Rientjes. Signed-off-by: Mandeep Singh Baines Acked-by: David Rientjes Cc: KAMEZAWA Hiroyuki Cc: KOSAKI Motohiro Cc: Rik van Riel Cc: Ying Han Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- Documentation/filesystems/proc.txt | 4 ++++ fs/proc/base.c | 4 +++- include/linux/sched.h | 2 ++ kernel/fork.c | 1 + 4 files changed, 10 insertions(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt index ef757fca470b..23cae6548d3a 100644 --- a/Documentation/filesystems/proc.txt +++ b/Documentation/filesystems/proc.txt @@ -1323,6 +1323,10 @@ scaled linearly with /proc//oom_score_adj. Writing to /proc//oom_score_adj or /proc//oom_adj will change the other with its scaled value. +The value of /proc//oom_score_adj may be reduced no lower than the last +value set by a CAP_SYS_RESOURCE process. To reduce the value any lower +requires CAP_SYS_RESOURCE. + NOTICE: /proc//oom_adj is deprecated and will be removed, please see Documentation/feature-removal-schedule.txt. diff --git a/fs/proc/base.c b/fs/proc/base.c index 93f1cdd5d3d7..9d096e82b201 100644 --- a/fs/proc/base.c +++ b/fs/proc/base.c @@ -1151,7 +1151,7 @@ static ssize_t oom_score_adj_write(struct file *file, const char __user *buf, goto err_task_lock; } - if (oom_score_adj < task->signal->oom_score_adj && + if (oom_score_adj < task->signal->oom_score_adj_min && !capable(CAP_SYS_RESOURCE)) { err = -EACCES; goto err_sighand; @@ -1164,6 +1164,8 @@ static ssize_t oom_score_adj_write(struct file *file, const char __user *buf, atomic_dec(&task->mm->oom_disable_count); } task->signal->oom_score_adj = oom_score_adj; + if (has_capability_noaudit(current, CAP_SYS_RESOURCE)) + task->signal->oom_score_adj_min = oom_score_adj; /* * Scale /proc/pid/oom_adj appropriately ensuring that OOM_DISABLE is * always attainable. diff --git a/include/linux/sched.h b/include/linux/sched.h index 07402530fc70..f23b5bb6f52e 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -634,6 +634,8 @@ struct signal_struct { int oom_adj; /* OOM kill score adjustment (bit shift) */ int oom_score_adj; /* OOM kill score adjustment */ + int oom_score_adj_min; /* OOM kill score adjustment minimum value. + * Only settable by CAP_SYS_RESOURCE. */ struct mutex cred_guard_mutex; /* guard against foreign influences on * credential calculations diff --git a/kernel/fork.c b/kernel/fork.c index 76a1fdd80bdf..1499607e4da2 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -910,6 +910,7 @@ static int copy_signal(unsigned long clone_flags, struct task_struct *tsk) sig->oom_adj = current->signal->oom_adj; sig->oom_score_adj = current->signal->oom_score_adj; + sig->oom_score_adj_min = current->signal->oom_score_adj_min; mutex_init(&sig->cred_guard_mutex); -- cgit v1.2.3 From 1c9bf22c09ae14d65225d9b9619b2eb357350cd7 Mon Sep 17 00:00:00 2001 From: Andrea Arcangeli Date: Thu, 13 Jan 2011 15:46:30 -0800 Subject: thp: transparent hugepage support documentation Documentation/vm/transhuge.txt Signed-off-by: Andrea Arcangeli Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- Documentation/vm/transhuge.txt | 298 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 298 insertions(+) create mode 100644 Documentation/vm/transhuge.txt (limited to 'Documentation') diff --git a/Documentation/vm/transhuge.txt b/Documentation/vm/transhuge.txt new file mode 100644 index 000000000000..0924aaca3302 --- /dev/null +++ b/Documentation/vm/transhuge.txt @@ -0,0 +1,298 @@ += Transparent Hugepage Support = + +== Objective == + +Performance critical computing applications dealing with large memory +working sets are already running on top of libhugetlbfs and in turn +hugetlbfs. Transparent Hugepage Support is an alternative means of +using huge pages for the backing of virtual memory with huge pages +that supports the automatic promotion and demotion of page sizes and +without the shortcomings of hugetlbfs. + +Currently it only works for anonymous memory mappings but in the +future it can expand over the pagecache layer starting with tmpfs. + +The reason applications are running faster is because of two +factors. The first factor is almost completely irrelevant and it's not +of significant interest because it'll also have the downside of +requiring larger clear-page copy-page in page faults which is a +potentially negative effect. The first factor consists in taking a +single page fault for each 2M virtual region touched by userland (so +reducing the enter/exit kernel frequency by a 512 times factor). This +only matters the first time the memory is accessed for the lifetime of +a memory mapping. The second long lasting and much more important +factor will affect all subsequent accesses to the memory for the whole +runtime of the application. The second factor consist of two +components: 1) the TLB miss will run faster (especially with +virtualization using nested pagetables but almost always also on bare +metal without virtualization) and 2) a single TLB entry will be +mapping a much larger amount of virtual memory in turn reducing the +number of TLB misses. With virtualization and nested pagetables the +TLB can be mapped of larger size only if both KVM and the Linux guest +are using hugepages but a significant speedup already happens if only +one of the two is using hugepages just because of the fact the TLB +miss is going to run faster. + +== Design == + +- "graceful fallback": mm components which don't have transparent + hugepage knowledge fall back to breaking a transparent hugepage and + working on the regular pages and their respective regular pmd/pte + mappings + +- if a hugepage allocation fails because of memory fragmentation, + regular pages should be gracefully allocated instead and mixed in + the same vma without any failure or significant delay and without + userland noticing + +- if some task quits and more hugepages become available (either + immediately in the buddy or through the VM), guest physical memory + backed by regular pages should be relocated on hugepages + automatically (with khugepaged) + +- it doesn't require memory reservation and in turn it uses hugepages + whenever possible (the only possible reservation here is kernelcore= + to avoid unmovable pages to fragment all the memory but such a tweak + is not specific to transparent hugepage support and it's a generic + feature that applies to all dynamic high order allocations in the + kernel) + +- this initial support only offers the feature in the anonymous memory + regions but it'd be ideal to move it to tmpfs and the pagecache + later + +Transparent Hugepage Support maximizes the usefulness of free memory +if compared to the reservation approach of hugetlbfs by allowing all +unused memory to be used as cache or other movable (or even unmovable +entities). It doesn't require reservation to prevent hugepage +allocation failures to be noticeable from userland. It allows paging +and all other advanced VM features to be available on the +hugepages. It requires no modifications for applications to take +advantage of it. + +Applications however can be further optimized to take advantage of +this feature, like for example they've been optimized before to avoid +a flood of mmap system calls for every malloc(4k). Optimizing userland +is by far not mandatory and khugepaged already can take care of long +lived page allocations even for hugepage unaware applications that +deals with large amounts of memory. + +In certain cases when hugepages are enabled system wide, application +may end up allocating more memory resources. An application may mmap a +large region but only touch 1 byte of it, in that case a 2M page might +be allocated instead of a 4k page for no good. This is why it's +possible to disable hugepages system-wide and to only have them inside +MADV_HUGEPAGE madvise regions. + +Embedded systems should enable hugepages only inside madvise regions +to eliminate any risk of wasting any precious byte of memory and to +only run faster. + +Applications that gets a lot of benefit from hugepages and that don't +risk to lose memory by using hugepages, should use +madvise(MADV_HUGEPAGE) on their critical mmapped regions. + +== sysfs == + +Transparent Hugepage Support can be entirely disabled (mostly for +debugging purposes) or only enabled inside MADV_HUGEPAGE regions (to +avoid the risk of consuming more memory resources) or enabled system +wide. This can be achieved with one of: + +echo always >/sys/kernel/mm/transparent_hugepage/enabled +echo madvise >/sys/kernel/mm/transparent_hugepage/enabled +echo never >/sys/kernel/mm/transparent_hugepage/enabled + +It's also possible to limit defrag efforts in the VM to generate +hugepages in case they're not immediately free to madvise regions or +to never try to defrag memory and simply fallback to regular pages +unless hugepages are immediately available. Clearly if we spend CPU +time to defrag memory, we would expect to gain even more by the fact +we use hugepages later instead of regular pages. This isn't always +guaranteed, but it may be more likely in case the allocation is for a +MADV_HUGEPAGE region. + +echo always >/sys/kernel/mm/transparent_hugepage/defrag +echo madvise >/sys/kernel/mm/transparent_hugepage/defrag +echo never >/sys/kernel/mm/transparent_hugepage/defrag + +khugepaged will be automatically started when +transparent_hugepage/enabled is set to "always" or "madvise, and it'll +be automatically shutdown if it's set to "never". + +khugepaged runs usually at low frequency so while one may not want to +invoke defrag algorithms synchronously during the page faults, it +should be worth invoking defrag at least in khugepaged. However it's +also possible to disable defrag in khugepaged: + +echo yes >/sys/kernel/mm/transparent_hugepage/khugepaged/defrag +echo no >/sys/kernel/mm/transparent_hugepage/khugepaged/defrag + +You can also control how many pages khugepaged should scan at each +pass: + +/sys/kernel/mm/transparent_hugepage/khugepaged/pages_to_scan + +and how many milliseconds to wait in khugepaged between each pass (you +can set this to 0 to run khugepaged at 100% utilization of one core): + +/sys/kernel/mm/transparent_hugepage/khugepaged/scan_sleep_millisecs + +and how many milliseconds to wait in khugepaged if there's an hugepage +allocation failure to throttle the next allocation attempt. + +/sys/kernel/mm/transparent_hugepage/khugepaged/alloc_sleep_millisecs + +The khugepaged progress can be seen in the number of pages collapsed: + +/sys/kernel/mm/transparent_hugepage/khugepaged/pages_collapsed + +for each pass: + +/sys/kernel/mm/transparent_hugepage/khugepaged/full_scans + +== Boot parameter == + +You can change the sysfs boot time defaults of Transparent Hugepage +Support by passing the parameter "transparent_hugepage=always" or +"transparent_hugepage=madvise" or "transparent_hugepage=never" +(without "") to the kernel command line. + +== Need of application restart == + +The transparent_hugepage/enabled values only affect future +behavior. So to make them effective you need to restart any +application that could have been using hugepages. This also applies to +the regions registered in khugepaged. + +== get_user_pages and follow_page == + +get_user_pages and follow_page if run on a hugepage, will return the +head or tail pages as usual (exactly as they would do on +hugetlbfs). Most gup users will only care about the actual physical +address of the page and its temporary pinning to release after the I/O +is complete, so they won't ever notice the fact the page is huge. But +if any driver is going to mangle over the page structure of the tail +page (like for checking page->mapping or other bits that are relevant +for the head page and not the tail page), it should be updated to jump +to check head page instead (while serializing properly against +split_huge_page() to avoid the head and tail pages to disappear from +under it, see the futex code to see an example of that, hugetlbfs also +needed special handling in futex code for similar reasons). + +NOTE: these aren't new constraints to the GUP API, and they match the +same constrains that applies to hugetlbfs too, so any driver capable +of handling GUP on hugetlbfs will also work fine on transparent +hugepage backed mappings. + +In case you can't handle compound pages if they're returned by +follow_page, the FOLL_SPLIT bit can be specified as parameter to +follow_page, so that it will split the hugepages before returning +them. Migration for example passes FOLL_SPLIT as parameter to +follow_page because it's not hugepage aware and in fact it can't work +at all on hugetlbfs (but it instead works fine on transparent +hugepages thanks to FOLL_SPLIT). migration simply can't deal with +hugepages being returned (as it's not only checking the pfn of the +page and pinning it during the copy but it pretends to migrate the +memory in regular page sizes and with regular pte/pmd mappings). + +== Optimizing the applications == + +To be guaranteed that the kernel will map a 2M page immediately in any +memory region, the mmap region has to be hugepage naturally +aligned. posix_memalign() can provide that guarantee. + +== Hugetlbfs == + +You can use hugetlbfs on a kernel that has transparent hugepage +support enabled just fine as always. No difference can be noted in +hugetlbfs other than there will be less overall fragmentation. All +usual features belonging to hugetlbfs are preserved and +unaffected. libhugetlbfs will also work fine as usual. + +== Graceful fallback == + +Code walking pagetables but unware about huge pmds can simply call +split_huge_page_pmd(mm, pmd) where the pmd is the one returned by +pmd_offset. It's trivial to make the code transparent hugepage aware +by just grepping for "pmd_offset" and adding split_huge_page_pmd where +missing after pmd_offset returns the pmd. Thanks to the graceful +fallback design, with a one liner change, you can avoid to write +hundred if not thousand of lines of complex code to make your code +hugepage aware. + +If you're not walking pagetables but you run into a physical hugepage +but you can't handle it natively in your code, you can split it by +calling split_huge_page(page). This is what the Linux VM does before +it tries to swapout the hugepage for example. + +Example to make mremap.c transparent hugepage aware with a one liner +change: + +diff --git a/mm/mremap.c b/mm/mremap.c +--- a/mm/mremap.c ++++ b/mm/mremap.c +@@ -41,6 +41,7 @@ static pmd_t *get_old_pmd(struct mm_stru + return NULL; + + pmd = pmd_offset(pud, addr); ++ split_huge_page_pmd(mm, pmd); + if (pmd_none_or_clear_bad(pmd)) + return NULL; + +== Locking in hugepage aware code == + +We want as much code as possible hugepage aware, as calling +split_huge_page() or split_huge_page_pmd() has a cost. + +To make pagetable walks huge pmd aware, all you need to do is to call +pmd_trans_huge() on the pmd returned by pmd_offset. You must hold the +mmap_sem in read (or write) mode to be sure an huge pmd cannot be +created from under you by khugepaged (khugepaged collapse_huge_page +takes the mmap_sem in write mode in addition to the anon_vma lock). If +pmd_trans_huge returns false, you just fallback in the old code +paths. If instead pmd_trans_huge returns true, you have to take the +mm->page_table_lock and re-run pmd_trans_huge. Taking the +page_table_lock will prevent the huge pmd to be converted into a +regular pmd from under you (split_huge_page can run in parallel to the +pagetable walk). If the second pmd_trans_huge returns false, you +should just drop the page_table_lock and fallback to the old code as +before. Otherwise you should run pmd_trans_splitting on the pmd. In +case pmd_trans_splitting returns true, it means split_huge_page is +already in the middle of splitting the page. So if pmd_trans_splitting +returns true it's enough to drop the page_table_lock and call +wait_split_huge_page and then fallback the old code paths. You are +guaranteed by the time wait_split_huge_page returns, the pmd isn't +huge anymore. If pmd_trans_splitting returns false, you can proceed to +process the huge pmd and the hugepage natively. Once finished you can +drop the page_table_lock. + +== compound_lock, get_user_pages and put_page == + +split_huge_page internally has to distribute the refcounts in the head +page to the tail pages before clearing all PG_head/tail bits from the +page structures. It can do that easily for refcounts taken by huge pmd +mappings. But the GUI API as created by hugetlbfs (that returns head +and tail pages if running get_user_pages on an address backed by any +hugepage), requires the refcount to be accounted on the tail pages and +not only in the head pages, if we want to be able to run +split_huge_page while there are gup pins established on any tail +page. Failure to be able to run split_huge_page if there's any gup pin +on any tail page, would mean having to split all hugepages upfront in +get_user_pages which is unacceptable as too many gup users are +performance critical and they must work natively on hugepages like +they work natively on hugetlbfs already (hugetlbfs is simpler because +hugetlbfs pages cannot be splitted so there wouldn't be requirement of +accounting the pins on the tail pages for hugetlbfs). If we wouldn't +account the gup refcounts on the tail pages during gup, we won't know +anymore which tail page is pinned by gup and which is not while we run +split_huge_page. But we still have to add the gup pin to the head page +too, to know when we can free the compound page in case it's never +splitted during its lifetime. That requires changing not just +get_page, but put_page as well so that when put_page runs on a tail +page (and only on a tail page) it will find its respective head page, +and then it will decrease the head page refcount in addition to the +tail page refcount. To obtain a head page reliably and to decrease its +refcount without race conditions, put_page has to serialize against +__split_huge_page_refcount using a special per-page lock called +compound_lock. -- cgit v1.2.3 From ece72400c2a27a3d726cb0854449f991d9fcd2da Mon Sep 17 00:00:00 2001 From: Greg Thelen Date: Thu, 13 Jan 2011 15:47:36 -0800 Subject: memcg: document cgroup dirty memory interfaces Document cgroup dirty memory interfaces and statistics. [akpm@linux-foundation.org: fix use_hierarchy description] Signed-off-by: Andrea Righi Signed-off-by: Greg Thelen Cc: KAMEZAWA Hiroyuki Cc: Daisuke Nishimura Cc: Balbir Singh Cc: Minchan Kim Cc: Wu Fengguang Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- Documentation/cgroups/memory.txt | 74 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 74 insertions(+) (limited to 'Documentation') diff --git a/Documentation/cgroups/memory.txt b/Documentation/cgroups/memory.txt index 7781857dc940..bac328c232f5 100644 --- a/Documentation/cgroups/memory.txt +++ b/Documentation/cgroups/memory.txt @@ -385,6 +385,10 @@ mapped_file - # of bytes of mapped file (includes tmpfs/shmem) pgpgin - # of pages paged in (equivalent to # of charging events). pgpgout - # of pages paged out (equivalent to # of uncharging events). swap - # of bytes of swap usage +dirty - # of bytes that are waiting to get written back to the disk. +writeback - # of bytes that are actively being written back to the disk. +nfs_unstable - # of bytes sent to the NFS server, but not yet committed to + the actual storage. inactive_anon - # of bytes of anonymous memory and swap cache memory on LRU list. active_anon - # of bytes of anonymous and swap cache memory on active @@ -406,6 +410,9 @@ total_mapped_file - sum of all children's "cache" total_pgpgin - sum of all children's "pgpgin" total_pgpgout - sum of all children's "pgpgout" total_swap - sum of all children's "swap" +total_dirty - sum of all children's "dirty" +total_writeback - sum of all children's "writeback" +total_nfs_unstable - sum of all children's "nfs_unstable" total_inactive_anon - sum of all children's "inactive_anon" total_active_anon - sum of all children's "active_anon" total_inactive_file - sum of all children's "inactive_file" @@ -453,6 +460,73 @@ memory under it will be reclaimed. You can reset failcnt by writing 0 to failcnt file. # echo 0 > .../memory.failcnt +5.5 dirty memory + +Control the maximum amount of dirty pages a cgroup can have at any given time. + +Limiting dirty memory is like fixing the max amount of dirty (hard to reclaim) +page cache used by a cgroup. So, in case of multiple cgroup writers, they will +not be able to consume more than their designated share of dirty pages and will +be forced to perform write-out if they cross that limit. + +The interface is equivalent to the procfs interface: /proc/sys/vm/dirty_*. It +is possible to configure a limit to trigger both a direct writeback or a +background writeback performed by per-bdi flusher threads. The root cgroup +memory.dirty_* control files are read-only and match the contents of +the /proc/sys/vm/dirty_* files. + +Per-cgroup dirty limits can be set using the following files in the cgroupfs: + +- memory.dirty_ratio: the amount of dirty memory (expressed as a percentage of + cgroup memory) at which a process generating dirty pages will itself start + writing out dirty data. + +- memory.dirty_limit_in_bytes: the amount of dirty memory (expressed in bytes) + in the cgroup at which a process generating dirty pages will start itself + writing out dirty data. Suffix (k, K, m, M, g, or G) can be used to indicate + that value is kilo, mega or gigabytes. + + Note: memory.dirty_limit_in_bytes is the counterpart of memory.dirty_ratio. + Only one of them may be specified at a time. When one is written it is + immediately taken into account to evaluate the dirty memory limits and the + other appears as 0 when read. + +- memory.dirty_background_ratio: the amount of dirty memory of the cgroup + (expressed as a percentage of cgroup memory) at which background writeback + kernel threads will start writing out dirty data. + +- memory.dirty_background_limit_in_bytes: the amount of dirty memory (expressed + in bytes) in the cgroup at which background writeback kernel threads will + start writing out dirty data. Suffix (k, K, m, M, g, or G) can be used to + indicate that value is kilo, mega or gigabytes. + + Note: memory.dirty_background_limit_in_bytes is the counterpart of + memory.dirty_background_ratio. Only one of them may be specified at a time. + When one is written it is immediately taken into account to evaluate the dirty + memory limits and the other appears as 0 when read. + +A cgroup may contain more dirty memory than its dirty limit. This is possible +because of the principle that the first cgroup to touch a page is charged for +it. Subsequent page counting events (dirty, writeback, nfs_unstable) are also +counted to the originally charged cgroup. + +Example: If page is allocated by a cgroup A task, then the page is charged to +cgroup A. If the page is later dirtied by a task in cgroup B, then the cgroup A +dirty count will be incremented. If cgroup A is over its dirty limit but cgroup +B is not, then dirtying a cgroup A page from a cgroup B task may push cgroup A +over its dirty limit without throttling the dirtying cgroup B task. + +When use_hierarchy=0, each cgroup has dirty memory usage and limits. +System-wide dirty limits are also consulted. Dirty memory consumption is +checked against both system-wide and per-cgroup dirty limits. + +The current implementation does not enforce per-cgroup dirty limits when +use_hierarchy=1. System-wide dirty limits are used for processes in such +cgroups. Attempts to read memory.dirty_* files return the system-wide +values. Writes to the memory.dirty_* files return error. An enhanced +implementation is needed to check the chain of parents to ensure that no +dirty limit is exceeded. + 6. Hierarchy support The memory controller supports a deep hierarchy and hierarchical accounting. -- cgit v1.2.3 From a82416da83722944d78d933301e32e7c5ad70edd Mon Sep 17 00:00:00 2001 From: Nick Piggin Date: Fri, 14 Jan 2011 02:26:53 +0000 Subject: fs: small rcu-walk documentation fixes Signed-off-by: Nick Piggin --- Documentation/filesystems/porting | 8 ++++---- Documentation/filesystems/vfs.txt | 4 ++-- 2 files changed, 6 insertions(+), 6 deletions(-) (limited to 'Documentation') diff --git a/Documentation/filesystems/porting b/Documentation/filesystems/porting index 07a32b42cf9c..7fcac6393cd0 100644 --- a/Documentation/filesystems/porting +++ b/Documentation/filesystems/porting @@ -365,8 +365,8 @@ must be done in the RCU callback. [recommended] vfs now tries to do path walking in "rcu-walk mode", which avoids atomic operations and scalability hazards on dentries and inodes (see -Documentation/filesystems/path-walk.txt). d_hash and d_compare changes (above) -are examples of the changes required to support this. For more complex +Documentation/filesystems/path-lookup.txt). d_hash and d_compare changes +(above) are examples of the changes required to support this. For more complex filesystem callbacks, the vfs drops out of rcu-walk mode before the fs call, so no changes are required to the filesystem. However, this is costly and loses the benefits of rcu-walk mode. We will begin to add filesystem callbacks that @@ -383,5 +383,5 @@ Documentation/filesystems/vfs.txt for more details. permission and check_acl are inode permission checks that are called on many or all directory inodes on the way down a path walk (to check for -exec permission). These must now be rcu-walk aware (flags & IPERM_RCU). See -Documentation/filesystems/vfs.txt for more details. +exec permission). These must now be rcu-walk aware (flags & IPERM_FLAG_RCU). +See Documentation/filesystems/vfs.txt for more details. diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt index fbb324e2bd43..cae6d27c9f5b 100644 --- a/Documentation/filesystems/vfs.txt +++ b/Documentation/filesystems/vfs.txt @@ -415,8 +415,8 @@ otherwise noted. permission: called by the VFS to check for access rights on a POSIX-like filesystem. - May be called in rcu-walk mode (flags & IPERM_RCU). If in rcu-walk - mode, the filesystem must check the permission without blocking or + May be called in rcu-walk mode (flags & IPERM_FLAG_RCU). If in rcu-walk + mode, the filesystem must check the permission without blocking or storing to the inode. If a situation is encountered that rcu-walk cannot handle, return -- cgit v1.2.3 From 11ff26c884ec957bed87b49e3a38908f37d0d3e4 Mon Sep 17 00:00:00 2001 From: KAMEZAWA Hiroyuki Date: Fri, 14 Jan 2011 15:04:21 +0900 Subject: revert documentaion update for memcg's dirty ratio. Subjct: Revert memory cgroup dirty_ratio Documentation. The commit ece72400c2a27a3d726cb0854449f991d9fcd2da adds documentation for memcg's dirty ratio. But the function is not implemented yet. Remove the documentation for avoiding confusing users. Signed-off-by: KAMEZAWA Hiroyuki Reviewed-by: Greg Thelen Signed-off-by: Linus Torvalds --- Documentation/cgroups/memory.txt | 74 ---------------------------------------- 1 file changed, 74 deletions(-) (limited to 'Documentation') diff --git a/Documentation/cgroups/memory.txt b/Documentation/cgroups/memory.txt index bac328c232f5..7781857dc940 100644 --- a/Documentation/cgroups/memory.txt +++ b/Documentation/cgroups/memory.txt @@ -385,10 +385,6 @@ mapped_file - # of bytes of mapped file (includes tmpfs/shmem) pgpgin - # of pages paged in (equivalent to # of charging events). pgpgout - # of pages paged out (equivalent to # of uncharging events). swap - # of bytes of swap usage -dirty - # of bytes that are waiting to get written back to the disk. -writeback - # of bytes that are actively being written back to the disk. -nfs_unstable - # of bytes sent to the NFS server, but not yet committed to - the actual storage. inactive_anon - # of bytes of anonymous memory and swap cache memory on LRU list. active_anon - # of bytes of anonymous and swap cache memory on active @@ -410,9 +406,6 @@ total_mapped_file - sum of all children's "cache" total_pgpgin - sum of all children's "pgpgin" total_pgpgout - sum of all children's "pgpgout" total_swap - sum of all children's "swap" -total_dirty - sum of all children's "dirty" -total_writeback - sum of all children's "writeback" -total_nfs_unstable - sum of all children's "nfs_unstable" total_inactive_anon - sum of all children's "inactive_anon" total_active_anon - sum of all children's "active_anon" total_inactive_file - sum of all children's "inactive_file" @@ -460,73 +453,6 @@ memory under it will be reclaimed. You can reset failcnt by writing 0 to failcnt file. # echo 0 > .../memory.failcnt -5.5 dirty memory - -Control the maximum amount of dirty pages a cgroup can have at any given time. - -Limiting dirty memory is like fixing the max amount of dirty (hard to reclaim) -page cache used by a cgroup. So, in case of multiple cgroup writers, they will -not be able to consume more than their designated share of dirty pages and will -be forced to perform write-out if they cross that limit. - -The interface is equivalent to the procfs interface: /proc/sys/vm/dirty_*. It -is possible to configure a limit to trigger both a direct writeback or a -background writeback performed by per-bdi flusher threads. The root cgroup -memory.dirty_* control files are read-only and match the contents of -the /proc/sys/vm/dirty_* files. - -Per-cgroup dirty limits can be set using the following files in the cgroupfs: - -- memory.dirty_ratio: the amount of dirty memory (expressed as a percentage of - cgroup memory) at which a process generating dirty pages will itself start - writing out dirty data. - -- memory.dirty_limit_in_bytes: the amount of dirty memory (expressed in bytes) - in the cgroup at which a process generating dirty pages will start itself - writing out dirty data. Suffix (k, K, m, M, g, or G) can be used to indicate - that value is kilo, mega or gigabytes. - - Note: memory.dirty_limit_in_bytes is the counterpart of memory.dirty_ratio. - Only one of them may be specified at a time. When one is written it is - immediately taken into account to evaluate the dirty memory limits and the - other appears as 0 when read. - -- memory.dirty_background_ratio: the amount of dirty memory of the cgroup - (expressed as a percentage of cgroup memory) at which background writeback - kernel threads will start writing out dirty data. - -- memory.dirty_background_limit_in_bytes: the amount of dirty memory (expressed - in bytes) in the cgroup at which background writeback kernel threads will - start writing out dirty data. Suffix (k, K, m, M, g, or G) can be used to - indicate that value is kilo, mega or gigabytes. - - Note: memory.dirty_background_limit_in_bytes is the counterpart of - memory.dirty_background_ratio. Only one of them may be specified at a time. - When one is written it is immediately taken into account to evaluate the dirty - memory limits and the other appears as 0 when read. - -A cgroup may contain more dirty memory than its dirty limit. This is possible -because of the principle that the first cgroup to touch a page is charged for -it. Subsequent page counting events (dirty, writeback, nfs_unstable) are also -counted to the originally charged cgroup. - -Example: If page is allocated by a cgroup A task, then the page is charged to -cgroup A. If the page is later dirtied by a task in cgroup B, then the cgroup A -dirty count will be incremented. If cgroup A is over its dirty limit but cgroup -B is not, then dirtying a cgroup A page from a cgroup B task may push cgroup A -over its dirty limit without throttling the dirtying cgroup B task. - -When use_hierarchy=0, each cgroup has dirty memory usage and limits. -System-wide dirty limits are also consulted. Dirty memory consumption is -checked against both system-wide and per-cgroup dirty limits. - -The current implementation does not enforce per-cgroup dirty limits when -use_hierarchy=1. System-wide dirty limits are used for processes in such -cgroups. Attempts to read memory.dirty_* files return the system-wide -values. Writes to the memory.dirty_* files return error. An enhanced -implementation is needed to check the chain of parents to ensure that no -dirty limit is exceeded. - 6. Hierarchy support The memory controller supports a deep hierarchy and hierarchical accounting. -- cgit v1.2.3 From c66ac9db8d4ad9994a02b3e933ea2ccc643e1fe5 Mon Sep 17 00:00:00 2001 From: Nicholas Bellinger Date: Fri, 17 Dec 2010 11:11:26 -0800 Subject: [SCSI] target: Add LIO target core v4.0.0-rc6 LIO target is a full featured in-kernel target framework with the following feature set: High-performance, non-blocking, multithreaded architecture with SIMD support. Advanced SCSI feature set: * Persistent Reservations (PRs) * Asymmetric Logical Unit Assignment (ALUA) * Protocol and intra-nexus multiplexing, load-balancing and failover (MC/S) * Full Error Recovery (ERL=0,1,2) * Active/active task migration and session continuation (ERL=2) * Thin LUN provisioning (UNMAP and WRITE_SAMExx) Multiprotocol target plugins Storage media independence: * Virtualization of all storage media; transparent mapping of IO to LUNs * No hard limits on number of LUNs per Target; maximum LUN size ~750 TB * Backstores: SATA, SAS, SCSI, BluRay, DVD, FLASH, USB, ramdisk, etc. Standards compliance: * Full compliance with IETF (RFC 3720) * Full implementation of SPC-4 PRs and ALUA Significant code cleanups done by Christoph Hellwig. [jejb: fix up for new block bdev exclusive interface. Minor fixes from Randy Dunlap and Dan Carpenter.] Signed-off-by: Nicholas A. Bellinger Signed-off-by: James Bottomley --- Documentation/target/tcm_mod_builder.py | 1094 +++++ Documentation/target/tcm_mod_builder.txt | 145 + drivers/Kconfig | 2 + drivers/Makefile | 1 + drivers/target/Kconfig | 32 + drivers/target/Makefile | 24 + drivers/target/target_core_alua.c | 1991 +++++++++ drivers/target/target_core_alua.h | 126 + drivers/target/target_core_cdb.c | 1131 +++++ drivers/target/target_core_configfs.c | 3225 ++++++++++++++ drivers/target/target_core_device.c | 1694 +++++++ drivers/target/target_core_fabric_configfs.c | 996 +++++ drivers/target/target_core_fabric_lib.c | 451 ++ drivers/target/target_core_file.c | 688 +++ drivers/target/target_core_file.h | 50 + drivers/target/target_core_hba.c | 185 + drivers/target/target_core_hba.h | 7 + drivers/target/target_core_iblock.c | 808 ++++ drivers/target/target_core_iblock.h | 40 + drivers/target/target_core_mib.c | 1078 +++++ drivers/target/target_core_mib.h | 28 + drivers/target/target_core_pr.c | 4252 ++++++++++++++++++ drivers/target/target_core_pr.h | 67 + drivers/target/target_core_pscsi.c | 1470 ++++++ drivers/target/target_core_pscsi.h | 65 + drivers/target/target_core_rd.c | 1091 +++++ drivers/target/target_core_rd.h | 73 + drivers/target/target_core_scdb.c | 105 + drivers/target/target_core_scdb.h | 10 + drivers/target/target_core_tmr.c | 404 ++ drivers/target/target_core_tpg.c | 826 ++++ drivers/target/target_core_transport.c | 6134 ++++++++++++++++++++++++++ drivers/target/target_core_ua.c | 332 ++ drivers/target/target_core_ua.h | 36 + include/target/configfs_macros.h | 147 + include/target/target_core_base.h | 937 ++++ include/target/target_core_configfs.h | 52 + include/target/target_core_device.h | 61 + include/target/target_core_fabric_configfs.h | 106 + include/target/target_core_fabric_lib.h | 28 + include/target/target_core_fabric_ops.h | 100 + include/target/target_core_tmr.h | 43 + include/target/target_core_tpg.h | 35 + include/target/target_core_transport.h | 351 ++ 44 files changed, 30521 insertions(+) create mode 100755 Documentation/target/tcm_mod_builder.py create mode 100644 Documentation/target/tcm_mod_builder.txt create mode 100644 drivers/target/Kconfig create mode 100644 drivers/target/Makefile create mode 100644 drivers/target/target_core_alua.c create mode 100644 drivers/target/target_core_alua.h create mode 100644 drivers/target/target_core_cdb.c create mode 100644 drivers/target/target_core_configfs.c create mode 100644 drivers/target/target_core_device.c create mode 100644 drivers/target/target_core_fabric_configfs.c create mode 100644 drivers/target/target_core_fabric_lib.c create mode 100644 drivers/target/target_core_file.c create mode 100644 drivers/target/target_core_file.h create mode 100644 drivers/target/target_core_hba.c create mode 100644 drivers/target/target_core_hba.h create mode 100644 drivers/target/target_core_iblock.c create mode 100644 drivers/target/target_core_iblock.h create mode 100644 drivers/target/target_core_mib.c create mode 100644 drivers/target/target_core_mib.h create mode 100644 drivers/target/target_core_pr.c create mode 100644 drivers/target/target_core_pr.h create mode 100644 drivers/target/target_core_pscsi.c create mode 100644 drivers/target/target_core_pscsi.h create mode 100644 drivers/target/target_core_rd.c create mode 100644 drivers/target/target_core_rd.h create mode 100644 drivers/target/target_core_scdb.c create mode 100644 drivers/target/target_core_scdb.h create mode 100644 drivers/target/target_core_tmr.c create mode 100644 drivers/target/target_core_tpg.c create mode 100644 drivers/target/target_core_transport.c create mode 100644 drivers/target/target_core_ua.c create mode 100644 drivers/target/target_core_ua.h create mode 100644 include/target/configfs_macros.h create mode 100644 include/target/target_core_base.h create mode 100644 include/target/target_core_configfs.h create mode 100644 include/target/target_core_device.h create mode 100644 include/target/target_core_fabric_configfs.h create mode 100644 include/target/target_core_fabric_lib.h create mode 100644 include/target/target_core_fabric_ops.h create mode 100644 include/target/target_core_tmr.h create mode 100644 include/target/target_core_tpg.h create mode 100644 include/target/target_core_transport.h (limited to 'Documentation') diff --git a/Documentation/target/tcm_mod_builder.py b/Documentation/target/tcm_mod_builder.py new file mode 100755 index 000000000000..dbeb8a0d7175 --- /dev/null +++ b/Documentation/target/tcm_mod_builder.py @@ -0,0 +1,1094 @@ +#!/usr/bin/python +# The TCM v4 multi-protocol fabric module generation script for drivers/target/$NEW_MOD +# +# Copyright (c) 2010 Rising Tide Systems +# Copyright (c) 2010 Linux-iSCSI.org +# +# Author: nab@kernel.org +# +import os, sys +import subprocess as sub +import string +import re +import optparse + +tcm_dir = "" + +fabric_ops = [] +fabric_mod_dir = "" +fabric_mod_port = "" +fabric_mod_init_port = "" + +def tcm_mod_err(msg): + print msg + sys.exit(1) + +def tcm_mod_create_module_subdir(fabric_mod_dir_var): + + if os.path.isdir(fabric_mod_dir_var) == True: + return 1 + + print "Creating fabric_mod_dir: " + fabric_mod_dir_var + ret = os.mkdir(fabric_mod_dir_var) + if ret: + tcm_mod_err("Unable to mkdir " + fabric_mod_dir_var) + + return + +def tcm_mod_build_FC_include(fabric_mod_dir_var, fabric_mod_name): + global fabric_mod_port + global fabric_mod_init_port + buf = "" + + f = fabric_mod_dir_var + "/" + fabric_mod_name + "_base.h" + print "Writing file: " + f + + p = open(f, 'w'); + if not p: + tcm_mod_err("Unable to open file: " + f) + + buf = "#define " + fabric_mod_name.upper() + "_VERSION \"v0.1\"\n" + buf += "#define " + fabric_mod_name.upper() + "_NAMELEN 32\n" + buf += "\n" + buf += "struct " + fabric_mod_name + "_nacl {\n" + buf += " /* Binary World Wide unique Port Name for FC Initiator Nport */\n" + buf += " u64 nport_wwpn;\n" + buf += " /* ASCII formatted WWPN for FC Initiator Nport */\n" + buf += " char nport_name[" + fabric_mod_name.upper() + "_NAMELEN];\n" + buf += " /* Returned by " + fabric_mod_name + "_make_nodeacl() */\n" + buf += " struct se_node_acl se_node_acl;\n" + buf += "};\n" + buf += "\n" + buf += "struct " + fabric_mod_name + "_tpg {\n" + buf += " /* FC lport target portal group tag for TCM */\n" + buf += " u16 lport_tpgt;\n" + buf += " /* Pointer back to " + fabric_mod_name + "_lport */\n" + buf += " struct " + fabric_mod_name + "_lport *lport;\n" + buf += " /* Returned by " + fabric_mod_name + "_make_tpg() */\n" + buf += " struct se_portal_group se_tpg;\n" + buf += "};\n" + buf += "\n" + buf += "struct " + fabric_mod_name + "_lport {\n" + buf += " /* SCSI protocol the lport is providing */\n" + buf += " u8 lport_proto_id;\n" + buf += " /* Binary World Wide unique Port Name for FC Target Lport */\n" + buf += " u64 lport_wwpn;\n" + buf += " /* ASCII formatted WWPN for FC Target Lport */\n" + buf += " char lport_name[" + fabric_mod_name.upper() + "_NAMELEN];\n" + buf += " /* Returned by " + fabric_mod_name + "_make_lport() */\n" + buf += " struct se_wwn lport_wwn;\n" + buf += "};\n" + + ret = p.write(buf) + if ret: + tcm_mod_err("Unable to write f: " + f) + + p.close() + + fabric_mod_port = "lport" + fabric_mod_init_port = "nport" + + return + +def tcm_mod_build_SAS_include(fabric_mod_dir_var, fabric_mod_name): + global fabric_mod_port + global fabric_mod_init_port + buf = "" + + f = fabric_mod_dir_var + "/" + fabric_mod_name + "_base.h" + print "Writing file: " + f + + p = open(f, 'w'); + if not p: + tcm_mod_err("Unable to open file: " + f) + + buf = "#define " + fabric_mod_name.upper() + "_VERSION \"v0.1\"\n" + buf += "#define " + fabric_mod_name.upper() + "_NAMELEN 32\n" + buf += "\n" + buf += "struct " + fabric_mod_name + "_nacl {\n" + buf += " /* Binary World Wide unique Port Name for SAS Initiator port */\n" + buf += " u64 iport_wwpn;\n" + buf += " /* ASCII formatted WWPN for Sas Initiator port */\n" + buf += " char iport_name[" + fabric_mod_name.upper() + "_NAMELEN];\n" + buf += " /* Returned by " + fabric_mod_name + "_make_nodeacl() */\n" + buf += " struct se_node_acl se_node_acl;\n" + buf += "};\n\n" + buf += "struct " + fabric_mod_name + "_tpg {\n" + buf += " /* SAS port target portal group tag for TCM */\n" + buf += " u16 tport_tpgt;\n" + buf += " /* Pointer back to " + fabric_mod_name + "_tport */\n" + buf += " struct " + fabric_mod_name + "_tport *tport;\n" + buf += " /* Returned by " + fabric_mod_name + "_make_tpg() */\n" + buf += " struct se_portal_group se_tpg;\n" + buf += "};\n\n" + buf += "struct " + fabric_mod_name + "_tport {\n" + buf += " /* SCSI protocol the tport is providing */\n" + buf += " u8 tport_proto_id;\n" + buf += " /* Binary World Wide unique Port Name for SAS Target port */\n" + buf += " u64 tport_wwpn;\n" + buf += " /* ASCII formatted WWPN for SAS Target port */\n" + buf += " char tport_name[" + fabric_mod_name.upper() + "_NAMELEN];\n" + buf += " /* Returned by " + fabric_mod_name + "_make_tport() */\n" + buf += " struct se_wwn tport_wwn;\n" + buf += "};\n" + + ret = p.write(buf) + if ret: + tcm_mod_err("Unable to write f: " + f) + + p.close() + + fabric_mod_port = "tport" + fabric_mod_init_port = "iport" + + return + +def tcm_mod_build_iSCSI_include(fabric_mod_dir_var, fabric_mod_name): + global fabric_mod_port + global fabric_mod_init_port + buf = "" + + f = fabric_mod_dir_var + "/" + fabric_mod_name + "_base.h" + print "Writing file: " + f + + p = open(f, 'w'); + if not p: + tcm_mod_err("Unable to open file: " + f) + + buf = "#define " + fabric_mod_name.upper() + "_VERSION \"v0.1\"\n" + buf += "#define " + fabric_mod_name.upper() + "_NAMELEN 32\n" + buf += "\n" + buf += "struct " + fabric_mod_name + "_nacl {\n" + buf += " /* ASCII formatted InitiatorName */\n" + buf += " char iport_name[" + fabric_mod_name.upper() + "_NAMELEN];\n" + buf += " /* Returned by " + fabric_mod_name + "_make_nodeacl() */\n" + buf += " struct se_node_acl se_node_acl;\n" + buf += "};\n\n" + buf += "struct " + fabric_mod_name + "_tpg {\n" + buf += " /* iSCSI target portal group tag for TCM */\n" + buf += " u16 tport_tpgt;\n" + buf += " /* Pointer back to " + fabric_mod_name + "_tport */\n" + buf += " struct " + fabric_mod_name + "_tport *tport;\n" + buf += " /* Returned by " + fabric_mod_name + "_make_tpg() */\n" + buf += " struct se_portal_group se_tpg;\n" + buf += "};\n\n" + buf += "struct " + fabric_mod_name + "_tport {\n" + buf += " /* SCSI protocol the tport is providing */\n" + buf += " u8 tport_proto_id;\n" + buf += " /* ASCII formatted TargetName for IQN */\n" + buf += " char tport_name[" + fabric_mod_name.upper() + "_NAMELEN];\n" + buf += " /* Returned by " + fabric_mod_name + "_make_tport() */\n" + buf += " struct se_wwn tport_wwn;\n" + buf += "};\n" + + ret = p.write(buf) + if ret: + tcm_mod_err("Unable to write f: " + f) + + p.close() + + fabric_mod_port = "tport" + fabric_mod_init_port = "iport" + + return + +def tcm_mod_build_base_includes(proto_ident, fabric_mod_dir_val, fabric_mod_name): + + if proto_ident == "FC": + tcm_mod_build_FC_include(fabric_mod_dir_val, fabric_mod_name) + elif proto_ident == "SAS": + tcm_mod_build_SAS_include(fabric_mod_dir_val, fabric_mod_name) + elif proto_ident == "iSCSI": + tcm_mod_build_iSCSI_include(fabric_mod_dir_val, fabric_mod_name) + else: + print "Unsupported proto_ident: " + proto_ident + sys.exit(1) + + return + +def tcm_mod_build_configfs(proto_ident, fabric_mod_dir_var, fabric_mod_name): + buf = "" + + f = fabric_mod_dir_var + "/" + fabric_mod_name + "_configfs.c" + print "Writing file: " + f + + p = open(f, 'w'); + if not p: + tcm_mod_err("Unable to open file: " + f) + + buf = "#include \n" + buf += "#include \n" + buf += "#include \n" + buf += "#include \n" + buf += "#include \n" + buf += "#include \n" + buf += "#include \n" + buf += "#include \n" + buf += "#include \n" + buf += "#include \n" + buf += "#include \n" + buf += "#include \n" + buf += "#include \n\n" + buf += "#include \n" + buf += "#include \n" + buf += "#include \n" + buf += "#include \n" + buf += "#include \n" + buf += "#include \n" + buf += "#include \n" + buf += "#include \n" + buf += "#include \n" + buf += "#include \n\n" + buf += "#include <" + fabric_mod_name + "_base.h>\n" + buf += "#include <" + fabric_mod_name + "_fabric.h>\n\n" + + buf += "/* Local pointer to allocated TCM configfs fabric module */\n" + buf += "struct target_fabric_configfs *" + fabric_mod_name + "_fabric_configfs;\n\n" + + buf += "static struct se_node_acl *" + fabric_mod_name + "_make_nodeacl(\n" + buf += " struct se_portal_group *se_tpg,\n" + buf += " struct config_group *group,\n" + buf += " const char *name)\n" + buf += "{\n" + buf += " struct se_node_acl *se_nacl, *se_nacl_new;\n" + buf += " struct " + fabric_mod_name + "_nacl *nacl;\n" + + if proto_ident == "FC" or proto_ident == "SAS": + buf += " u64 wwpn = 0;\n" + + buf += " u32 nexus_depth;\n\n" + buf += " /* " + fabric_mod_name + "_parse_wwn(name, &wwpn, 1) < 0)\n" + buf += " return ERR_PTR(-EINVAL); */\n" + buf += " se_nacl_new = " + fabric_mod_name + "_alloc_fabric_acl(se_tpg);\n" + buf += " if (!(se_nacl_new))\n" + buf += " return ERR_PTR(-ENOMEM);\n" + buf += "//#warning FIXME: Hardcoded nexus depth in " + fabric_mod_name + "_make_nodeacl()\n" + buf += " nexus_depth = 1;\n" + buf += " /*\n" + buf += " * se_nacl_new may be released by core_tpg_add_initiator_node_acl()\n" + buf += " * when converting a NodeACL from demo mode -> explict\n" + buf += " */\n" + buf += " se_nacl = core_tpg_add_initiator_node_acl(se_tpg, se_nacl_new,\n" + buf += " name, nexus_depth);\n" + buf += " if (IS_ERR(se_nacl)) {\n" + buf += " " + fabric_mod_name + "_release_fabric_acl(se_tpg, se_nacl_new);\n" + buf += " return se_nacl;\n" + buf += " }\n" + buf += " /*\n" + buf += " * Locate our struct " + fabric_mod_name + "_nacl and set the FC Nport WWPN\n" + buf += " */\n" + buf += " nacl = container_of(se_nacl, struct " + fabric_mod_name + "_nacl, se_node_acl);\n" + + if proto_ident == "FC" or proto_ident == "SAS": + buf += " nacl->" + fabric_mod_init_port + "_wwpn = wwpn;\n" + + buf += " /* " + fabric_mod_name + "_format_wwn(&nacl->" + fabric_mod_init_port + "_name[0], " + fabric_mod_name.upper() + "_NAMELEN, wwpn); */\n\n" + buf += " return se_nacl;\n" + buf += "}\n\n" + buf += "static void " + fabric_mod_name + "_drop_nodeacl(struct se_node_acl *se_acl)\n" + buf += "{\n" + buf += " struct " + fabric_mod_name + "_nacl *nacl = container_of(se_acl,\n" + buf += " struct " + fabric_mod_name + "_nacl, se_node_acl);\n" + buf += " kfree(nacl);\n" + buf += "}\n\n" + + buf += "static struct se_portal_group *" + fabric_mod_name + "_make_tpg(\n" + buf += " struct se_wwn *wwn,\n" + buf += " struct config_group *group,\n" + buf += " const char *name)\n" + buf += "{\n" + buf += " struct " + fabric_mod_name + "_" + fabric_mod_port + "*" + fabric_mod_port + " = container_of(wwn,\n" + buf += " struct " + fabric_mod_name + "_" + fabric_mod_port + ", " + fabric_mod_port + "_wwn);\n\n" + buf += " struct " + fabric_mod_name + "_tpg *tpg;\n" + buf += " unsigned long tpgt;\n" + buf += " int ret;\n\n" + buf += " if (strstr(name, \"tpgt_\") != name)\n" + buf += " return ERR_PTR(-EINVAL);\n" + buf += " if (strict_strtoul(name + 5, 10, &tpgt) || tpgt > UINT_MAX)\n" + buf += " return ERR_PTR(-EINVAL);\n\n" + buf += " tpg = kzalloc(sizeof(struct " + fabric_mod_name + "_tpg), GFP_KERNEL);\n" + buf += " if (!(tpg)) {\n" + buf += " printk(KERN_ERR \"Unable to allocate struct " + fabric_mod_name + "_tpg\");\n" + buf += " return ERR_PTR(-ENOMEM);\n" + buf += " }\n" + buf += " tpg->" + fabric_mod_port + " = " + fabric_mod_port + ";\n" + buf += " tpg->" + fabric_mod_port + "_tpgt = tpgt;\n\n" + buf += " ret = core_tpg_register(&" + fabric_mod_name + "_fabric_configfs->tf_ops, wwn,\n" + buf += " &tpg->se_tpg, (void *)tpg,\n" + buf += " TRANSPORT_TPG_TYPE_NORMAL);\n" + buf += " if (ret < 0) {\n" + buf += " kfree(tpg);\n" + buf += " return NULL;\n" + buf += " }\n" + buf += " return &tpg->se_tpg;\n" + buf += "}\n\n" + buf += "static void " + fabric_mod_name + "_drop_tpg(struct se_portal_group *se_tpg)\n" + buf += "{\n" + buf += " struct " + fabric_mod_name + "_tpg *tpg = container_of(se_tpg,\n" + buf += " struct " + fabric_mod_name + "_tpg, se_tpg);\n\n" + buf += " core_tpg_deregister(se_tpg);\n" + buf += " kfree(tpg);\n" + buf += "}\n\n" + + buf += "static struct se_wwn *" + fabric_mod_name + "_make_" + fabric_mod_port + "(\n" + buf += " struct target_fabric_configfs *tf,\n" + buf += " struct config_group *group,\n" + buf += " const char *name)\n" + buf += "{\n" + buf += " struct " + fabric_mod_name + "_" + fabric_mod_port + " *" + fabric_mod_port + ";\n" + + if proto_ident == "FC" or proto_ident == "SAS": + buf += " u64 wwpn = 0;\n\n" + + buf += " /* if (" + fabric_mod_name + "_parse_wwn(name, &wwpn, 1) < 0)\n" + buf += " return ERR_PTR(-EINVAL); */\n\n" + buf += " " + fabric_mod_port + " = kzalloc(sizeof(struct " + fabric_mod_name + "_" + fabric_mod_port + "), GFP_KERNEL);\n" + buf += " if (!(" + fabric_mod_port + ")) {\n" + buf += " printk(KERN_ERR \"Unable to allocate struct " + fabric_mod_name + "_" + fabric_mod_port + "\");\n" + buf += " return ERR_PTR(-ENOMEM);\n" + buf += " }\n" + + if proto_ident == "FC" or proto_ident == "SAS": + buf += " " + fabric_mod_port + "->" + fabric_mod_port + "_wwpn = wwpn;\n" + + buf += " /* " + fabric_mod_name + "_format_wwn(&" + fabric_mod_port + "->" + fabric_mod_port + "_name[0], " + fabric_mod_name.upper() + "__NAMELEN, wwpn); */\n\n" + buf += " return &" + fabric_mod_port + "->" + fabric_mod_port + "_wwn;\n" + buf += "}\n\n" + buf += "static void " + fabric_mod_name + "_drop_" + fabric_mod_port + "(struct se_wwn *wwn)\n" + buf += "{\n" + buf += " struct " + fabric_mod_name + "_" + fabric_mod_port + " *" + fabric_mod_port + " = container_of(wwn,\n" + buf += " struct " + fabric_mod_name + "_" + fabric_mod_port + ", " + fabric_mod_port + "_wwn);\n" + buf += " kfree(" + fabric_mod_port + ");\n" + buf += "}\n\n" + buf += "static ssize_t " + fabric_mod_name + "_wwn_show_attr_version(\n" + buf += " struct target_fabric_configfs *tf,\n" + buf += " char *page)\n" + buf += "{\n" + buf += " return sprintf(page, \"" + fabric_mod_name.upper() + " fabric module %s on %s/%s\"\n" + buf += " \"on \"UTS_RELEASE\"\\n\", " + fabric_mod_name.upper() + "_VERSION, utsname()->sysname,\n" + buf += " utsname()->machine);\n" + buf += "}\n\n" + buf += "TF_WWN_ATTR_RO(" + fabric_mod_name + ", version);\n\n" + buf += "static struct configfs_attribute *" + fabric_mod_name + "_wwn_attrs[] = {\n" + buf += " &" + fabric_mod_name + "_wwn_version.attr,\n" + buf += " NULL,\n" + buf += "};\n\n" + + buf += "static struct target_core_fabric_ops " + fabric_mod_name + "_ops = {\n" + buf += " .get_fabric_name = " + fabric_mod_name + "_get_fabric_name,\n" + buf += " .get_fabric_proto_ident = " + fabric_mod_name + "_get_fabric_proto_ident,\n" + buf += " .tpg_get_wwn = " + fabric_mod_name + "_get_fabric_wwn,\n" + buf += " .tpg_get_tag = " + fabric_mod_name + "_get_tag,\n" + buf += " .tpg_get_default_depth = " + fabric_mod_name + "_get_default_depth,\n" + buf += " .tpg_get_pr_transport_id = " + fabric_mod_name + "_get_pr_transport_id,\n" + buf += " .tpg_get_pr_transport_id_len = " + fabric_mod_name + "_get_pr_transport_id_len,\n" + buf += " .tpg_parse_pr_out_transport_id = " + fabric_mod_name + "_parse_pr_out_transport_id,\n" + buf += " .tpg_check_demo_mode = " + fabric_mod_name + "_check_false,\n" + buf += " .tpg_check_demo_mode_cache = " + fabric_mod_name + "_check_true,\n" + buf += " .tpg_check_demo_mode_write_protect = " + fabric_mod_name + "_check_true,\n" + buf += " .tpg_check_prod_mode_write_protect = " + fabric_mod_name + "_check_false,\n" + buf += " .tpg_alloc_fabric_acl = " + fabric_mod_name + "_alloc_fabric_acl,\n" + buf += " .tpg_release_fabric_acl = " + fabric_mod_name + "_release_fabric_acl,\n" + buf += " .tpg_get_inst_index = " + fabric_mod_name + "_tpg_get_inst_index,\n" + buf += " .release_cmd_to_pool = " + fabric_mod_name + "_release_cmd,\n" + buf += " .release_cmd_direct = " + fabric_mod_name + "_release_cmd,\n" + buf += " .shutdown_session = " + fabric_mod_name + "_shutdown_session,\n" + buf += " .close_session = " + fabric_mod_name + "_close_session,\n" + buf += " .stop_session = " + fabric_mod_name + "_stop_session,\n" + buf += " .fall_back_to_erl0 = " + fabric_mod_name + "_reset_nexus,\n" + buf += " .sess_logged_in = " + fabric_mod_name + "_sess_logged_in,\n" + buf += " .sess_get_index = " + fabric_mod_name + "_sess_get_index,\n" + buf += " .sess_get_initiator_sid = NULL,\n" + buf += " .write_pending = " + fabric_mod_name + "_write_pending,\n" + buf += " .write_pending_status = " + fabric_mod_name + "_write_pending_status,\n" + buf += " .set_default_node_attributes = " + fabric_mod_name + "_set_default_node_attrs,\n" + buf += " .get_task_tag = " + fabric_mod_name + "_get_task_tag,\n" + buf += " .get_cmd_state = " + fabric_mod_name + "_get_cmd_state,\n" + buf += " .new_cmd_failure = " + fabric_mod_name + "_new_cmd_failure,\n" + buf += " .queue_data_in = " + fabric_mod_name + "_queue_data_in,\n" + buf += " .queue_status = " + fabric_mod_name + "_queue_status,\n" + buf += " .queue_tm_rsp = " + fabric_mod_name + "_queue_tm_rsp,\n" + buf += " .get_fabric_sense_len = " + fabric_mod_name + "_get_fabric_sense_len,\n" + buf += " .set_fabric_sense_len = " + fabric_mod_name + "_set_fabric_sense_len,\n" + buf += " .is_state_remove = " + fabric_mod_name + "_is_state_remove,\n" + buf += " .pack_lun = " + fabric_mod_name + "_pack_lun,\n" + buf += " /*\n" + buf += " * Setup function pointers for generic logic in target_core_fabric_configfs.c\n" + buf += " */\n" + buf += " .fabric_make_wwn = " + fabric_mod_name + "_make_" + fabric_mod_port + ",\n" + buf += " .fabric_drop_wwn = " + fabric_mod_name + "_drop_" + fabric_mod_port + ",\n" + buf += " .fabric_make_tpg = " + fabric_mod_name + "_make_tpg,\n" + buf += " .fabric_drop_tpg = " + fabric_mod_name + "_drop_tpg,\n" + buf += " .fabric_post_link = NULL,\n" + buf += " .fabric_pre_unlink = NULL,\n" + buf += " .fabric_make_np = NULL,\n" + buf += " .fabric_drop_np = NULL,\n" + buf += " .fabric_make_nodeacl = " + fabric_mod_name + "_make_nodeacl,\n" + buf += " .fabric_drop_nodeacl = " + fabric_mod_name + "_drop_nodeacl,\n" + buf += "};\n\n" + + buf += "static int " + fabric_mod_name + "_register_configfs(void)\n" + buf += "{\n" + buf += " struct target_fabric_configfs *fabric;\n" + buf += " int ret;\n\n" + buf += " printk(KERN_INFO \"" + fabric_mod_name.upper() + " fabric module %s on %s/%s\"\n" + buf += " \" on \"UTS_RELEASE\"\\n\"," + fabric_mod_name.upper() + "_VERSION, utsname()->sysname,\n" + buf += " utsname()->machine);\n" + buf += " /*\n" + buf += " * Register the top level struct config_item_type with TCM core\n" + buf += " */\n" + buf += " fabric = target_fabric_configfs_init(THIS_MODULE, \"" + fabric_mod_name[4:] + "\");\n" + buf += " if (!(fabric)) {\n" + buf += " printk(KERN_ERR \"target_fabric_configfs_init() failed\\n\");\n" + buf += " return -ENOMEM;\n" + buf += " }\n" + buf += " /*\n" + buf += " * Setup fabric->tf_ops from our local " + fabric_mod_name + "_ops\n" + buf += " */\n" + buf += " fabric->tf_ops = " + fabric_mod_name + "_ops;\n" + buf += " /*\n" + buf += " * Setup default attribute lists for various fabric->tf_cit_tmpl\n" + buf += " */\n" + buf += " TF_CIT_TMPL(fabric)->tfc_wwn_cit.ct_attrs = " + fabric_mod_name + "_wwn_attrs;\n" + buf += " TF_CIT_TMPL(fabric)->tfc_tpg_base_cit.ct_attrs = NULL;\n" + buf += " TF_CIT_TMPL(fabric)->tfc_tpg_attrib_cit.ct_attrs = NULL;\n" + buf += " TF_CIT_TMPL(fabric)->tfc_tpg_param_cit.ct_attrs = NULL;\n" + buf += " TF_CIT_TMPL(fabric)->tfc_tpg_np_base_cit.ct_attrs = NULL;\n" + buf += " TF_CIT_TMPL(fabric)->tfc_tpg_nacl_base_cit.ct_attrs = NULL;\n" + buf += " TF_CIT_TMPL(fabric)->tfc_tpg_nacl_attrib_cit.ct_attrs = NULL;\n" + buf += " TF_CIT_TMPL(fabric)->tfc_tpg_nacl_auth_cit.ct_attrs = NULL;\n" + buf += " TF_CIT_TMPL(fabric)->tfc_tpg_nacl_param_cit.ct_attrs = NULL;\n" + buf += " /*\n" + buf += " * Register the fabric for use within TCM\n" + buf += " */\n" + buf += " ret = target_fabric_configfs_register(fabric);\n" + buf += " if (ret < 0) {\n" + buf += " printk(KERN_ERR \"target_fabric_configfs_register() failed\"\n" + buf += " \" for " + fabric_mod_name.upper() + "\\n\");\n" + buf += " return ret;\n" + buf += " }\n" + buf += " /*\n" + buf += " * Setup our local pointer to *fabric\n" + buf += " */\n" + buf += " " + fabric_mod_name + "_fabric_configfs = fabric;\n" + buf += " printk(KERN_INFO \"" + fabric_mod_name.upper() + "[0] - Set fabric -> " + fabric_mod_name + "_fabric_configfs\\n\");\n" + buf += " return 0;\n" + buf += "};\n\n" + buf += "static void " + fabric_mod_name + "_deregister_configfs(void)\n" + buf += "{\n" + buf += " if (!(" + fabric_mod_name + "_fabric_configfs))\n" + buf += " return;\n\n" + buf += " target_fabric_configfs_deregister(" + fabric_mod_name + "_fabric_configfs);\n" + buf += " " + fabric_mod_name + "_fabric_configfs = NULL;\n" + buf += " printk(KERN_INFO \"" + fabric_mod_name.upper() + "[0] - Cleared " + fabric_mod_name + "_fabric_configfs\\n\");\n" + buf += "};\n\n" + + buf += "static int __init " + fabric_mod_name + "_init(void)\n" + buf += "{\n" + buf += " int ret;\n\n" + buf += " ret = " + fabric_mod_name + "_register_configfs();\n" + buf += " if (ret < 0)\n" + buf += " return ret;\n\n" + buf += " return 0;\n" + buf += "};\n\n" + buf += "static void " + fabric_mod_name + "_exit(void)\n" + buf += "{\n" + buf += " " + fabric_mod_name + "_deregister_configfs();\n" + buf += "};\n\n" + + buf += "#ifdef MODULE\n" + buf += "MODULE_DESCRIPTION(\"" + fabric_mod_name.upper() + " series fabric driver\");\n" + buf += "MODULE_LICENSE(\"GPL\");\n" + buf += "module_init(" + fabric_mod_name + "_init);\n" + buf += "module_exit(" + fabric_mod_name + "_exit);\n" + buf += "#endif\n" + + ret = p.write(buf) + if ret: + tcm_mod_err("Unable to write f: " + f) + + p.close() + + return + +def tcm_mod_scan_fabric_ops(tcm_dir): + + fabric_ops_api = tcm_dir + "include/target/target_core_fabric_ops.h" + + print "Using tcm_mod_scan_fabric_ops: " + fabric_ops_api + process_fo = 0; + + p = open(fabric_ops_api, 'r') + + line = p.readline() + while line: + if process_fo == 0 and re.search('struct target_core_fabric_ops {', line): + line = p.readline() + continue + + if process_fo == 0: + process_fo = 1; + line = p.readline() + # Search for function pointer + if not re.search('\(\*', line): + continue + + fabric_ops.append(line.rstrip()) + continue + + line = p.readline() + # Search for function pointer + if not re.search('\(\*', line): + continue + + fabric_ops.append(line.rstrip()) + + p.close() + return + +def tcm_mod_dump_fabric_ops(proto_ident, fabric_mod_dir_var, fabric_mod_name): + buf = "" + bufi = "" + + f = fabric_mod_dir_var + "/" + fabric_mod_name + "_fabric.c" + print "Writing file: " + f + + p = open(f, 'w') + if not p: + tcm_mod_err("Unable to open file: " + f) + + fi = fabric_mod_dir_var + "/" + fabric_mod_name + "_fabric.h" + print "Writing file: " + fi + + pi = open(fi, 'w') + if not pi: + tcm_mod_err("Unable to open file: " + fi) + + buf = "#include \n" + buf += "#include \n" + buf += "#include \n" + buf += "#include \n" + buf += "#include \n" + buf += "#include \n" + buf += "#include \n" + buf += "#include \n" + buf += "#include \n" + buf += "#include \n" + buf += "#include \n" + buf += "#include \n" + buf += "#include \n\n" + buf += "#include \n" + buf += "#include \n" + buf += "#include \n" + buf += "#include \n" + buf += "#include \n" + buf += "#include \n" + buf += "#include \n" + buf += "#include <" + fabric_mod_name + "_base.h>\n" + buf += "#include <" + fabric_mod_name + "_fabric.h>\n\n" + + buf += "int " + fabric_mod_name + "_check_true(struct se_portal_group *se_tpg)\n" + buf += "{\n" + buf += " return 1;\n" + buf += "}\n\n" + bufi += "int " + fabric_mod_name + "_check_true(struct se_portal_group *);\n" + + buf += "int " + fabric_mod_name + "_check_false(struct se_portal_group *se_tpg)\n" + buf += "{\n" + buf += " return 0;\n" + buf += "}\n\n" + bufi += "int " + fabric_mod_name + "_check_false(struct se_portal_group *);\n" + + total_fabric_ops = len(fabric_ops) + i = 0 + + while i < total_fabric_ops: + fo = fabric_ops[i] + i += 1 +# print "fabric_ops: " + fo + + if re.search('get_fabric_name', fo): + buf += "char *" + fabric_mod_name + "_get_fabric_name(void)\n" + buf += "{\n" + buf += " return \"" + fabric_mod_name[4:] + "\";\n" + buf += "}\n\n" + bufi += "char *" + fabric_mod_name + "_get_fabric_name(void);\n" + continue + + if re.search('get_fabric_proto_ident', fo): + buf += "u8 " + fabric_mod_name + "_get_fabric_proto_ident(struct se_portal_group *se_tpg)\n" + buf += "{\n" + buf += " struct " + fabric_mod_name + "_tpg *tpg = container_of(se_tpg,\n" + buf += " struct " + fabric_mod_name + "_tpg, se_tpg);\n" + buf += " struct " + fabric_mod_name + "_" + fabric_mod_port + " *" + fabric_mod_port + " = tpg->" + fabric_mod_port + ";\n" + buf += " u8 proto_id;\n\n" + buf += " switch (" + fabric_mod_port + "->" + fabric_mod_port + "_proto_id) {\n" + if proto_ident == "FC": + buf += " case SCSI_PROTOCOL_FCP:\n" + buf += " default:\n" + buf += " proto_id = fc_get_fabric_proto_ident(se_tpg);\n" + buf += " break;\n" + elif proto_ident == "SAS": + buf += " case SCSI_PROTOCOL_SAS:\n" + buf += " default:\n" + buf += " proto_id = sas_get_fabric_proto_ident(se_tpg);\n" + buf += " break;\n" + elif proto_ident == "iSCSI": + buf += " case SCSI_PROTOCOL_ISCSI:\n" + buf += " default:\n" + buf += " proto_id = iscsi_get_fabric_proto_ident(se_tpg);\n" + buf += " break;\n" + + buf += " }\n\n" + buf += " return proto_id;\n" + buf += "}\n\n" + bufi += "u8 " + fabric_mod_name + "_get_fabric_proto_ident(struct se_portal_group *);\n" + + if re.search('get_wwn', fo): + buf += "char *" + fabric_mod_name + "_get_fabric_wwn(struct se_portal_group *se_tpg)\n" + buf += "{\n" + buf += " struct " + fabric_mod_name + "_tpg *tpg = container_of(se_tpg,\n" + buf += " struct " + fabric_mod_name + "_tpg, se_tpg);\n" + buf += " struct " + fabric_mod_name + "_" + fabric_mod_port + " *" + fabric_mod_port + " = tpg->" + fabric_mod_port + ";\n\n" + buf += " return &" + fabric_mod_port + "->" + fabric_mod_port + "_name[0];\n" + buf += "}\n\n" + bufi += "char *" + fabric_mod_name + "_get_fabric_wwn(struct se_portal_group *);\n" + + if re.search('get_tag', fo): + buf += "u16 " + fabric_mod_name + "_get_tag(struct se_portal_group *se_tpg)\n" + buf += "{\n" + buf += " struct " + fabric_mod_name + "_tpg *tpg = container_of(se_tpg,\n" + buf += " struct " + fabric_mod_name + "_tpg, se_tpg);\n" + buf += " return tpg->" + fabric_mod_port + "_tpgt;\n" + buf += "}\n\n" + bufi += "u16 " + fabric_mod_name + "_get_tag(struct se_portal_group *);\n" + + if re.search('get_default_depth', fo): + buf += "u32 " + fabric_mod_name + "_get_default_depth(struct se_portal_group *se_tpg)\n" + buf += "{\n" + buf += " return 1;\n" + buf += "}\n\n" + bufi += "u32 " + fabric_mod_name + "_get_default_depth(struct se_portal_group *);\n" + + if re.search('get_pr_transport_id\)\(', fo): + buf += "u32 " + fabric_mod_name + "_get_pr_transport_id(\n" + buf += " struct se_portal_group *se_tpg,\n" + buf += " struct se_node_acl *se_nacl,\n" + buf += " struct t10_pr_registration *pr_reg,\n" + buf += " int *format_code,\n" + buf += " unsigned char *buf)\n" + buf += "{\n" + buf += " struct " + fabric_mod_name + "_tpg *tpg = container_of(se_tpg,\n" + buf += " struct " + fabric_mod_name + "_tpg, se_tpg);\n" + buf += " struct " + fabric_mod_name + "_" + fabric_mod_port + " *" + fabric_mod_port + " = tpg->" + fabric_mod_port + ";\n" + buf += " int ret = 0;\n\n" + buf += " switch (" + fabric_mod_port + "->" + fabric_mod_port + "_proto_id) {\n" + if proto_ident == "FC": + buf += " case SCSI_PROTOCOL_FCP:\n" + buf += " default:\n" + buf += " ret = fc_get_pr_transport_id(se_tpg, se_nacl, pr_reg,\n" + buf += " format_code, buf);\n" + buf += " break;\n" + elif proto_ident == "SAS": + buf += " case SCSI_PROTOCOL_SAS:\n" + buf += " default:\n" + buf += " ret = sas_get_pr_transport_id(se_tpg, se_nacl, pr_reg,\n" + buf += " format_code, buf);\n" + buf += " break;\n" + elif proto_ident == "iSCSI": + buf += " case SCSI_PROTOCOL_ISCSI:\n" + buf += " default:\n" + buf += " ret = iscsi_get_pr_transport_id(se_tpg, se_nacl, pr_reg,\n" + buf += " format_code, buf);\n" + buf += " break;\n" + + buf += " }\n\n" + buf += " return ret;\n" + buf += "}\n\n" + bufi += "u32 " + fabric_mod_name + "_get_pr_transport_id(struct se_portal_group *,\n" + bufi += " struct se_node_acl *, struct t10_pr_registration *,\n" + bufi += " int *, unsigned char *);\n" + + if re.search('get_pr_transport_id_len\)\(', fo): + buf += "u32 " + fabric_mod_name + "_get_pr_transport_id_len(\n" + buf += " struct se_portal_group *se_tpg,\n" + buf += " struct se_node_acl *se_nacl,\n" + buf += " struct t10_pr_registration *pr_reg,\n" + buf += " int *format_code)\n" + buf += "{\n" + buf += " struct " + fabric_mod_name + "_tpg *tpg = container_of(se_tpg,\n" + buf += " struct " + fabric_mod_name + "_tpg, se_tpg);\n" + buf += " struct " + fabric_mod_name + "_" + fabric_mod_port + " *" + fabric_mod_port + " = tpg->" + fabric_mod_port + ";\n" + buf += " int ret = 0;\n\n" + buf += " switch (" + fabric_mod_port + "->" + fabric_mod_port + "_proto_id) {\n" + if proto_ident == "FC": + buf += " case SCSI_PROTOCOL_FCP:\n" + buf += " default:\n" + buf += " ret = fc_get_pr_transport_id_len(se_tpg, se_nacl, pr_reg,\n" + buf += " format_code);\n" + buf += " break;\n" + elif proto_ident == "SAS": + buf += " case SCSI_PROTOCOL_SAS:\n" + buf += " default:\n" + buf += " ret = sas_get_pr_transport_id_len(se_tpg, se_nacl, pr_reg,\n" + buf += " format_code);\n" + buf += " break;\n" + elif proto_ident == "iSCSI": + buf += " case SCSI_PROTOCOL_ISCSI:\n" + buf += " default:\n" + buf += " ret = iscsi_get_pr_transport_id_len(se_tpg, se_nacl, pr_reg,\n" + buf += " format_code);\n" + buf += " break;\n" + + + buf += " }\n\n" + buf += " return ret;\n" + buf += "}\n\n" + bufi += "u32 " + fabric_mod_name + "_get_pr_transport_id_len(struct se_portal_group *,\n" + bufi += " struct se_node_acl *, struct t10_pr_registration *,\n" + bufi += " int *);\n" + + if re.search('parse_pr_out_transport_id\)\(', fo): + buf += "char *" + fabric_mod_name + "_parse_pr_out_transport_id(\n" + buf += " struct se_portal_group *se_tpg,\n" + buf += " const char *buf,\n" + buf += " u32 *out_tid_len,\n" + buf += " char **port_nexus_ptr)\n" + buf += "{\n" + buf += " struct " + fabric_mod_name + "_tpg *tpg = container_of(se_tpg,\n" + buf += " struct " + fabric_mod_name + "_tpg, se_tpg);\n" + buf += " struct " + fabric_mod_name + "_" + fabric_mod_port + " *" + fabric_mod_port + " = tpg->" + fabric_mod_port + ";\n" + buf += " char *tid = NULL;\n\n" + buf += " switch (" + fabric_mod_port + "->" + fabric_mod_port + "_proto_id) {\n" + if proto_ident == "FC": + buf += " case SCSI_PROTOCOL_FCP:\n" + buf += " default:\n" + buf += " tid = fc_parse_pr_out_transport_id(se_tpg, buf, out_tid_len,\n" + buf += " port_nexus_ptr);\n" + elif proto_ident == "SAS": + buf += " case SCSI_PROTOCOL_SAS:\n" + buf += " default:\n" + buf += " tid = sas_parse_pr_out_transport_id(se_tpg, buf, out_tid_len,\n" + buf += " port_nexus_ptr);\n" + elif proto_ident == "iSCSI": + buf += " case SCSI_PROTOCOL_ISCSI:\n" + buf += " default:\n" + buf += " tid = iscsi_parse_pr_out_transport_id(se_tpg, buf, out_tid_len,\n" + buf += " port_nexus_ptr);\n" + + buf += " }\n\n" + buf += " return tid;\n" + buf += "}\n\n" + bufi += "char *" + fabric_mod_name + "_parse_pr_out_transport_id(struct se_portal_group *,\n" + bufi += " const char *, u32 *, char **);\n" + + if re.search('alloc_fabric_acl\)\(', fo): + buf += "struct se_node_acl *" + fabric_mod_name + "_alloc_fabric_acl(struct se_portal_group *se_tpg)\n" + buf += "{\n" + buf += " struct " + fabric_mod_name + "_nacl *nacl;\n\n" + buf += " nacl = kzalloc(sizeof(struct " + fabric_mod_name + "_nacl), GFP_KERNEL);\n" + buf += " if (!(nacl)) {\n" + buf += " printk(KERN_ERR \"Unable to alocate struct " + fabric_mod_name + "_nacl\\n\");\n" + buf += " return NULL;\n" + buf += " }\n\n" + buf += " return &nacl->se_node_acl;\n" + buf += "}\n\n" + bufi += "struct se_node_acl *" + fabric_mod_name + "_alloc_fabric_acl(struct se_portal_group *);\n" + + if re.search('release_fabric_acl\)\(', fo): + buf += "void " + fabric_mod_name + "_release_fabric_acl(\n" + buf += " struct se_portal_group *se_tpg,\n" + buf += " struct se_node_acl *se_nacl)\n" + buf += "{\n" + buf += " struct " + fabric_mod_name + "_nacl *nacl = container_of(se_nacl,\n" + buf += " struct " + fabric_mod_name + "_nacl, se_node_acl);\n" + buf += " kfree(nacl);\n" + buf += "}\n\n" + bufi += "void " + fabric_mod_name + "_release_fabric_acl(struct se_portal_group *,\n" + bufi += " struct se_node_acl *);\n" + + if re.search('tpg_get_inst_index\)\(', fo): + buf += "u32 " + fabric_mod_name + "_tpg_get_inst_index(struct se_portal_group *se_tpg)\n" + buf += "{\n" + buf += " return 1;\n" + buf += "}\n\n" + bufi += "u32 " + fabric_mod_name + "_tpg_get_inst_index(struct se_portal_group *);\n" + + if re.search('release_cmd_to_pool', fo): + buf += "void " + fabric_mod_name + "_release_cmd(struct se_cmd *se_cmd)\n" + buf += "{\n" + buf += " return;\n" + buf += "}\n\n" + bufi += "void " + fabric_mod_name + "_release_cmd(struct se_cmd *);\n" + + if re.search('shutdown_session\)\(', fo): + buf += "int " + fabric_mod_name + "_shutdown_session(struct se_session *se_sess)\n" + buf += "{\n" + buf += " return 0;\n" + buf += "}\n\n" + bufi += "int " + fabric_mod_name + "_shutdown_session(struct se_session *);\n" + + if re.search('close_session\)\(', fo): + buf += "void " + fabric_mod_name + "_close_session(struct se_session *se_sess)\n" + buf += "{\n" + buf += " return;\n" + buf += "}\n\n" + bufi += "void " + fabric_mod_name + "_close_session(struct se_session *);\n" + + if re.search('stop_session\)\(', fo): + buf += "void " + fabric_mod_name + "_stop_session(struct se_session *se_sess, int sess_sleep , int conn_sleep)\n" + buf += "{\n" + buf += " return;\n" + buf += "}\n\n" + bufi += "void " + fabric_mod_name + "_stop_session(struct se_session *, int, int);\n" + + if re.search('fall_back_to_erl0\)\(', fo): + buf += "void " + fabric_mod_name + "_reset_nexus(struct se_session *se_sess)\n" + buf += "{\n" + buf += " return;\n" + buf += "}\n\n" + bufi += "void " + fabric_mod_name + "_reset_nexus(struct se_session *);\n" + + if re.search('sess_logged_in\)\(', fo): + buf += "int " + fabric_mod_name + "_sess_logged_in(struct se_session *se_sess)\n" + buf += "{\n" + buf += " return 0;\n" + buf += "}\n\n" + bufi += "int " + fabric_mod_name + "_sess_logged_in(struct se_session *);\n" + + if re.search('sess_get_index\)\(', fo): + buf += "u32 " + fabric_mod_name + "_sess_get_index(struct se_session *se_sess)\n" + buf += "{\n" + buf += " return 0;\n" + buf += "}\n\n" + bufi += "u32 " + fabric_mod_name + "_sess_get_index(struct se_session *);\n" + + if re.search('write_pending\)\(', fo): + buf += "int " + fabric_mod_name + "_write_pending(struct se_cmd *se_cmd)\n" + buf += "{\n" + buf += " return 0;\n" + buf += "}\n\n" + bufi += "int " + fabric_mod_name + "_write_pending(struct se_cmd *);\n" + + if re.search('write_pending_status\)\(', fo): + buf += "int " + fabric_mod_name + "_write_pending_status(struct se_cmd *se_cmd)\n" + buf += "{\n" + buf += " return 0;\n" + buf += "}\n\n" + bufi += "int " + fabric_mod_name + "_write_pending_status(struct se_cmd *);\n" + + if re.search('set_default_node_attributes\)\(', fo): + buf += "void " + fabric_mod_name + "_set_default_node_attrs(struct se_node_acl *nacl)\n" + buf += "{\n" + buf += " return;\n" + buf += "}\n\n" + bufi += "void " + fabric_mod_name + "_set_default_node_attrs(struct se_node_acl *);\n" + + if re.search('get_task_tag\)\(', fo): + buf += "u32 " + fabric_mod_name + "_get_task_tag(struct se_cmd *se_cmd)\n" + buf += "{\n" + buf += " return 0;\n" + buf += "}\n\n" + bufi += "u32 " + fabric_mod_name + "_get_task_tag(struct se_cmd *);\n" + + if re.search('get_cmd_state\)\(', fo): + buf += "int " + fabric_mod_name + "_get_cmd_state(struct se_cmd *se_cmd)\n" + buf += "{\n" + buf += " return 0;\n" + buf += "}\n\n" + bufi += "int " + fabric_mod_name + "_get_cmd_state(struct se_cmd *);\n" + + if re.search('new_cmd_failure\)\(', fo): + buf += "void " + fabric_mod_name + "_new_cmd_failure(struct se_cmd *se_cmd)\n" + buf += "{\n" + buf += " return;\n" + buf += "}\n\n" + bufi += "void " + fabric_mod_name + "_new_cmd_failure(struct se_cmd *);\n" + + if re.search('queue_data_in\)\(', fo): + buf += "int " + fabric_mod_name + "_queue_data_in(struct se_cmd *se_cmd)\n" + buf += "{\n" + buf += " return 0;\n" + buf += "}\n\n" + bufi += "int " + fabric_mod_name + "_queue_data_in(struct se_cmd *);\n" + + if re.search('queue_status\)\(', fo): + buf += "int " + fabric_mod_name + "_queue_status(struct se_cmd *se_cmd)\n" + buf += "{\n" + buf += " return 0;\n" + buf += "}\n\n" + bufi += "int " + fabric_mod_name + "_queue_status(struct se_cmd *);\n" + + if re.search('queue_tm_rsp\)\(', fo): + buf += "int " + fabric_mod_name + "_queue_tm_rsp(struct se_cmd *se_cmd)\n" + buf += "{\n" + buf += " return 0;\n" + buf += "}\n\n" + bufi += "int " + fabric_mod_name + "_queue_tm_rsp(struct se_cmd *);\n" + + if re.search('get_fabric_sense_len\)\(', fo): + buf += "u16 " + fabric_mod_name + "_get_fabric_sense_len(void)\n" + buf += "{\n" + buf += " return 0;\n" + buf += "}\n\n" + bufi += "u16 " + fabric_mod_name + "_get_fabric_sense_len(void);\n" + + if re.search('set_fabric_sense_len\)\(', fo): + buf += "u16 " + fabric_mod_name + "_set_fabric_sense_len(struct se_cmd *se_cmd, u32 sense_length)\n" + buf += "{\n" + buf += " return 0;\n" + buf += "}\n\n" + bufi += "u16 " + fabric_mod_name + "_set_fabric_sense_len(struct se_cmd *, u32);\n" + + if re.search('is_state_remove\)\(', fo): + buf += "int " + fabric_mod_name + "_is_state_remove(struct se_cmd *se_cmd)\n" + buf += "{\n" + buf += " return 0;\n" + buf += "}\n\n" + bufi += "int " + fabric_mod_name + "_is_state_remove(struct se_cmd *);\n" + + if re.search('pack_lun\)\(', fo): + buf += "u64 " + fabric_mod_name + "_pack_lun(unsigned int lun)\n" + buf += "{\n" + buf += " WARN_ON(lun >= 256);\n" + buf += " /* Caller wants this byte-swapped */\n" + buf += " return cpu_to_le64((lun & 0xff) << 8);\n" + buf += "}\n\n" + bufi += "u64 " + fabric_mod_name + "_pack_lun(unsigned int);\n" + + + ret = p.write(buf) + if ret: + tcm_mod_err("Unable to write f: " + f) + + p.close() + + ret = pi.write(bufi) + if ret: + tcm_mod_err("Unable to write fi: " + fi) + + pi.close() + return + +def tcm_mod_build_kbuild(fabric_mod_dir_var, fabric_mod_name): + + buf = "" + f = fabric_mod_dir_var + "/Kbuild" + print "Writing file: " + f + + p = open(f, 'w') + if not p: + tcm_mod_err("Unable to open file: " + f) + + buf = "EXTRA_CFLAGS += -I$(srctree)/drivers/target/ -I$(srctree)/include/ -I$(srctree)/drivers/scsi/ -I$(srctree)/include/scsi/ -I$(srctree)/drivers/target/" + fabric_mod_name + "\n\n" + buf += fabric_mod_name + "-objs := " + fabric_mod_name + "_fabric.o \\\n" + buf += " " + fabric_mod_name + "_configfs.o\n" + buf += "obj-$(CONFIG_" + fabric_mod_name.upper() + ") += " + fabric_mod_name + ".o\n" + + ret = p.write(buf) + if ret: + tcm_mod_err("Unable to write f: " + f) + + p.close() + return + +def tcm_mod_build_kconfig(fabric_mod_dir_var, fabric_mod_name): + + buf = "" + f = fabric_mod_dir_var + "/Kconfig" + print "Writing file: " + f + + p = open(f, 'w') + if not p: + tcm_mod_err("Unable to open file: " + f) + + buf = "config " + fabric_mod_name.upper() + "\n" + buf += " tristate \"" + fabric_mod_name.upper() + " fabric module\"\n" + buf += " depends on TARGET_CORE && CONFIGFS_FS\n" + buf += " default n\n" + buf += " ---help---\n" + buf += " Say Y here to enable the " + fabric_mod_name.upper() + " fabric module\n" + + ret = p.write(buf) + if ret: + tcm_mod_err("Unable to write f: " + f) + + p.close() + return + +def tcm_mod_add_kbuild(tcm_dir, fabric_mod_name): + buf = "obj-$(CONFIG_" + fabric_mod_name.upper() + ") += " + fabric_mod_name.lower() + "/\n" + kbuild = tcm_dir + "/drivers/target/Kbuild" + + f = open(kbuild, 'a') + f.write(buf) + f.close() + return + +def tcm_mod_add_kconfig(tcm_dir, fabric_mod_name): + buf = "source \"drivers/target/" + fabric_mod_name.lower() + "/Kconfig\"\n" + kconfig = tcm_dir + "/drivers/target/Kconfig" + + f = open(kconfig, 'a') + f.write(buf) + f.close() + return + +def main(modname, proto_ident): +# proto_ident = "FC" +# proto_ident = "SAS" +# proto_ident = "iSCSI" + + tcm_dir = os.getcwd(); + tcm_dir += "/../../" + print "tcm_dir: " + tcm_dir + fabric_mod_name = modname + fabric_mod_dir = tcm_dir + "drivers/target/" + fabric_mod_name + print "Set fabric_mod_name: " + fabric_mod_name + print "Set fabric_mod_dir: " + fabric_mod_dir + print "Using proto_ident: " + proto_ident + + if proto_ident != "FC" and proto_ident != "SAS" and proto_ident != "iSCSI": + print "Unsupported proto_ident: " + proto_ident + sys.exit(1) + + ret = tcm_mod_create_module_subdir(fabric_mod_dir) + if ret: + print "tcm_mod_create_module_subdir() failed because module already exists!" + sys.exit(1) + + tcm_mod_build_base_includes(proto_ident, fabric_mod_dir, fabric_mod_name) + tcm_mod_scan_fabric_ops(tcm_dir) + tcm_mod_dump_fabric_ops(proto_ident, fabric_mod_dir, fabric_mod_name) + tcm_mod_build_configfs(proto_ident, fabric_mod_dir, fabric_mod_name) + tcm_mod_build_kbuild(fabric_mod_dir, fabric_mod_name) + tcm_mod_build_kconfig(fabric_mod_dir, fabric_mod_name) + + input = raw_input("Would you like to add " + fabric_mod_name + "to drivers/target/Kbuild..? [yes,no]: ") + if input == "yes" or input == "y": + tcm_mod_add_kbuild(tcm_dir, fabric_mod_name) + + input = raw_input("Would you like to add " + fabric_mod_name + "to drivers/target/Kconfig..? [yes,no]: ") + if input == "yes" or input == "y": + tcm_mod_add_kconfig(tcm_dir, fabric_mod_name) + + return + +parser = optparse.OptionParser() +parser.add_option('-m', '--modulename', help='Module name', dest='modname', + action='store', nargs=1, type='string') +parser.add_option('-p', '--protoident', help='Protocol Ident', dest='protoident', + action='store', nargs=1, type='string') + +(opts, args) = parser.parse_args() + +mandatories = ['modname', 'protoident'] +for m in mandatories: + if not opts.__dict__[m]: + print "mandatory option is missing\n" + parser.print_help() + exit(-1) + +if __name__ == "__main__": + + main(str(opts.modname), opts.protoident) diff --git a/Documentation/target/tcm_mod_builder.txt b/Documentation/target/tcm_mod_builder.txt new file mode 100644 index 000000000000..84533d8e747f --- /dev/null +++ b/Documentation/target/tcm_mod_builder.txt @@ -0,0 +1,145 @@ +>>>>>>>>>> The TCM v4 fabric module script generator <<<<<<<<<< + +Greetings all, + +This document is intended to be a mini-HOWTO for using the tcm_mod_builder.py +script to generate a brand new functional TCM v4 fabric .ko module of your very own, +that once built can be immediately be loaded to start access the new TCM/ConfigFS +fabric skeleton, by simply using: + + modprobe $TCM_NEW_MOD + mkdir -p /sys/kernel/config/target/$TCM_NEW_MOD + +This script will create a new drivers/target/$TCM_NEW_MOD/, and will do the following + + *) Generate new API callers for drivers/target/target_core_fabric_configs.c logic + ->make_nodeacl(), ->drop_nodeacl(), ->make_tpg(), ->drop_tpg() + ->make_wwn(), ->drop_wwn(). These are created into $TCM_NEW_MOD/$TCM_NEW_MOD_configfs.c + *) Generate basic infrastructure for loading/unloading LKMs and TCM/ConfigFS fabric module + using a skeleton struct target_core_fabric_ops API template. + *) Based on user defined T10 Proto_Ident for the new fabric module being built, + the TransportID / Initiator and Target WWPN related handlers for + SPC-3 persistent reservation are automatically generated in $TCM_NEW_MOD/$TCM_NEW_MOD_fabric.c + using drivers/target/target_core_fabric_lib.c logic. + *) NOP API calls for all other Data I/O path and fabric dependent attribute logic + in $TCM_NEW_MOD/$TCM_NEW_MOD_fabric.c + +tcm_mod_builder.py depends upon the mandatory '-p $PROTO_IDENT' and '-m +$FABRIC_MOD_name' parameters, and actually running the script looks like: + +target:/mnt/sdb/lio-core-2.6.git/Documentation/target# python tcm_mod_builder.py -p iSCSI -m tcm_nab5000 +tcm_dir: /mnt/sdb/lio-core-2.6.git/Documentation/target/../../ +Set fabric_mod_name: tcm_nab5000 +Set fabric_mod_dir: +/mnt/sdb/lio-core-2.6.git/Documentation/target/../../drivers/target/tcm_nab5000 +Using proto_ident: iSCSI +Creating fabric_mod_dir: +/mnt/sdb/lio-core-2.6.git/Documentation/target/../../drivers/target/tcm_nab5000 +Writing file: +/mnt/sdb/lio-core-2.6.git/Documentation/target/../../drivers/target/tcm_nab5000/tcm_nab5000_base.h +Using tcm_mod_scan_fabric_ops: +/mnt/sdb/lio-core-2.6.git/Documentation/target/../../include/target/target_core_fabric_ops.h +Writing file: +/mnt/sdb/lio-core-2.6.git/Documentation/target/../../drivers/target/tcm_nab5000/tcm_nab5000_fabric.c +Writing file: +/mnt/sdb/lio-core-2.6.git/Documentation/target/../../drivers/target/tcm_nab5000/tcm_nab5000_fabric.h +Writing file: +/mnt/sdb/lio-core-2.6.git/Documentation/target/../../drivers/target/tcm_nab5000/tcm_nab5000_configfs.c +Writing file: +/mnt/sdb/lio-core-2.6.git/Documentation/target/../../drivers/target/tcm_nab5000/Kbuild +Writing file: +/mnt/sdb/lio-core-2.6.git/Documentation/target/../../drivers/target/tcm_nab5000/Kconfig +Would you like to add tcm_nab5000to drivers/target/Kbuild..? [yes,no]: yes +Would you like to add tcm_nab5000to drivers/target/Kconfig..? [yes,no]: yes + +At the end of tcm_mod_builder.py. the script will ask to add the following +line to drivers/target/Kbuild: + + obj-$(CONFIG_TCM_NAB5000) += tcm_nab5000/ + +and the same for drivers/target/Kconfig: + + source "drivers/target/tcm_nab5000/Kconfig" + +*) Run 'make menuconfig' and select the new CONFIG_TCM_NAB5000 item: + + TCM_NAB5000 fabric module + +*) Build using 'make modules', once completed you will have: + +target:/mnt/sdb/lio-core-2.6.git# ls -la drivers/target/tcm_nab5000/ +total 1348 +drwxr-xr-x 2 root root 4096 2010-10-05 03:23 . +drwxr-xr-x 9 root root 4096 2010-10-05 03:22 .. +-rw-r--r-- 1 root root 282 2010-10-05 03:22 Kbuild +-rw-r--r-- 1 root root 171 2010-10-05 03:22 Kconfig +-rw-r--r-- 1 root root 49 2010-10-05 03:23 modules.order +-rw-r--r-- 1 root root 738 2010-10-05 03:22 tcm_nab5000_base.h +-rw-r--r-- 1 root root 9096 2010-10-05 03:22 tcm_nab5000_configfs.c +-rw-r--r-- 1 root root 191200 2010-10-05 03:23 tcm_nab5000_configfs.o +-rw-r--r-- 1 root root 40504 2010-10-05 03:23 .tcm_nab5000_configfs.o.cmd +-rw-r--r-- 1 root root 5414 2010-10-05 03:22 tcm_nab5000_fabric.c +-rw-r--r-- 1 root root 2016 2010-10-05 03:22 tcm_nab5000_fabric.h +-rw-r--r-- 1 root root 190932 2010-10-05 03:23 tcm_nab5000_fabric.o +-rw-r--r-- 1 root root 40713 2010-10-05 03:23 .tcm_nab5000_fabric.o.cmd +-rw-r--r-- 1 root root 401861 2010-10-05 03:23 tcm_nab5000.ko +-rw-r--r-- 1 root root 265 2010-10-05 03:23 .tcm_nab5000.ko.cmd +-rw-r--r-- 1 root root 459 2010-10-05 03:23 tcm_nab5000.mod.c +-rw-r--r-- 1 root root 23896 2010-10-05 03:23 tcm_nab5000.mod.o +-rw-r--r-- 1 root root 22655 2010-10-05 03:23 .tcm_nab5000.mod.o.cmd +-rw-r--r-- 1 root root 379022 2010-10-05 03:23 tcm_nab5000.o +-rw-r--r-- 1 root root 211 2010-10-05 03:23 .tcm_nab5000.o.cmd + +*) Load the new module, create a lun_0 configfs group, and add new TCM Core + IBLOCK backstore symlink to port: + +target:/mnt/sdb/lio-core-2.6.git# insmod drivers/target/tcm_nab5000.ko +target:/mnt/sdb/lio-core-2.6.git# mkdir -p /sys/kernel/config/target/nab5000/iqn.foo/tpgt_1/lun/lun_0 +target:/mnt/sdb/lio-core-2.6.git# cd /sys/kernel/config/target/nab5000/iqn.foo/tpgt_1/lun/lun_0/ +target:/sys/kernel/config/target/nab5000/iqn.foo/tpgt_1/lun/lun_0# ln -s /sys/kernel/config/target/core/iblock_0/lvm_test0 nab5000_port + +target:/sys/kernel/config/target/nab5000/iqn.foo/tpgt_1/lun/lun_0# cd - +target:/mnt/sdb/lio-core-2.6.git# tree /sys/kernel/config/target/nab5000/ +/sys/kernel/config/target/nab5000/ +|-- discovery_auth +|-- iqn.foo +| `-- tpgt_1 +| |-- acls +| |-- attrib +| |-- lun +| | `-- lun_0 +| | |-- alua_tg_pt_gp +| | |-- alua_tg_pt_offline +| | |-- alua_tg_pt_status +| | |-- alua_tg_pt_write_md +| | `-- nab5000_port -> ../../../../../../target/core/iblock_0/lvm_test0 +| |-- np +| `-- param +`-- version + +target:/mnt/sdb/lio-core-2.6.git# lsmod +Module Size Used by +tcm_nab5000 3935 4 +iscsi_target_mod 193211 0 +target_core_stgt 8090 0 +target_core_pscsi 11122 1 +target_core_file 9172 2 +target_core_iblock 9280 1 +target_core_mod 228575 31 +tcm_nab5000,iscsi_target_mod,target_core_stgt,target_core_pscsi,target_core_file,target_core_iblock +libfc 73681 0 +scsi_debug 56265 0 +scsi_tgt 8666 1 target_core_stgt +configfs 20644 2 target_core_mod + +---------------------------------------------------------------------- + +Future TODO items: + + *) Add more T10 proto_idents + *) Make tcm_mod_dump_fabric_ops() smarter and generate function pointer + defs directly from include/target/target_core_fabric_ops.h:struct target_core_fabric_ops + structure members. + +October 5th, 2010 +Nicholas A. Bellinger diff --git a/drivers/Kconfig b/drivers/Kconfig index dd0a5b5e9bf3..9bfb71ff3a6a 100644 --- a/drivers/Kconfig +++ b/drivers/Kconfig @@ -26,6 +26,8 @@ source "drivers/ata/Kconfig" source "drivers/md/Kconfig" +source "drivers/target/Kconfig" + source "drivers/message/fusion/Kconfig" source "drivers/firewire/Kconfig" diff --git a/drivers/Makefile b/drivers/Makefile index ef5132469f58..7eb35f479461 100644 --- a/drivers/Makefile +++ b/drivers/Makefile @@ -46,6 +46,7 @@ obj-y += macintosh/ obj-$(CONFIG_IDE) += ide/ obj-$(CONFIG_SCSI) += scsi/ obj-$(CONFIG_ATA) += ata/ +obj-$(CONFIG_TARGET_CORE) += target/ obj-$(CONFIG_MTD) += mtd/ obj-$(CONFIG_SPI) += spi/ obj-y += net/ diff --git a/drivers/target/Kconfig b/drivers/target/Kconfig new file mode 100644 index 000000000000..2fac3be209ac --- /dev/null +++ b/drivers/target/Kconfig @@ -0,0 +1,32 @@ + +menuconfig TARGET_CORE + tristate "Generic Target Core Mod (TCM) and ConfigFS Infrastructure" + depends on SCSI && BLOCK + select CONFIGFS_FS + default n + help + Say Y or M here to enable the TCM Storage Engine and ConfigFS enabled + control path for target_core_mod. This includes built-in TCM RAMDISK + subsystem logic for virtual LUN 0 access + +if TARGET_CORE + +config TCM_IBLOCK + tristate "TCM/IBLOCK Subsystem Plugin for Linux/BLOCK" + help + Say Y here to enable the TCM/IBLOCK subsystem plugin for non-buffered + access to Linux/Block devices using BIO + +config TCM_FILEIO + tristate "TCM/FILEIO Subsystem Plugin for Linux/VFS" + help + Say Y here to enable the TCM/FILEIO subsystem plugin for buffered + access to Linux/VFS struct file or struct block_device + +config TCM_PSCSI + tristate "TCM/pSCSI Subsystem Plugin for Linux/SCSI" + help + Say Y here to enable the TCM/pSCSI subsystem plugin for non-buffered + passthrough access to Linux/SCSI device + +endif diff --git a/drivers/target/Makefile b/drivers/target/Makefile new file mode 100644 index 000000000000..5cfd70819f08 --- /dev/null +++ b/drivers/target/Makefile @@ -0,0 +1,24 @@ +EXTRA_CFLAGS += -I$(srctree)/drivers/target/ -I$(srctree)/drivers/scsi/ + +target_core_mod-y := target_core_configfs.o \ + target_core_device.o \ + target_core_fabric_configfs.o \ + target_core_fabric_lib.o \ + target_core_hba.o \ + target_core_pr.o \ + target_core_alua.o \ + target_core_scdb.o \ + target_core_tmr.o \ + target_core_tpg.o \ + target_core_transport.o \ + target_core_cdb.o \ + target_core_ua.o \ + target_core_rd.o \ + target_core_mib.o + +obj-$(CONFIG_TARGET_CORE) += target_core_mod.o + +# Subsystem modules +obj-$(CONFIG_TCM_IBLOCK) += target_core_iblock.o +obj-$(CONFIG_TCM_FILEIO) += target_core_file.o +obj-$(CONFIG_TCM_PSCSI) += target_core_pscsi.o diff --git a/drivers/target/target_core_alua.c b/drivers/target/target_core_alua.c new file mode 100644 index 000000000000..2c5fcfed5934 --- /dev/null +++ b/drivers/target/target_core_alua.c @@ -0,0 +1,1991 @@ +/******************************************************************************* + * Filename: target_core_alua.c + * + * This file contains SPC-3 compliant asymmetric logical unit assigntment (ALUA) + * + * Copyright (c) 2009-2010 Rising Tide Systems + * Copyright (c) 2009-2010 Linux-iSCSI.org + * + * Nicholas A. Bellinger + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. + * + ******************************************************************************/ + +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include + +#include "target_core_alua.h" +#include "target_core_hba.h" +#include "target_core_ua.h" + +static int core_alua_check_transition(int state, int *primary); +static int core_alua_set_tg_pt_secondary_state( + struct t10_alua_tg_pt_gp_member *tg_pt_gp_mem, + struct se_port *port, int explict, int offline); + +/* + * REPORT_TARGET_PORT_GROUPS + * + * See spc4r17 section 6.27 + */ +int core_emulate_report_target_port_groups(struct se_cmd *cmd) +{ + struct se_subsystem_dev *su_dev = SE_DEV(cmd)->se_sub_dev; + struct se_port *port; + struct t10_alua_tg_pt_gp *tg_pt_gp; + struct t10_alua_tg_pt_gp_member *tg_pt_gp_mem; + unsigned char *buf = (unsigned char *)T_TASK(cmd)->t_task_buf; + u32 rd_len = 0, off = 4; /* Skip over RESERVED area to first + Target port group descriptor */ + + spin_lock(&T10_ALUA(su_dev)->tg_pt_gps_lock); + list_for_each_entry(tg_pt_gp, &T10_ALUA(su_dev)->tg_pt_gps_list, + tg_pt_gp_list) { + /* + * PREF: Preferred target port bit, determine if this + * bit should be set for port group. + */ + if (tg_pt_gp->tg_pt_gp_pref) + buf[off] = 0x80; + /* + * Set the ASYMMETRIC ACCESS State + */ + buf[off++] |= (atomic_read( + &tg_pt_gp->tg_pt_gp_alua_access_state) & 0xff); + /* + * Set supported ASYMMETRIC ACCESS State bits + */ + buf[off] = 0x80; /* T_SUP */ + buf[off] |= 0x40; /* O_SUP */ + buf[off] |= 0x8; /* U_SUP */ + buf[off] |= 0x4; /* S_SUP */ + buf[off] |= 0x2; /* AN_SUP */ + buf[off++] |= 0x1; /* AO_SUP */ + /* + * TARGET PORT GROUP + */ + buf[off++] = ((tg_pt_gp->tg_pt_gp_id >> 8) & 0xff); + buf[off++] = (tg_pt_gp->tg_pt_gp_id & 0xff); + + off++; /* Skip over Reserved */ + /* + * STATUS CODE + */ + buf[off++] = (tg_pt_gp->tg_pt_gp_alua_access_status & 0xff); + /* + * Vendor Specific field + */ + buf[off++] = 0x00; + /* + * TARGET PORT COUNT + */ + buf[off++] = (tg_pt_gp->tg_pt_gp_members & 0xff); + rd_len += 8; + + spin_lock(&tg_pt_gp->tg_pt_gp_lock); + list_for_each_entry(tg_pt_gp_mem, &tg_pt_gp->tg_pt_gp_mem_list, + tg_pt_gp_mem_list) { + port = tg_pt_gp_mem->tg_pt; + /* + * Start Target Port descriptor format + * + * See spc4r17 section 6.2.7 Table 247 + */ + off += 2; /* Skip over Obsolete */ + /* + * Set RELATIVE TARGET PORT IDENTIFIER + */ + buf[off++] = ((port->sep_rtpi >> 8) & 0xff); + buf[off++] = (port->sep_rtpi & 0xff); + rd_len += 4; + } + spin_unlock(&tg_pt_gp->tg_pt_gp_lock); + } + spin_unlock(&T10_ALUA(su_dev)->tg_pt_gps_lock); + /* + * Set the RETURN DATA LENGTH set in the header of the DataIN Payload + */ + buf[0] = ((rd_len >> 24) & 0xff); + buf[1] = ((rd_len >> 16) & 0xff); + buf[2] = ((rd_len >> 8) & 0xff); + buf[3] = (rd_len & 0xff); + + return 0; +} + +/* + * SET_TARGET_PORT_GROUPS for explict ALUA operation. + * + * See spc4r17 section 6.35 + */ +int core_emulate_set_target_port_groups(struct se_cmd *cmd) +{ + struct se_device *dev = SE_DEV(cmd); + struct se_subsystem_dev *su_dev = SE_DEV(cmd)->se_sub_dev; + struct se_port *port, *l_port = SE_LUN(cmd)->lun_sep; + struct se_node_acl *nacl = SE_SESS(cmd)->se_node_acl; + struct t10_alua_tg_pt_gp *tg_pt_gp = NULL, *l_tg_pt_gp; + struct t10_alua_tg_pt_gp_member *tg_pt_gp_mem, *l_tg_pt_gp_mem; + unsigned char *buf = (unsigned char *)T_TASK(cmd)->t_task_buf; + unsigned char *ptr = &buf[4]; /* Skip over RESERVED area in header */ + u32 len = 4; /* Skip over RESERVED area in header */ + int alua_access_state, primary = 0, rc; + u16 tg_pt_id, rtpi; + + if (!(l_port)) + return PYX_TRANSPORT_LU_COMM_FAILURE; + /* + * Determine if explict ALUA via SET_TARGET_PORT_GROUPS is allowed + * for the local tg_pt_gp. + */ + l_tg_pt_gp_mem = l_port->sep_alua_tg_pt_gp_mem; + if (!(l_tg_pt_gp_mem)) { + printk(KERN_ERR "Unable to access l_port->sep_alua_tg_pt_gp_mem\n"); + return PYX_TRANSPORT_UNKNOWN_SAM_OPCODE; + } + spin_lock(&l_tg_pt_gp_mem->tg_pt_gp_mem_lock); + l_tg_pt_gp = l_tg_pt_gp_mem->tg_pt_gp; + if (!(l_tg_pt_gp)) { + spin_unlock(&l_tg_pt_gp_mem->tg_pt_gp_mem_lock); + printk(KERN_ERR "Unable to access *l_tg_pt_gp_mem->tg_pt_gp\n"); + return PYX_TRANSPORT_UNKNOWN_SAM_OPCODE; + } + rc = (l_tg_pt_gp->tg_pt_gp_alua_access_type & TPGS_EXPLICT_ALUA); + spin_unlock(&l_tg_pt_gp_mem->tg_pt_gp_mem_lock); + + if (!(rc)) { + printk(KERN_INFO "Unable to process SET_TARGET_PORT_GROUPS" + " while TPGS_EXPLICT_ALUA is disabled\n"); + return PYX_TRANSPORT_UNKNOWN_SAM_OPCODE; + } + + while (len < cmd->data_length) { + alua_access_state = (ptr[0] & 0x0f); + /* + * Check the received ALUA access state, and determine if + * the state is a primary or secondary target port asymmetric + * access state. + */ + rc = core_alua_check_transition(alua_access_state, &primary); + if (rc != 0) { + /* + * If the SET TARGET PORT GROUPS attempts to establish + * an invalid combination of target port asymmetric + * access states or attempts to establish an + * unsupported target port asymmetric access state, + * then the command shall be terminated with CHECK + * CONDITION status, with the sense key set to ILLEGAL + * REQUEST, and the additional sense code set to INVALID + * FIELD IN PARAMETER LIST. + */ + return PYX_TRANSPORT_INVALID_PARAMETER_LIST; + } + rc = -1; + /* + * If the ASYMMETRIC ACCESS STATE field (see table 267) + * specifies a primary target port asymmetric access state, + * then the TARGET PORT GROUP OR TARGET PORT field specifies + * a primary target port group for which the primary target + * port asymmetric access state shall be changed. If the + * ASYMMETRIC ACCESS STATE field specifies a secondary target + * port asymmetric access state, then the TARGET PORT GROUP OR + * TARGET PORT field specifies the relative target port + * identifier (see 3.1.120) of the target port for which the + * secondary target port asymmetric access state shall be + * changed. + */ + if (primary) { + tg_pt_id = ((ptr[2] << 8) & 0xff); + tg_pt_id |= (ptr[3] & 0xff); + /* + * Locate the matching target port group ID from + * the global tg_pt_gp list + */ + spin_lock(&T10_ALUA(su_dev)->tg_pt_gps_lock); + list_for_each_entry(tg_pt_gp, + &T10_ALUA(su_dev)->tg_pt_gps_list, + tg_pt_gp_list) { + if (!(tg_pt_gp->tg_pt_gp_valid_id)) + continue; + + if (tg_pt_id != tg_pt_gp->tg_pt_gp_id) + continue; + + atomic_inc(&tg_pt_gp->tg_pt_gp_ref_cnt); + smp_mb__after_atomic_inc(); + spin_unlock(&T10_ALUA(su_dev)->tg_pt_gps_lock); + + rc = core_alua_do_port_transition(tg_pt_gp, + dev, l_port, nacl, + alua_access_state, 1); + + spin_lock(&T10_ALUA(su_dev)->tg_pt_gps_lock); + atomic_dec(&tg_pt_gp->tg_pt_gp_ref_cnt); + smp_mb__after_atomic_dec(); + break; + } + spin_unlock(&T10_ALUA(su_dev)->tg_pt_gps_lock); + /* + * If not matching target port group ID can be located + * throw an exception with ASCQ: INVALID_PARAMETER_LIST + */ + if (rc != 0) + return PYX_TRANSPORT_INVALID_PARAMETER_LIST; + } else { + /* + * Extact the RELATIVE TARGET PORT IDENTIFIER to identify + * the Target Port in question for the the incoming + * SET_TARGET_PORT_GROUPS op. + */ + rtpi = ((ptr[2] << 8) & 0xff); + rtpi |= (ptr[3] & 0xff); + /* + * Locate the matching relative target port identifer + * for the struct se_device storage object. + */ + spin_lock(&dev->se_port_lock); + list_for_each_entry(port, &dev->dev_sep_list, + sep_list) { + if (port->sep_rtpi != rtpi) + continue; + + tg_pt_gp_mem = port->sep_alua_tg_pt_gp_mem; + spin_unlock(&dev->se_port_lock); + + rc = core_alua_set_tg_pt_secondary_state( + tg_pt_gp_mem, port, 1, 1); + + spin_lock(&dev->se_port_lock); + break; + } + spin_unlock(&dev->se_port_lock); + /* + * If not matching relative target port identifier can + * be located, throw an exception with ASCQ: + * INVALID_PARAMETER_LIST + */ + if (rc != 0) + return PYX_TRANSPORT_INVALID_PARAMETER_LIST; + } + + ptr += 4; + len += 4; + } + + return 0; +} + +static inline int core_alua_state_nonoptimized( + struct se_cmd *cmd, + unsigned char *cdb, + int nonop_delay_msecs, + u8 *alua_ascq) +{ + /* + * Set SCF_ALUA_NON_OPTIMIZED here, this value will be checked + * later to determine if processing of this cmd needs to be + * temporarily delayed for the Active/NonOptimized primary access state. + */ + cmd->se_cmd_flags |= SCF_ALUA_NON_OPTIMIZED; + cmd->alua_nonop_delay = nonop_delay_msecs; + return 0; +} + +static inline int core_alua_state_standby( + struct se_cmd *cmd, + unsigned char *cdb, + u8 *alua_ascq) +{ + /* + * Allowed CDBs for ALUA_ACCESS_STATE_STANDBY as defined by + * spc4r17 section 5.9.2.4.4 + */ + switch (cdb[0]) { + case INQUIRY: + case LOG_SELECT: + case LOG_SENSE: + case MODE_SELECT: + case MODE_SENSE: + case REPORT_LUNS: + case RECEIVE_DIAGNOSTIC: + case SEND_DIAGNOSTIC: + case MAINTENANCE_IN: + switch (cdb[1]) { + case MI_REPORT_TARGET_PGS: + return 0; + default: + *alua_ascq = ASCQ_04H_ALUA_TG_PT_STANDBY; + return 1; + } + case MAINTENANCE_OUT: + switch (cdb[1]) { + case MO_SET_TARGET_PGS: + return 0; + default: + *alua_ascq = ASCQ_04H_ALUA_TG_PT_STANDBY; + return 1; + } + case REQUEST_SENSE: + case PERSISTENT_RESERVE_IN: + case PERSISTENT_RESERVE_OUT: + case READ_BUFFER: + case WRITE_BUFFER: + return 0; + default: + *alua_ascq = ASCQ_04H_ALUA_TG_PT_STANDBY; + return 1; + } + + return 0; +} + +static inline int core_alua_state_unavailable( + struct se_cmd *cmd, + unsigned char *cdb, + u8 *alua_ascq) +{ + /* + * Allowed CDBs for ALUA_ACCESS_STATE_UNAVAILABLE as defined by + * spc4r17 section 5.9.2.4.5 + */ + switch (cdb[0]) { + case INQUIRY: + case REPORT_LUNS: + case MAINTENANCE_IN: + switch (cdb[1]) { + case MI_REPORT_TARGET_PGS: + return 0; + default: + *alua_ascq = ASCQ_04H_ALUA_TG_PT_UNAVAILABLE; + return 1; + } + case MAINTENANCE_OUT: + switch (cdb[1]) { + case MO_SET_TARGET_PGS: + return 0; + default: + *alua_ascq = ASCQ_04H_ALUA_TG_PT_UNAVAILABLE; + return 1; + } + case REQUEST_SENSE: + case READ_BUFFER: + case WRITE_BUFFER: + return 0; + default: + *alua_ascq = ASCQ_04H_ALUA_TG_PT_UNAVAILABLE; + return 1; + } + + return 0; +} + +static inline int core_alua_state_transition( + struct se_cmd *cmd, + unsigned char *cdb, + u8 *alua_ascq) +{ + /* + * Allowed CDBs for ALUA_ACCESS_STATE_TRANSITIO as defined by + * spc4r17 section 5.9.2.5 + */ + switch (cdb[0]) { + case INQUIRY: + case REPORT_LUNS: + case MAINTENANCE_IN: + switch (cdb[1]) { + case MI_REPORT_TARGET_PGS: + return 0; + default: + *alua_ascq = ASCQ_04H_ALUA_STATE_TRANSITION; + return 1; + } + case REQUEST_SENSE: + case READ_BUFFER: + case WRITE_BUFFER: + return 0; + default: + *alua_ascq = ASCQ_04H_ALUA_STATE_TRANSITION; + return 1; + } + + return 0; +} + +/* + * Used for alua_type SPC_ALUA_PASSTHROUGH and SPC2_ALUA_DISABLED + * in transport_cmd_sequencer(). This function is assigned to + * struct t10_alua *->state_check() in core_setup_alua() + */ +static int core_alua_state_check_nop( + struct se_cmd *cmd, + unsigned char *cdb, + u8 *alua_ascq) +{ + return 0; +} + +/* + * Used for alua_type SPC3_ALUA_EMULATED in transport_cmd_sequencer(). + * This function is assigned to struct t10_alua *->state_check() in + * core_setup_alua() + * + * Also, this function can return three different return codes to + * signal transport_generic_cmd_sequencer() + * + * return 1: Is used to signal LUN not accecsable, and check condition/not ready + * return 0: Used to signal success + * reutrn -1: Used to signal failure, and invalid cdb field + */ +static int core_alua_state_check( + struct se_cmd *cmd, + unsigned char *cdb, + u8 *alua_ascq) +{ + struct se_lun *lun = SE_LUN(cmd); + struct se_port *port = lun->lun_sep; + struct t10_alua_tg_pt_gp *tg_pt_gp; + struct t10_alua_tg_pt_gp_member *tg_pt_gp_mem; + int out_alua_state, nonop_delay_msecs; + + if (!(port)) + return 0; + /* + * First, check for a struct se_port specific secondary ALUA target port + * access state: OFFLINE + */ + if (atomic_read(&port->sep_tg_pt_secondary_offline)) { + *alua_ascq = ASCQ_04H_ALUA_OFFLINE; + printk(KERN_INFO "ALUA: Got secondary offline status for local" + " target port\n"); + *alua_ascq = ASCQ_04H_ALUA_OFFLINE; + return 1; + } + /* + * Second, obtain the struct t10_alua_tg_pt_gp_member pointer to the + * ALUA target port group, to obtain current ALUA access state. + * Otherwise look for the underlying struct se_device association with + * a ALUA logical unit group. + */ + tg_pt_gp_mem = port->sep_alua_tg_pt_gp_mem; + spin_lock(&tg_pt_gp_mem->tg_pt_gp_mem_lock); + tg_pt_gp = tg_pt_gp_mem->tg_pt_gp; + out_alua_state = atomic_read(&tg_pt_gp->tg_pt_gp_alua_access_state); + nonop_delay_msecs = tg_pt_gp->tg_pt_gp_nonop_delay_msecs; + spin_unlock(&tg_pt_gp_mem->tg_pt_gp_mem_lock); + /* + * Process ALUA_ACCESS_STATE_ACTIVE_OPTMIZED in a seperate conditional + * statement so the complier knows explictly to check this case first. + * For the Optimized ALUA access state case, we want to process the + * incoming fabric cmd ASAP.. + */ + if (out_alua_state == ALUA_ACCESS_STATE_ACTIVE_OPTMIZED) + return 0; + + switch (out_alua_state) { + case ALUA_ACCESS_STATE_ACTIVE_NON_OPTIMIZED: + return core_alua_state_nonoptimized(cmd, cdb, + nonop_delay_msecs, alua_ascq); + case ALUA_ACCESS_STATE_STANDBY: + return core_alua_state_standby(cmd, cdb, alua_ascq); + case ALUA_ACCESS_STATE_UNAVAILABLE: + return core_alua_state_unavailable(cmd, cdb, alua_ascq); + case ALUA_ACCESS_STATE_TRANSITION: + return core_alua_state_transition(cmd, cdb, alua_ascq); + /* + * OFFLINE is a secondary ALUA target port group access state, that is + * handled above with struct se_port->sep_tg_pt_secondary_offline=1 + */ + case ALUA_ACCESS_STATE_OFFLINE: + default: + printk(KERN_ERR "Unknown ALUA access state: 0x%02x\n", + out_alua_state); + return -1; + } + + return 0; +} + +/* + * Check implict and explict ALUA state change request. + */ +static int core_alua_check_transition(int state, int *primary) +{ + switch (state) { + case ALUA_ACCESS_STATE_ACTIVE_OPTMIZED: + case ALUA_ACCESS_STATE_ACTIVE_NON_OPTIMIZED: + case ALUA_ACCESS_STATE_STANDBY: + case ALUA_ACCESS_STATE_UNAVAILABLE: + /* + * OPTIMIZED, NON-OPTIMIZED, STANDBY and UNAVAILABLE are + * defined as primary target port asymmetric access states. + */ + *primary = 1; + break; + case ALUA_ACCESS_STATE_OFFLINE: + /* + * OFFLINE state is defined as a secondary target port + * asymmetric access state. + */ + *primary = 0; + break; + default: + printk(KERN_ERR "Unknown ALUA access state: 0x%02x\n", state); + return -1; + } + + return 0; +} + +static char *core_alua_dump_state(int state) +{ + switch (state) { + case ALUA_ACCESS_STATE_ACTIVE_OPTMIZED: + return "Active/Optimized"; + case ALUA_ACCESS_STATE_ACTIVE_NON_OPTIMIZED: + return "Active/NonOptimized"; + case ALUA_ACCESS_STATE_STANDBY: + return "Standby"; + case ALUA_ACCESS_STATE_UNAVAILABLE: + return "Unavailable"; + case ALUA_ACCESS_STATE_OFFLINE: + return "Offline"; + default: + return "Unknown"; + } + + return NULL; +} + +char *core_alua_dump_status(int status) +{ + switch (status) { + case ALUA_STATUS_NONE: + return "None"; + case ALUA_STATUS_ALTERED_BY_EXPLICT_STPG: + return "Altered by Explict STPG"; + case ALUA_STATUS_ALTERED_BY_IMPLICT_ALUA: + return "Altered by Implict ALUA"; + default: + return "Unknown"; + } + + return NULL; +} + +/* + * Used by fabric modules to determine when we need to delay processing + * for the Active/NonOptimized paths.. + */ +int core_alua_check_nonop_delay( + struct se_cmd *cmd) +{ + if (!(cmd->se_cmd_flags & SCF_ALUA_NON_OPTIMIZED)) + return 0; + if (in_interrupt()) + return 0; + /* + * The ALUA Active/NonOptimized access state delay can be disabled + * in via configfs with a value of zero + */ + if (!(cmd->alua_nonop_delay)) + return 0; + /* + * struct se_cmd->alua_nonop_delay gets set by a target port group + * defined interval in core_alua_state_nonoptimized() + */ + msleep_interruptible(cmd->alua_nonop_delay); + return 0; +} +EXPORT_SYMBOL(core_alua_check_nonop_delay); + +/* + * Called with tg_pt_gp->tg_pt_gp_md_mutex or tg_pt_gp_mem->sep_tg_pt_md_mutex + * + */ +static int core_alua_write_tpg_metadata( + const char *path, + unsigned char *md_buf, + u32 md_buf_len) +{ + mm_segment_t old_fs; + struct file *file; + struct iovec iov[1]; + int flags = O_RDWR | O_CREAT | O_TRUNC, ret; + + memset(iov, 0, sizeof(struct iovec)); + + file = filp_open(path, flags, 0600); + if (IS_ERR(file) || !file || !file->f_dentry) { + printk(KERN_ERR "filp_open(%s) for ALUA metadata failed\n", + path); + return -ENODEV; + } + + iov[0].iov_base = &md_buf[0]; + iov[0].iov_len = md_buf_len; + + old_fs = get_fs(); + set_fs(get_ds()); + ret = vfs_writev(file, &iov[0], 1, &file->f_pos); + set_fs(old_fs); + + if (ret < 0) { + printk(KERN_ERR "Error writing ALUA metadata file: %s\n", path); + filp_close(file, NULL); + return -EIO; + } + filp_close(file, NULL); + + return 0; +} + +/* + * Called with tg_pt_gp->tg_pt_gp_md_mutex held + */ +static int core_alua_update_tpg_primary_metadata( + struct t10_alua_tg_pt_gp *tg_pt_gp, + int primary_state, + unsigned char *md_buf) +{ + struct se_subsystem_dev *su_dev = tg_pt_gp->tg_pt_gp_su_dev; + struct t10_wwn *wwn = &su_dev->t10_wwn; + char path[ALUA_METADATA_PATH_LEN]; + int len; + + memset(path, 0, ALUA_METADATA_PATH_LEN); + + len = snprintf(md_buf, tg_pt_gp->tg_pt_gp_md_buf_len, + "tg_pt_gp_id=%hu\n" + "alua_access_state=0x%02x\n" + "alua_access_status=0x%02x\n", + tg_pt_gp->tg_pt_gp_id, primary_state, + tg_pt_gp->tg_pt_gp_alua_access_status); + + snprintf(path, ALUA_METADATA_PATH_LEN, + "/var/target/alua/tpgs_%s/%s", &wwn->unit_serial[0], + config_item_name(&tg_pt_gp->tg_pt_gp_group.cg_item)); + + return core_alua_write_tpg_metadata(path, md_buf, len); +} + +static int core_alua_do_transition_tg_pt( + struct t10_alua_tg_pt_gp *tg_pt_gp, + struct se_port *l_port, + struct se_node_acl *nacl, + unsigned char *md_buf, + int new_state, + int explict) +{ + struct se_dev_entry *se_deve; + struct se_lun_acl *lacl; + struct se_port *port; + struct t10_alua_tg_pt_gp_member *mem; + int old_state = 0; + /* + * Save the old primary ALUA access state, and set the current state + * to ALUA_ACCESS_STATE_TRANSITION. + */ + old_state = atomic_read(&tg_pt_gp->tg_pt_gp_alua_access_state); + atomic_set(&tg_pt_gp->tg_pt_gp_alua_access_state, + ALUA_ACCESS_STATE_TRANSITION); + tg_pt_gp->tg_pt_gp_alua_access_status = (explict) ? + ALUA_STATUS_ALTERED_BY_EXPLICT_STPG : + ALUA_STATUS_ALTERED_BY_IMPLICT_ALUA; + /* + * Check for the optional ALUA primary state transition delay + */ + if (tg_pt_gp->tg_pt_gp_trans_delay_msecs != 0) + msleep_interruptible(tg_pt_gp->tg_pt_gp_trans_delay_msecs); + + spin_lock(&tg_pt_gp->tg_pt_gp_lock); + list_for_each_entry(mem, &tg_pt_gp->tg_pt_gp_mem_list, + tg_pt_gp_mem_list) { + port = mem->tg_pt; + /* + * After an implicit target port asymmetric access state + * change, a device server shall establish a unit attention + * condition for the initiator port associated with every I_T + * nexus with the additional sense code set to ASYMMETRIC + * ACCESS STATE CHAGED. + * + * After an explicit target port asymmetric access state + * change, a device server shall establish a unit attention + * condition with the additional sense code set to ASYMMETRIC + * ACCESS STATE CHANGED for the initiator port associated with + * every I_T nexus other than the I_T nexus on which the SET + * TARGET PORT GROUPS command + */ + atomic_inc(&mem->tg_pt_gp_mem_ref_cnt); + smp_mb__after_atomic_inc(); + spin_unlock(&tg_pt_gp->tg_pt_gp_lock); + + spin_lock_bh(&port->sep_alua_lock); + list_for_each_entry(se_deve, &port->sep_alua_list, + alua_port_list) { + lacl = se_deve->se_lun_acl; + /* + * se_deve->se_lun_acl pointer may be NULL for a + * entry created without explict Node+MappedLUN ACLs + */ + if (!(lacl)) + continue; + + if (explict && + (nacl != NULL) && (nacl == lacl->se_lun_nacl) && + (l_port != NULL) && (l_port == port)) + continue; + + core_scsi3_ua_allocate(lacl->se_lun_nacl, + se_deve->mapped_lun, 0x2A, + ASCQ_2AH_ASYMMETRIC_ACCESS_STATE_CHANGED); + } + spin_unlock_bh(&port->sep_alua_lock); + + spin_lock(&tg_pt_gp->tg_pt_gp_lock); + atomic_dec(&mem->tg_pt_gp_mem_ref_cnt); + smp_mb__after_atomic_dec(); + } + spin_unlock(&tg_pt_gp->tg_pt_gp_lock); + /* + * Update the ALUA metadata buf that has been allocated in + * core_alua_do_port_transition(), this metadata will be written + * to struct file. + * + * Note that there is the case where we do not want to update the + * metadata when the saved metadata is being parsed in userspace + * when setting the existing port access state and access status. + * + * Also note that the failure to write out the ALUA metadata to + * struct file does NOT affect the actual ALUA transition. + */ + if (tg_pt_gp->tg_pt_gp_write_metadata) { + mutex_lock(&tg_pt_gp->tg_pt_gp_md_mutex); + core_alua_update_tpg_primary_metadata(tg_pt_gp, + new_state, md_buf); + mutex_unlock(&tg_pt_gp->tg_pt_gp_md_mutex); + } + /* + * Set the current primary ALUA access state to the requested new state + */ + atomic_set(&tg_pt_gp->tg_pt_gp_alua_access_state, new_state); + + printk(KERN_INFO "Successful %s ALUA transition TG PT Group: %s ID: %hu" + " from primary access state %s to %s\n", (explict) ? "explict" : + "implict", config_item_name(&tg_pt_gp->tg_pt_gp_group.cg_item), + tg_pt_gp->tg_pt_gp_id, core_alua_dump_state(old_state), + core_alua_dump_state(new_state)); + + return 0; +} + +int core_alua_do_port_transition( + struct t10_alua_tg_pt_gp *l_tg_pt_gp, + struct se_device *l_dev, + struct se_port *l_port, + struct se_node_acl *l_nacl, + int new_state, + int explict) +{ + struct se_device *dev; + struct se_port *port; + struct se_subsystem_dev *su_dev; + struct se_node_acl *nacl; + struct t10_alua_lu_gp *lu_gp; + struct t10_alua_lu_gp_member *lu_gp_mem, *local_lu_gp_mem; + struct t10_alua_tg_pt_gp *tg_pt_gp; + unsigned char *md_buf; + int primary; + + if (core_alua_check_transition(new_state, &primary) != 0) + return -EINVAL; + + md_buf = kzalloc(l_tg_pt_gp->tg_pt_gp_md_buf_len, GFP_KERNEL); + if (!(md_buf)) { + printk("Unable to allocate buf for ALUA metadata\n"); + return -ENOMEM; + } + + local_lu_gp_mem = l_dev->dev_alua_lu_gp_mem; + spin_lock(&local_lu_gp_mem->lu_gp_mem_lock); + lu_gp = local_lu_gp_mem->lu_gp; + atomic_inc(&lu_gp->lu_gp_ref_cnt); + smp_mb__after_atomic_inc(); + spin_unlock(&local_lu_gp_mem->lu_gp_mem_lock); + /* + * For storage objects that are members of the 'default_lu_gp', + * we only do transition on the passed *l_tp_pt_gp, and not + * on all of the matching target port groups IDs in default_lu_gp. + */ + if (!(lu_gp->lu_gp_id)) { + /* + * core_alua_do_transition_tg_pt() will always return + * success. + */ + core_alua_do_transition_tg_pt(l_tg_pt_gp, l_port, l_nacl, + md_buf, new_state, explict); + atomic_dec(&lu_gp->lu_gp_ref_cnt); + smp_mb__after_atomic_dec(); + kfree(md_buf); + return 0; + } + /* + * For all other LU groups aside from 'default_lu_gp', walk all of + * the associated storage objects looking for a matching target port + * group ID from the local target port group. + */ + spin_lock(&lu_gp->lu_gp_lock); + list_for_each_entry(lu_gp_mem, &lu_gp->lu_gp_mem_list, + lu_gp_mem_list) { + + dev = lu_gp_mem->lu_gp_mem_dev; + su_dev = dev->se_sub_dev; + atomic_inc(&lu_gp_mem->lu_gp_mem_ref_cnt); + smp_mb__after_atomic_inc(); + spin_unlock(&lu_gp->lu_gp_lock); + + spin_lock(&T10_ALUA(su_dev)->tg_pt_gps_lock); + list_for_each_entry(tg_pt_gp, + &T10_ALUA(su_dev)->tg_pt_gps_list, + tg_pt_gp_list) { + + if (!(tg_pt_gp->tg_pt_gp_valid_id)) + continue; + /* + * If the target behavior port asymmetric access state + * is changed for any target port group accessiable via + * a logical unit within a LU group, the target port + * behavior group asymmetric access states for the same + * target port group accessible via other logical units + * in that LU group will also change. + */ + if (l_tg_pt_gp->tg_pt_gp_id != tg_pt_gp->tg_pt_gp_id) + continue; + + if (l_tg_pt_gp == tg_pt_gp) { + port = l_port; + nacl = l_nacl; + } else { + port = NULL; + nacl = NULL; + } + atomic_inc(&tg_pt_gp->tg_pt_gp_ref_cnt); + smp_mb__after_atomic_inc(); + spin_unlock(&T10_ALUA(su_dev)->tg_pt_gps_lock); + /* + * core_alua_do_transition_tg_pt() will always return + * success. + */ + core_alua_do_transition_tg_pt(tg_pt_gp, port, + nacl, md_buf, new_state, explict); + + spin_lock(&T10_ALUA(su_dev)->tg_pt_gps_lock); + atomic_dec(&tg_pt_gp->tg_pt_gp_ref_cnt); + smp_mb__after_atomic_dec(); + } + spin_unlock(&T10_ALUA(su_dev)->tg_pt_gps_lock); + + spin_lock(&lu_gp->lu_gp_lock); + atomic_dec(&lu_gp_mem->lu_gp_mem_ref_cnt); + smp_mb__after_atomic_dec(); + } + spin_unlock(&lu_gp->lu_gp_lock); + + printk(KERN_INFO "Successfully processed LU Group: %s all ALUA TG PT" + " Group IDs: %hu %s transition to primary state: %s\n", + config_item_name(&lu_gp->lu_gp_group.cg_item), + l_tg_pt_gp->tg_pt_gp_id, (explict) ? "explict" : "implict", + core_alua_dump_state(new_state)); + + atomic_dec(&lu_gp->lu_gp_ref_cnt); + smp_mb__after_atomic_dec(); + kfree(md_buf); + return 0; +} + +/* + * Called with tg_pt_gp_mem->sep_tg_pt_md_mutex held + */ +static int core_alua_update_tpg_secondary_metadata( + struct t10_alua_tg_pt_gp_member *tg_pt_gp_mem, + struct se_port *port, + unsigned char *md_buf, + u32 md_buf_len) +{ + struct se_portal_group *se_tpg = port->sep_tpg; + char path[ALUA_METADATA_PATH_LEN], wwn[ALUA_SECONDARY_METADATA_WWN_LEN]; + int len; + + memset(path, 0, ALUA_METADATA_PATH_LEN); + memset(wwn, 0, ALUA_SECONDARY_METADATA_WWN_LEN); + + len = snprintf(wwn, ALUA_SECONDARY_METADATA_WWN_LEN, "%s", + TPG_TFO(se_tpg)->tpg_get_wwn(se_tpg)); + + if (TPG_TFO(se_tpg)->tpg_get_tag != NULL) + snprintf(wwn+len, ALUA_SECONDARY_METADATA_WWN_LEN-len, "+%hu", + TPG_TFO(se_tpg)->tpg_get_tag(se_tpg)); + + len = snprintf(md_buf, md_buf_len, "alua_tg_pt_offline=%d\n" + "alua_tg_pt_status=0x%02x\n", + atomic_read(&port->sep_tg_pt_secondary_offline), + port->sep_tg_pt_secondary_stat); + + snprintf(path, ALUA_METADATA_PATH_LEN, "/var/target/alua/%s/%s/lun_%u", + TPG_TFO(se_tpg)->get_fabric_name(), wwn, + port->sep_lun->unpacked_lun); + + return core_alua_write_tpg_metadata(path, md_buf, len); +} + +static int core_alua_set_tg_pt_secondary_state( + struct t10_alua_tg_pt_gp_member *tg_pt_gp_mem, + struct se_port *port, + int explict, + int offline) +{ + struct t10_alua_tg_pt_gp *tg_pt_gp; + unsigned char *md_buf; + u32 md_buf_len; + int trans_delay_msecs; + + spin_lock(&tg_pt_gp_mem->tg_pt_gp_mem_lock); + tg_pt_gp = tg_pt_gp_mem->tg_pt_gp; + if (!(tg_pt_gp)) { + spin_unlock(&tg_pt_gp_mem->tg_pt_gp_mem_lock); + printk(KERN_ERR "Unable to complete secondary state" + " transition\n"); + return -1; + } + trans_delay_msecs = tg_pt_gp->tg_pt_gp_trans_delay_msecs; + /* + * Set the secondary ALUA target port access state to OFFLINE + * or release the previously secondary state for struct se_port + */ + if (offline) + atomic_set(&port->sep_tg_pt_secondary_offline, 1); + else + atomic_set(&port->sep_tg_pt_secondary_offline, 0); + + md_buf_len = tg_pt_gp->tg_pt_gp_md_buf_len; + port->sep_tg_pt_secondary_stat = (explict) ? + ALUA_STATUS_ALTERED_BY_EXPLICT_STPG : + ALUA_STATUS_ALTERED_BY_IMPLICT_ALUA; + + printk(KERN_INFO "Successful %s ALUA transition TG PT Group: %s ID: %hu" + " to secondary access state: %s\n", (explict) ? "explict" : + "implict", config_item_name(&tg_pt_gp->tg_pt_gp_group.cg_item), + tg_pt_gp->tg_pt_gp_id, (offline) ? "OFFLINE" : "ONLINE"); + + spin_unlock(&tg_pt_gp_mem->tg_pt_gp_mem_lock); + /* + * Do the optional transition delay after we set the secondary + * ALUA access state. + */ + if (trans_delay_msecs != 0) + msleep_interruptible(trans_delay_msecs); + /* + * See if we need to update the ALUA fabric port metadata for + * secondary state and status + */ + if (port->sep_tg_pt_secondary_write_md) { + md_buf = kzalloc(md_buf_len, GFP_KERNEL); + if (!(md_buf)) { + printk(KERN_ERR "Unable to allocate md_buf for" + " secondary ALUA access metadata\n"); + return -1; + } + mutex_lock(&port->sep_tg_pt_md_mutex); + core_alua_update_tpg_secondary_metadata(tg_pt_gp_mem, port, + md_buf, md_buf_len); + mutex_unlock(&port->sep_tg_pt_md_mutex); + + kfree(md_buf); + } + + return 0; +} + +struct t10_alua_lu_gp * +core_alua_allocate_lu_gp(const char *name, int def_group) +{ + struct t10_alua_lu_gp *lu_gp; + + lu_gp = kmem_cache_zalloc(t10_alua_lu_gp_cache, GFP_KERNEL); + if (!(lu_gp)) { + printk(KERN_ERR "Unable to allocate struct t10_alua_lu_gp\n"); + return ERR_PTR(-ENOMEM);; + } + INIT_LIST_HEAD(&lu_gp->lu_gp_list); + INIT_LIST_HEAD(&lu_gp->lu_gp_mem_list); + spin_lock_init(&lu_gp->lu_gp_lock); + atomic_set(&lu_gp->lu_gp_ref_cnt, 0); + + if (def_group) { + lu_gp->lu_gp_id = se_global->alua_lu_gps_counter++;; + lu_gp->lu_gp_valid_id = 1; + se_global->alua_lu_gps_count++; + } + + return lu_gp; +} + +int core_alua_set_lu_gp_id(struct t10_alua_lu_gp *lu_gp, u16 lu_gp_id) +{ + struct t10_alua_lu_gp *lu_gp_tmp; + u16 lu_gp_id_tmp; + /* + * The lu_gp->lu_gp_id may only be set once.. + */ + if (lu_gp->lu_gp_valid_id) { + printk(KERN_WARNING "ALUA LU Group already has a valid ID," + " ignoring request\n"); + return -1; + } + + spin_lock(&se_global->lu_gps_lock); + if (se_global->alua_lu_gps_count == 0x0000ffff) { + printk(KERN_ERR "Maximum ALUA se_global->alua_lu_gps_count:" + " 0x0000ffff reached\n"); + spin_unlock(&se_global->lu_gps_lock); + kmem_cache_free(t10_alua_lu_gp_cache, lu_gp); + return -1; + } +again: + lu_gp_id_tmp = (lu_gp_id != 0) ? lu_gp_id : + se_global->alua_lu_gps_counter++; + + list_for_each_entry(lu_gp_tmp, &se_global->g_lu_gps_list, lu_gp_list) { + if (lu_gp_tmp->lu_gp_id == lu_gp_id_tmp) { + if (!(lu_gp_id)) + goto again; + + printk(KERN_WARNING "ALUA Logical Unit Group ID: %hu" + " already exists, ignoring request\n", + lu_gp_id); + spin_unlock(&se_global->lu_gps_lock); + return -1; + } + } + + lu_gp->lu_gp_id = lu_gp_id_tmp; + lu_gp->lu_gp_valid_id = 1; + list_add_tail(&lu_gp->lu_gp_list, &se_global->g_lu_gps_list); + se_global->alua_lu_gps_count++; + spin_unlock(&se_global->lu_gps_lock); + + return 0; +} + +static struct t10_alua_lu_gp_member * +core_alua_allocate_lu_gp_mem(struct se_device *dev) +{ + struct t10_alua_lu_gp_member *lu_gp_mem; + + lu_gp_mem = kmem_cache_zalloc(t10_alua_lu_gp_mem_cache, GFP_KERNEL); + if (!(lu_gp_mem)) { + printk(KERN_ERR "Unable to allocate struct t10_alua_lu_gp_member\n"); + return ERR_PTR(-ENOMEM); + } + INIT_LIST_HEAD(&lu_gp_mem->lu_gp_mem_list); + spin_lock_init(&lu_gp_mem->lu_gp_mem_lock); + atomic_set(&lu_gp_mem->lu_gp_mem_ref_cnt, 0); + + lu_gp_mem->lu_gp_mem_dev = dev; + dev->dev_alua_lu_gp_mem = lu_gp_mem; + + return lu_gp_mem; +} + +void core_alua_free_lu_gp(struct t10_alua_lu_gp *lu_gp) +{ + struct t10_alua_lu_gp_member *lu_gp_mem, *lu_gp_mem_tmp; + /* + * Once we have reached this point, config_item_put() has + * already been called from target_core_alua_drop_lu_gp(). + * + * Here, we remove the *lu_gp from the global list so that + * no associations can be made while we are releasing + * struct t10_alua_lu_gp. + */ + spin_lock(&se_global->lu_gps_lock); + atomic_set(&lu_gp->lu_gp_shutdown, 1); + list_del(&lu_gp->lu_gp_list); + se_global->alua_lu_gps_count--; + spin_unlock(&se_global->lu_gps_lock); + /* + * Allow struct t10_alua_lu_gp * referenced by core_alua_get_lu_gp_by_name() + * in target_core_configfs.c:target_core_store_alua_lu_gp() to be + * released with core_alua_put_lu_gp_from_name() + */ + while (atomic_read(&lu_gp->lu_gp_ref_cnt)) + cpu_relax(); + /* + * Release reference to struct t10_alua_lu_gp * from all associated + * struct se_device. + */ + spin_lock(&lu_gp->lu_gp_lock); + list_for_each_entry_safe(lu_gp_mem, lu_gp_mem_tmp, + &lu_gp->lu_gp_mem_list, lu_gp_mem_list) { + if (lu_gp_mem->lu_gp_assoc) { + list_del(&lu_gp_mem->lu_gp_mem_list); + lu_gp->lu_gp_members--; + lu_gp_mem->lu_gp_assoc = 0; + } + spin_unlock(&lu_gp->lu_gp_lock); + /* + * + * lu_gp_mem is assoicated with a single + * struct se_device->dev_alua_lu_gp_mem, and is released when + * struct se_device is released via core_alua_free_lu_gp_mem(). + * + * If the passed lu_gp does NOT match the default_lu_gp, assume + * we want to re-assocate a given lu_gp_mem with default_lu_gp. + */ + spin_lock(&lu_gp_mem->lu_gp_mem_lock); + if (lu_gp != se_global->default_lu_gp) + __core_alua_attach_lu_gp_mem(lu_gp_mem, + se_global->default_lu_gp); + else + lu_gp_mem->lu_gp = NULL; + spin_unlock(&lu_gp_mem->lu_gp_mem_lock); + + spin_lock(&lu_gp->lu_gp_lock); + } + spin_unlock(&lu_gp->lu_gp_lock); + + kmem_cache_free(t10_alua_lu_gp_cache, lu_gp); +} + +void core_alua_free_lu_gp_mem(struct se_device *dev) +{ + struct se_subsystem_dev *su_dev = dev->se_sub_dev; + struct t10_alua *alua = T10_ALUA(su_dev); + struct t10_alua_lu_gp *lu_gp; + struct t10_alua_lu_gp_member *lu_gp_mem; + + if (alua->alua_type != SPC3_ALUA_EMULATED) + return; + + lu_gp_mem = dev->dev_alua_lu_gp_mem; + if (!(lu_gp_mem)) + return; + + while (atomic_read(&lu_gp_mem->lu_gp_mem_ref_cnt)) + cpu_relax(); + + spin_lock(&lu_gp_mem->lu_gp_mem_lock); + lu_gp = lu_gp_mem->lu_gp; + if ((lu_gp)) { + spin_lock(&lu_gp->lu_gp_lock); + if (lu_gp_mem->lu_gp_assoc) { + list_del(&lu_gp_mem->lu_gp_mem_list); + lu_gp->lu_gp_members--; + lu_gp_mem->lu_gp_assoc = 0; + } + spin_unlock(&lu_gp->lu_gp_lock); + lu_gp_mem->lu_gp = NULL; + } + spin_unlock(&lu_gp_mem->lu_gp_mem_lock); + + kmem_cache_free(t10_alua_lu_gp_mem_cache, lu_gp_mem); +} + +struct t10_alua_lu_gp *core_alua_get_lu_gp_by_name(const char *name) +{ + struct t10_alua_lu_gp *lu_gp; + struct config_item *ci; + + spin_lock(&se_global->lu_gps_lock); + list_for_each_entry(lu_gp, &se_global->g_lu_gps_list, lu_gp_list) { + if (!(lu_gp->lu_gp_valid_id)) + continue; + ci = &lu_gp->lu_gp_group.cg_item; + if (!(strcmp(config_item_name(ci), name))) { + atomic_inc(&lu_gp->lu_gp_ref_cnt); + spin_unlock(&se_global->lu_gps_lock); + return lu_gp; + } + } + spin_unlock(&se_global->lu_gps_lock); + + return NULL; +} + +void core_alua_put_lu_gp_from_name(struct t10_alua_lu_gp *lu_gp) +{ + spin_lock(&se_global->lu_gps_lock); + atomic_dec(&lu_gp->lu_gp_ref_cnt); + spin_unlock(&se_global->lu_gps_lock); +} + +/* + * Called with struct t10_alua_lu_gp_member->lu_gp_mem_lock + */ +void __core_alua_attach_lu_gp_mem( + struct t10_alua_lu_gp_member *lu_gp_mem, + struct t10_alua_lu_gp *lu_gp) +{ + spin_lock(&lu_gp->lu_gp_lock); + lu_gp_mem->lu_gp = lu_gp; + lu_gp_mem->lu_gp_assoc = 1; + list_add_tail(&lu_gp_mem->lu_gp_mem_list, &lu_gp->lu_gp_mem_list); + lu_gp->lu_gp_members++; + spin_unlock(&lu_gp->lu_gp_lock); +} + +/* + * Called with struct t10_alua_lu_gp_member->lu_gp_mem_lock + */ +void __core_alua_drop_lu_gp_mem( + struct t10_alua_lu_gp_member *lu_gp_mem, + struct t10_alua_lu_gp *lu_gp) +{ + spin_lock(&lu_gp->lu_gp_lock); + list_del(&lu_gp_mem->lu_gp_mem_list); + lu_gp_mem->lu_gp = NULL; + lu_gp_mem->lu_gp_assoc = 0; + lu_gp->lu_gp_members--; + spin_unlock(&lu_gp->lu_gp_lock); +} + +struct t10_alua_tg_pt_gp *core_alua_allocate_tg_pt_gp( + struct se_subsystem_dev *su_dev, + const char *name, + int def_group) +{ + struct t10_alua_tg_pt_gp *tg_pt_gp; + + tg_pt_gp = kmem_cache_zalloc(t10_alua_tg_pt_gp_cache, GFP_KERNEL); + if (!(tg_pt_gp)) { + printk(KERN_ERR "Unable to allocate struct t10_alua_tg_pt_gp\n"); + return NULL; + } + INIT_LIST_HEAD(&tg_pt_gp->tg_pt_gp_list); + INIT_LIST_HEAD(&tg_pt_gp->tg_pt_gp_mem_list); + mutex_init(&tg_pt_gp->tg_pt_gp_md_mutex); + spin_lock_init(&tg_pt_gp->tg_pt_gp_lock); + atomic_set(&tg_pt_gp->tg_pt_gp_ref_cnt, 0); + tg_pt_gp->tg_pt_gp_su_dev = su_dev; + tg_pt_gp->tg_pt_gp_md_buf_len = ALUA_MD_BUF_LEN; + atomic_set(&tg_pt_gp->tg_pt_gp_alua_access_state, + ALUA_ACCESS_STATE_ACTIVE_OPTMIZED); + /* + * Enable both explict and implict ALUA support by default + */ + tg_pt_gp->tg_pt_gp_alua_access_type = + TPGS_EXPLICT_ALUA | TPGS_IMPLICT_ALUA; + /* + * Set the default Active/NonOptimized Delay in milliseconds + */ + tg_pt_gp->tg_pt_gp_nonop_delay_msecs = ALUA_DEFAULT_NONOP_DELAY_MSECS; + tg_pt_gp->tg_pt_gp_trans_delay_msecs = ALUA_DEFAULT_TRANS_DELAY_MSECS; + + if (def_group) { + spin_lock(&T10_ALUA(su_dev)->tg_pt_gps_lock); + tg_pt_gp->tg_pt_gp_id = + T10_ALUA(su_dev)->alua_tg_pt_gps_counter++; + tg_pt_gp->tg_pt_gp_valid_id = 1; + T10_ALUA(su_dev)->alua_tg_pt_gps_count++; + list_add_tail(&tg_pt_gp->tg_pt_gp_list, + &T10_ALUA(su_dev)->tg_pt_gps_list); + spin_unlock(&T10_ALUA(su_dev)->tg_pt_gps_lock); + } + + return tg_pt_gp; +} + +int core_alua_set_tg_pt_gp_id( + struct t10_alua_tg_pt_gp *tg_pt_gp, + u16 tg_pt_gp_id) +{ + struct se_subsystem_dev *su_dev = tg_pt_gp->tg_pt_gp_su_dev; + struct t10_alua_tg_pt_gp *tg_pt_gp_tmp; + u16 tg_pt_gp_id_tmp; + /* + * The tg_pt_gp->tg_pt_gp_id may only be set once.. + */ + if (tg_pt_gp->tg_pt_gp_valid_id) { + printk(KERN_WARNING "ALUA TG PT Group already has a valid ID," + " ignoring request\n"); + return -1; + } + + spin_lock(&T10_ALUA(su_dev)->tg_pt_gps_lock); + if (T10_ALUA(su_dev)->alua_tg_pt_gps_count == 0x0000ffff) { + printk(KERN_ERR "Maximum ALUA alua_tg_pt_gps_count:" + " 0x0000ffff reached\n"); + spin_unlock(&T10_ALUA(su_dev)->tg_pt_gps_lock); + kmem_cache_free(t10_alua_tg_pt_gp_cache, tg_pt_gp); + return -1; + } +again: + tg_pt_gp_id_tmp = (tg_pt_gp_id != 0) ? tg_pt_gp_id : + T10_ALUA(su_dev)->alua_tg_pt_gps_counter++; + + list_for_each_entry(tg_pt_gp_tmp, &T10_ALUA(su_dev)->tg_pt_gps_list, + tg_pt_gp_list) { + if (tg_pt_gp_tmp->tg_pt_gp_id == tg_pt_gp_id_tmp) { + if (!(tg_pt_gp_id)) + goto again; + + printk(KERN_ERR "ALUA Target Port Group ID: %hu already" + " exists, ignoring request\n", tg_pt_gp_id); + spin_unlock(&T10_ALUA(su_dev)->tg_pt_gps_lock); + return -1; + } + } + + tg_pt_gp->tg_pt_gp_id = tg_pt_gp_id_tmp; + tg_pt_gp->tg_pt_gp_valid_id = 1; + list_add_tail(&tg_pt_gp->tg_pt_gp_list, + &T10_ALUA(su_dev)->tg_pt_gps_list); + T10_ALUA(su_dev)->alua_tg_pt_gps_count++; + spin_unlock(&T10_ALUA(su_dev)->tg_pt_gps_lock); + + return 0; +} + +struct t10_alua_tg_pt_gp_member *core_alua_allocate_tg_pt_gp_mem( + struct se_port *port) +{ + struct t10_alua_tg_pt_gp_member *tg_pt_gp_mem; + + tg_pt_gp_mem = kmem_cache_zalloc(t10_alua_tg_pt_gp_mem_cache, + GFP_KERNEL); + if (!(tg_pt_gp_mem)) { + printk(KERN_ERR "Unable to allocate struct t10_alua_tg_pt_gp_member\n"); + return ERR_PTR(-ENOMEM); + } + INIT_LIST_HEAD(&tg_pt_gp_mem->tg_pt_gp_mem_list); + spin_lock_init(&tg_pt_gp_mem->tg_pt_gp_mem_lock); + atomic_set(&tg_pt_gp_mem->tg_pt_gp_mem_ref_cnt, 0); + + tg_pt_gp_mem->tg_pt = port; + port->sep_alua_tg_pt_gp_mem = tg_pt_gp_mem; + atomic_set(&port->sep_tg_pt_gp_active, 1); + + return tg_pt_gp_mem; +} + +void core_alua_free_tg_pt_gp( + struct t10_alua_tg_pt_gp *tg_pt_gp) +{ + struct se_subsystem_dev *su_dev = tg_pt_gp->tg_pt_gp_su_dev; + struct t10_alua_tg_pt_gp_member *tg_pt_gp_mem, *tg_pt_gp_mem_tmp; + /* + * Once we have reached this point, config_item_put() has already + * been called from target_core_alua_drop_tg_pt_gp(). + * + * Here we remove *tg_pt_gp from the global list so that + * no assications *OR* explict ALUA via SET_TARGET_PORT_GROUPS + * can be made while we are releasing struct t10_alua_tg_pt_gp. + */ + spin_lock(&T10_ALUA(su_dev)->tg_pt_gps_lock); + list_del(&tg_pt_gp->tg_pt_gp_list); + T10_ALUA(su_dev)->alua_tg_pt_gps_counter--; + spin_unlock(&T10_ALUA(su_dev)->tg_pt_gps_lock); + /* + * Allow a struct t10_alua_tg_pt_gp_member * referenced by + * core_alua_get_tg_pt_gp_by_name() in + * target_core_configfs.c:target_core_store_alua_tg_pt_gp() + * to be released with core_alua_put_tg_pt_gp_from_name(). + */ + while (atomic_read(&tg_pt_gp->tg_pt_gp_ref_cnt)) + cpu_relax(); + /* + * Release reference to struct t10_alua_tg_pt_gp from all associated + * struct se_port. + */ + spin_lock(&tg_pt_gp->tg_pt_gp_lock); + list_for_each_entry_safe(tg_pt_gp_mem, tg_pt_gp_mem_tmp, + &tg_pt_gp->tg_pt_gp_mem_list, tg_pt_gp_mem_list) { + if (tg_pt_gp_mem->tg_pt_gp_assoc) { + list_del(&tg_pt_gp_mem->tg_pt_gp_mem_list); + tg_pt_gp->tg_pt_gp_members--; + tg_pt_gp_mem->tg_pt_gp_assoc = 0; + } + spin_unlock(&tg_pt_gp->tg_pt_gp_lock); + /* + * tg_pt_gp_mem is assoicated with a single + * se_port->sep_alua_tg_pt_gp_mem, and is released via + * core_alua_free_tg_pt_gp_mem(). + * + * If the passed tg_pt_gp does NOT match the default_tg_pt_gp, + * assume we want to re-assocate a given tg_pt_gp_mem with + * default_tg_pt_gp. + */ + spin_lock(&tg_pt_gp_mem->tg_pt_gp_mem_lock); + if (tg_pt_gp != T10_ALUA(su_dev)->default_tg_pt_gp) { + __core_alua_attach_tg_pt_gp_mem(tg_pt_gp_mem, + T10_ALUA(su_dev)->default_tg_pt_gp); + } else + tg_pt_gp_mem->tg_pt_gp = NULL; + spin_unlock(&tg_pt_gp_mem->tg_pt_gp_mem_lock); + + spin_lock(&tg_pt_gp->tg_pt_gp_lock); + } + spin_unlock(&tg_pt_gp->tg_pt_gp_lock); + + kmem_cache_free(t10_alua_tg_pt_gp_cache, tg_pt_gp); +} + +void core_alua_free_tg_pt_gp_mem(struct se_port *port) +{ + struct se_subsystem_dev *su_dev = port->sep_lun->lun_se_dev->se_sub_dev; + struct t10_alua *alua = T10_ALUA(su_dev); + struct t10_alua_tg_pt_gp *tg_pt_gp; + struct t10_alua_tg_pt_gp_member *tg_pt_gp_mem; + + if (alua->alua_type != SPC3_ALUA_EMULATED) + return; + + tg_pt_gp_mem = port->sep_alua_tg_pt_gp_mem; + if (!(tg_pt_gp_mem)) + return; + + while (atomic_read(&tg_pt_gp_mem->tg_pt_gp_mem_ref_cnt)) + cpu_relax(); + + spin_lock(&tg_pt_gp_mem->tg_pt_gp_mem_lock); + tg_pt_gp = tg_pt_gp_mem->tg_pt_gp; + if ((tg_pt_gp)) { + spin_lock(&tg_pt_gp->tg_pt_gp_lock); + if (tg_pt_gp_mem->tg_pt_gp_assoc) { + list_del(&tg_pt_gp_mem->tg_pt_gp_mem_list); + tg_pt_gp->tg_pt_gp_members--; + tg_pt_gp_mem->tg_pt_gp_assoc = 0; + } + spin_unlock(&tg_pt_gp->tg_pt_gp_lock); + tg_pt_gp_mem->tg_pt_gp = NULL; + } + spin_unlock(&tg_pt_gp_mem->tg_pt_gp_mem_lock); + + kmem_cache_free(t10_alua_tg_pt_gp_mem_cache, tg_pt_gp_mem); +} + +static struct t10_alua_tg_pt_gp *core_alua_get_tg_pt_gp_by_name( + struct se_subsystem_dev *su_dev, + const char *name) +{ + struct t10_alua_tg_pt_gp *tg_pt_gp; + struct config_item *ci; + + spin_lock(&T10_ALUA(su_dev)->tg_pt_gps_lock); + list_for_each_entry(tg_pt_gp, &T10_ALUA(su_dev)->tg_pt_gps_list, + tg_pt_gp_list) { + if (!(tg_pt_gp->tg_pt_gp_valid_id)) + continue; + ci = &tg_pt_gp->tg_pt_gp_group.cg_item; + if (!(strcmp(config_item_name(ci), name))) { + atomic_inc(&tg_pt_gp->tg_pt_gp_ref_cnt); + spin_unlock(&T10_ALUA(su_dev)->tg_pt_gps_lock); + return tg_pt_gp; + } + } + spin_unlock(&T10_ALUA(su_dev)->tg_pt_gps_lock); + + return NULL; +} + +static void core_alua_put_tg_pt_gp_from_name( + struct t10_alua_tg_pt_gp *tg_pt_gp) +{ + struct se_subsystem_dev *su_dev = tg_pt_gp->tg_pt_gp_su_dev; + + spin_lock(&T10_ALUA(su_dev)->tg_pt_gps_lock); + atomic_dec(&tg_pt_gp->tg_pt_gp_ref_cnt); + spin_unlock(&T10_ALUA(su_dev)->tg_pt_gps_lock); +} + +/* + * Called with struct t10_alua_tg_pt_gp_member->tg_pt_gp_mem_lock held + */ +void __core_alua_attach_tg_pt_gp_mem( + struct t10_alua_tg_pt_gp_member *tg_pt_gp_mem, + struct t10_alua_tg_pt_gp *tg_pt_gp) +{ + spin_lock(&tg_pt_gp->tg_pt_gp_lock); + tg_pt_gp_mem->tg_pt_gp = tg_pt_gp; + tg_pt_gp_mem->tg_pt_gp_assoc = 1; + list_add_tail(&tg_pt_gp_mem->tg_pt_gp_mem_list, + &tg_pt_gp->tg_pt_gp_mem_list); + tg_pt_gp->tg_pt_gp_members++; + spin_unlock(&tg_pt_gp->tg_pt_gp_lock); +} + +/* + * Called with struct t10_alua_tg_pt_gp_member->tg_pt_gp_mem_lock held + */ +static void __core_alua_drop_tg_pt_gp_mem( + struct t10_alua_tg_pt_gp_member *tg_pt_gp_mem, + struct t10_alua_tg_pt_gp *tg_pt_gp) +{ + spin_lock(&tg_pt_gp->tg_pt_gp_lock); + list_del(&tg_pt_gp_mem->tg_pt_gp_mem_list); + tg_pt_gp_mem->tg_pt_gp = NULL; + tg_pt_gp_mem->tg_pt_gp_assoc = 0; + tg_pt_gp->tg_pt_gp_members--; + spin_unlock(&tg_pt_gp->tg_pt_gp_lock); +} + +ssize_t core_alua_show_tg_pt_gp_info(struct se_port *port, char *page) +{ + struct se_subsystem_dev *su_dev = port->sep_lun->lun_se_dev->se_sub_dev; + struct config_item *tg_pt_ci; + struct t10_alua *alua = T10_ALUA(su_dev); + struct t10_alua_tg_pt_gp *tg_pt_gp; + struct t10_alua_tg_pt_gp_member *tg_pt_gp_mem; + ssize_t len = 0; + + if (alua->alua_type != SPC3_ALUA_EMULATED) + return len; + + tg_pt_gp_mem = port->sep_alua_tg_pt_gp_mem; + if (!(tg_pt_gp_mem)) + return len; + + spin_lock(&tg_pt_gp_mem->tg_pt_gp_mem_lock); + tg_pt_gp = tg_pt_gp_mem->tg_pt_gp; + if ((tg_pt_gp)) { + tg_pt_ci = &tg_pt_gp->tg_pt_gp_group.cg_item; + len += sprintf(page, "TG Port Alias: %s\nTG Port Group ID:" + " %hu\nTG Port Primary Access State: %s\nTG Port " + "Primary Access Status: %s\nTG Port Secondary Access" + " State: %s\nTG Port Secondary Access Status: %s\n", + config_item_name(tg_pt_ci), tg_pt_gp->tg_pt_gp_id, + core_alua_dump_state(atomic_read( + &tg_pt_gp->tg_pt_gp_alua_access_state)), + core_alua_dump_status( + tg_pt_gp->tg_pt_gp_alua_access_status), + (atomic_read(&port->sep_tg_pt_secondary_offline)) ? + "Offline" : "None", + core_alua_dump_status(port->sep_tg_pt_secondary_stat)); + } + spin_unlock(&tg_pt_gp_mem->tg_pt_gp_mem_lock); + + return len; +} + +ssize_t core_alua_store_tg_pt_gp_info( + struct se_port *port, + const char *page, + size_t count) +{ + struct se_portal_group *tpg; + struct se_lun *lun; + struct se_subsystem_dev *su_dev = port->sep_lun->lun_se_dev->se_sub_dev; + struct t10_alua_tg_pt_gp *tg_pt_gp = NULL, *tg_pt_gp_new = NULL; + struct t10_alua_tg_pt_gp_member *tg_pt_gp_mem; + unsigned char buf[TG_PT_GROUP_NAME_BUF]; + int move = 0; + + tpg = port->sep_tpg; + lun = port->sep_lun; + + if (T10_ALUA(su_dev)->alua_type != SPC3_ALUA_EMULATED) { + printk(KERN_WARNING "SPC3_ALUA_EMULATED not enabled for" + " %s/tpgt_%hu/%s\n", TPG_TFO(tpg)->tpg_get_wwn(tpg), + TPG_TFO(tpg)->tpg_get_tag(tpg), + config_item_name(&lun->lun_group.cg_item)); + return -EINVAL; + } + + if (count > TG_PT_GROUP_NAME_BUF) { + printk(KERN_ERR "ALUA Target Port Group alias too large!\n"); + return -EINVAL; + } + memset(buf, 0, TG_PT_GROUP_NAME_BUF); + memcpy(buf, page, count); + /* + * Any ALUA target port group alias besides "NULL" means we will be + * making a new group association. + */ + if (strcmp(strstrip(buf), "NULL")) { + /* + * core_alua_get_tg_pt_gp_by_name() will increment reference to + * struct t10_alua_tg_pt_gp. This reference is released with + * core_alua_put_tg_pt_gp_from_name() below. + */ + tg_pt_gp_new = core_alua_get_tg_pt_gp_by_name(su_dev, + strstrip(buf)); + if (!(tg_pt_gp_new)) + return -ENODEV; + } + tg_pt_gp_mem = port->sep_alua_tg_pt_gp_mem; + if (!(tg_pt_gp_mem)) { + if (tg_pt_gp_new) + core_alua_put_tg_pt_gp_from_name(tg_pt_gp_new); + printk(KERN_ERR "NULL struct se_port->sep_alua_tg_pt_gp_mem pointer\n"); + return -EINVAL; + } + + spin_lock(&tg_pt_gp_mem->tg_pt_gp_mem_lock); + tg_pt_gp = tg_pt_gp_mem->tg_pt_gp; + if ((tg_pt_gp)) { + /* + * Clearing an existing tg_pt_gp association, and replacing + * with the default_tg_pt_gp. + */ + if (!(tg_pt_gp_new)) { + printk(KERN_INFO "Target_Core_ConfigFS: Moving" + " %s/tpgt_%hu/%s from ALUA Target Port Group:" + " alua/%s, ID: %hu back to" + " default_tg_pt_gp\n", + TPG_TFO(tpg)->tpg_get_wwn(tpg), + TPG_TFO(tpg)->tpg_get_tag(tpg), + config_item_name(&lun->lun_group.cg_item), + config_item_name( + &tg_pt_gp->tg_pt_gp_group.cg_item), + tg_pt_gp->tg_pt_gp_id); + + __core_alua_drop_tg_pt_gp_mem(tg_pt_gp_mem, tg_pt_gp); + __core_alua_attach_tg_pt_gp_mem(tg_pt_gp_mem, + T10_ALUA(su_dev)->default_tg_pt_gp); + spin_unlock(&tg_pt_gp_mem->tg_pt_gp_mem_lock); + + return count; + } + /* + * Removing existing association of tg_pt_gp_mem with tg_pt_gp + */ + __core_alua_drop_tg_pt_gp_mem(tg_pt_gp_mem, tg_pt_gp); + move = 1; + } + /* + * Associate tg_pt_gp_mem with tg_pt_gp_new. + */ + __core_alua_attach_tg_pt_gp_mem(tg_pt_gp_mem, tg_pt_gp_new); + spin_unlock(&tg_pt_gp_mem->tg_pt_gp_mem_lock); + printk(KERN_INFO "Target_Core_ConfigFS: %s %s/tpgt_%hu/%s to ALUA" + " Target Port Group: alua/%s, ID: %hu\n", (move) ? + "Moving" : "Adding", TPG_TFO(tpg)->tpg_get_wwn(tpg), + TPG_TFO(tpg)->tpg_get_tag(tpg), + config_item_name(&lun->lun_group.cg_item), + config_item_name(&tg_pt_gp_new->tg_pt_gp_group.cg_item), + tg_pt_gp_new->tg_pt_gp_id); + + core_alua_put_tg_pt_gp_from_name(tg_pt_gp_new); + return count; +} + +ssize_t core_alua_show_access_type( + struct t10_alua_tg_pt_gp *tg_pt_gp, + char *page) +{ + if ((tg_pt_gp->tg_pt_gp_alua_access_type & TPGS_EXPLICT_ALUA) && + (tg_pt_gp->tg_pt_gp_alua_access_type & TPGS_IMPLICT_ALUA)) + return sprintf(page, "Implict and Explict\n"); + else if (tg_pt_gp->tg_pt_gp_alua_access_type & TPGS_IMPLICT_ALUA) + return sprintf(page, "Implict\n"); + else if (tg_pt_gp->tg_pt_gp_alua_access_type & TPGS_EXPLICT_ALUA) + return sprintf(page, "Explict\n"); + else + return sprintf(page, "None\n"); +} + +ssize_t core_alua_store_access_type( + struct t10_alua_tg_pt_gp *tg_pt_gp, + const char *page, + size_t count) +{ + unsigned long tmp; + int ret; + + ret = strict_strtoul(page, 0, &tmp); + if (ret < 0) { + printk(KERN_ERR "Unable to extract alua_access_type\n"); + return -EINVAL; + } + if ((tmp != 0) && (tmp != 1) && (tmp != 2) && (tmp != 3)) { + printk(KERN_ERR "Illegal value for alua_access_type:" + " %lu\n", tmp); + return -EINVAL; + } + if (tmp == 3) + tg_pt_gp->tg_pt_gp_alua_access_type = + TPGS_IMPLICT_ALUA | TPGS_EXPLICT_ALUA; + else if (tmp == 2) + tg_pt_gp->tg_pt_gp_alua_access_type = TPGS_EXPLICT_ALUA; + else if (tmp == 1) + tg_pt_gp->tg_pt_gp_alua_access_type = TPGS_IMPLICT_ALUA; + else + tg_pt_gp->tg_pt_gp_alua_access_type = 0; + + return count; +} + +ssize_t core_alua_show_nonop_delay_msecs( + struct t10_alua_tg_pt_gp *tg_pt_gp, + char *page) +{ + return sprintf(page, "%d\n", tg_pt_gp->tg_pt_gp_nonop_delay_msecs); +} + +ssize_t core_alua_store_nonop_delay_msecs( + struct t10_alua_tg_pt_gp *tg_pt_gp, + const char *page, + size_t count) +{ + unsigned long tmp; + int ret; + + ret = strict_strtoul(page, 0, &tmp); + if (ret < 0) { + printk(KERN_ERR "Unable to extract nonop_delay_msecs\n"); + return -EINVAL; + } + if (tmp > ALUA_MAX_NONOP_DELAY_MSECS) { + printk(KERN_ERR "Passed nonop_delay_msecs: %lu, exceeds" + " ALUA_MAX_NONOP_DELAY_MSECS: %d\n", tmp, + ALUA_MAX_NONOP_DELAY_MSECS); + return -EINVAL; + } + tg_pt_gp->tg_pt_gp_nonop_delay_msecs = (int)tmp; + + return count; +} + +ssize_t core_alua_show_trans_delay_msecs( + struct t10_alua_tg_pt_gp *tg_pt_gp, + char *page) +{ + return sprintf(page, "%d\n", tg_pt_gp->tg_pt_gp_trans_delay_msecs); +} + +ssize_t core_alua_store_trans_delay_msecs( + struct t10_alua_tg_pt_gp *tg_pt_gp, + const char *page, + size_t count) +{ + unsigned long tmp; + int ret; + + ret = strict_strtoul(page, 0, &tmp); + if (ret < 0) { + printk(KERN_ERR "Unable to extract trans_delay_msecs\n"); + return -EINVAL; + } + if (tmp > ALUA_MAX_TRANS_DELAY_MSECS) { + printk(KERN_ERR "Passed trans_delay_msecs: %lu, exceeds" + " ALUA_MAX_TRANS_DELAY_MSECS: %d\n", tmp, + ALUA_MAX_TRANS_DELAY_MSECS); + return -EINVAL; + } + tg_pt_gp->tg_pt_gp_trans_delay_msecs = (int)tmp; + + return count; +} + +ssize_t core_alua_show_preferred_bit( + struct t10_alua_tg_pt_gp *tg_pt_gp, + char *page) +{ + return sprintf(page, "%d\n", tg_pt_gp->tg_pt_gp_pref); +} + +ssize_t core_alua_store_preferred_bit( + struct t10_alua_tg_pt_gp *tg_pt_gp, + const char *page, + size_t count) +{ + unsigned long tmp; + int ret; + + ret = strict_strtoul(page, 0, &tmp); + if (ret < 0) { + printk(KERN_ERR "Unable to extract preferred ALUA value\n"); + return -EINVAL; + } + if ((tmp != 0) && (tmp != 1)) { + printk(KERN_ERR "Illegal value for preferred ALUA: %lu\n", tmp); + return -EINVAL; + } + tg_pt_gp->tg_pt_gp_pref = (int)tmp; + + return count; +} + +ssize_t core_alua_show_offline_bit(struct se_lun *lun, char *page) +{ + if (!(lun->lun_sep)) + return -ENODEV; + + return sprintf(page, "%d\n", + atomic_read(&lun->lun_sep->sep_tg_pt_secondary_offline)); +} + +ssize_t core_alua_store_offline_bit( + struct se_lun *lun, + const char *page, + size_t count) +{ + struct t10_alua_tg_pt_gp_member *tg_pt_gp_mem; + unsigned long tmp; + int ret; + + if (!(lun->lun_sep)) + return -ENODEV; + + ret = strict_strtoul(page, 0, &tmp); + if (ret < 0) { + printk(KERN_ERR "Unable to extract alua_tg_pt_offline value\n"); + return -EINVAL; + } + if ((tmp != 0) && (tmp != 1)) { + printk(KERN_ERR "Illegal value for alua_tg_pt_offline: %lu\n", + tmp); + return -EINVAL; + } + tg_pt_gp_mem = lun->lun_sep->sep_alua_tg_pt_gp_mem; + if (!(tg_pt_gp_mem)) { + printk(KERN_ERR "Unable to locate *tg_pt_gp_mem\n"); + return -EINVAL; + } + + ret = core_alua_set_tg_pt_secondary_state(tg_pt_gp_mem, + lun->lun_sep, 0, (int)tmp); + if (ret < 0) + return -EINVAL; + + return count; +} + +ssize_t core_alua_show_secondary_status( + struct se_lun *lun, + char *page) +{ + return sprintf(page, "%d\n", lun->lun_sep->sep_tg_pt_secondary_stat); +} + +ssize_t core_alua_store_secondary_status( + struct se_lun *lun, + const char *page, + size_t count) +{ + unsigned long tmp; + int ret; + + ret = strict_strtoul(page, 0, &tmp); + if (ret < 0) { + printk(KERN_ERR "Unable to extract alua_tg_pt_status\n"); + return -EINVAL; + } + if ((tmp != ALUA_STATUS_NONE) && + (tmp != ALUA_STATUS_ALTERED_BY_EXPLICT_STPG) && + (tmp != ALUA_STATUS_ALTERED_BY_IMPLICT_ALUA)) { + printk(KERN_ERR "Illegal value for alua_tg_pt_status: %lu\n", + tmp); + return -EINVAL; + } + lun->lun_sep->sep_tg_pt_secondary_stat = (int)tmp; + + return count; +} + +ssize_t core_alua_show_secondary_write_metadata( + struct se_lun *lun, + char *page) +{ + return sprintf(page, "%d\n", + lun->lun_sep->sep_tg_pt_secondary_write_md); +} + +ssize_t core_alua_store_secondary_write_metadata( + struct se_lun *lun, + const char *page, + size_t count) +{ + unsigned long tmp; + int ret; + + ret = strict_strtoul(page, 0, &tmp); + if (ret < 0) { + printk(KERN_ERR "Unable to extract alua_tg_pt_write_md\n"); + return -EINVAL; + } + if ((tmp != 0) && (tmp != 1)) { + printk(KERN_ERR "Illegal value for alua_tg_pt_write_md:" + " %lu\n", tmp); + return -EINVAL; + } + lun->lun_sep->sep_tg_pt_secondary_write_md = (int)tmp; + + return count; +} + +int core_setup_alua(struct se_device *dev, int force_pt) +{ + struct se_subsystem_dev *su_dev = dev->se_sub_dev; + struct t10_alua *alua = T10_ALUA(su_dev); + struct t10_alua_lu_gp_member *lu_gp_mem; + /* + * If this device is from Target_Core_Mod/pSCSI, use the ALUA logic + * of the Underlying SCSI hardware. In Linux/SCSI terms, this can + * cause a problem because libata and some SATA RAID HBAs appear + * under Linux/SCSI, but emulate SCSI logic themselves. + */ + if (((TRANSPORT(dev)->transport_type == TRANSPORT_PLUGIN_PHBA_PDEV) && + !(DEV_ATTRIB(dev)->emulate_alua)) || force_pt) { + alua->alua_type = SPC_ALUA_PASSTHROUGH; + alua->alua_state_check = &core_alua_state_check_nop; + printk(KERN_INFO "%s: Using SPC_ALUA_PASSTHROUGH, no ALUA" + " emulation\n", TRANSPORT(dev)->name); + return 0; + } + /* + * If SPC-3 or above is reported by real or emulated struct se_device, + * use emulated ALUA. + */ + if (TRANSPORT(dev)->get_device_rev(dev) >= SCSI_3) { + printk(KERN_INFO "%s: Enabling ALUA Emulation for SPC-3" + " device\n", TRANSPORT(dev)->name); + /* + * Assoicate this struct se_device with the default ALUA + * LUN Group. + */ + lu_gp_mem = core_alua_allocate_lu_gp_mem(dev); + if (IS_ERR(lu_gp_mem) || !lu_gp_mem) + return -1; + + alua->alua_type = SPC3_ALUA_EMULATED; + alua->alua_state_check = &core_alua_state_check; + spin_lock(&lu_gp_mem->lu_gp_mem_lock); + __core_alua_attach_lu_gp_mem(lu_gp_mem, + se_global->default_lu_gp); + spin_unlock(&lu_gp_mem->lu_gp_mem_lock); + + printk(KERN_INFO "%s: Adding to default ALUA LU Group:" + " core/alua/lu_gps/default_lu_gp\n", + TRANSPORT(dev)->name); + } else { + alua->alua_type = SPC2_ALUA_DISABLED; + alua->alua_state_check = &core_alua_state_check_nop; + printk(KERN_INFO "%s: Disabling ALUA Emulation for SPC-2" + " device\n", TRANSPORT(dev)->name); + } + + return 0; +} diff --git a/drivers/target/target_core_alua.h b/drivers/target/target_core_alua.h new file mode 100644 index 000000000000..c86f97a081ed --- /dev/null +++ b/drivers/target/target_core_alua.h @@ -0,0 +1,126 @@ +#ifndef TARGET_CORE_ALUA_H +#define TARGET_CORE_ALUA_H + +/* + * INQUIRY response data, TPGS Field + * + * from spc4r17 section 6.4.2 Table 135 + */ +#define TPGS_NO_ALUA 0x00 +#define TPGS_IMPLICT_ALUA 0x10 +#define TPGS_EXPLICT_ALUA 0x20 + +/* + * ASYMMETRIC ACCESS STATE field + * + * from spc4r17 section 6.27 Table 245 + */ +#define ALUA_ACCESS_STATE_ACTIVE_OPTMIZED 0x0 +#define ALUA_ACCESS_STATE_ACTIVE_NON_OPTIMIZED 0x1 +#define ALUA_ACCESS_STATE_STANDBY 0x2 +#define ALUA_ACCESS_STATE_UNAVAILABLE 0x3 +#define ALUA_ACCESS_STATE_OFFLINE 0xe +#define ALUA_ACCESS_STATE_TRANSITION 0xf + +/* + * REPORT_TARGET_PORT_GROUP STATUS CODE + * + * from spc4r17 section 6.27 Table 246 + */ +#define ALUA_STATUS_NONE 0x00 +#define ALUA_STATUS_ALTERED_BY_EXPLICT_STPG 0x01 +#define ALUA_STATUS_ALTERED_BY_IMPLICT_ALUA 0x02 + +/* + * From spc4r17, Table D.1: ASC and ASCQ Assignement + */ +#define ASCQ_04H_ALUA_STATE_TRANSITION 0x0a +#define ASCQ_04H_ALUA_TG_PT_STANDBY 0x0b +#define ASCQ_04H_ALUA_TG_PT_UNAVAILABLE 0x0c +#define ASCQ_04H_ALUA_OFFLINE 0x12 + +/* + * Used as the default for Active/NonOptimized delay (in milliseconds) + * This can also be changed via configfs on a per target port group basis.. + */ +#define ALUA_DEFAULT_NONOP_DELAY_MSECS 100 +#define ALUA_MAX_NONOP_DELAY_MSECS 10000 /* 10 seconds */ +/* + * Used for implict and explict ALUA transitional delay, that is disabled + * by default, and is intended to be used for debugging client side ALUA code. + */ +#define ALUA_DEFAULT_TRANS_DELAY_MSECS 0 +#define ALUA_MAX_TRANS_DELAY_MSECS 30000 /* 30 seconds */ +/* + * Used by core_alua_update_tpg_primary_metadata() and + * core_alua_update_tpg_secondary_metadata() + */ +#define ALUA_METADATA_PATH_LEN 512 +/* + * Used by core_alua_update_tpg_secondary_metadata() + */ +#define ALUA_SECONDARY_METADATA_WWN_LEN 256 + +extern struct kmem_cache *t10_alua_lu_gp_cache; +extern struct kmem_cache *t10_alua_lu_gp_mem_cache; +extern struct kmem_cache *t10_alua_tg_pt_gp_cache; +extern struct kmem_cache *t10_alua_tg_pt_gp_mem_cache; + +extern int core_emulate_report_target_port_groups(struct se_cmd *); +extern int core_emulate_set_target_port_groups(struct se_cmd *); +extern int core_alua_check_nonop_delay(struct se_cmd *); +extern int core_alua_do_port_transition(struct t10_alua_tg_pt_gp *, + struct se_device *, struct se_port *, + struct se_node_acl *, int, int); +extern char *core_alua_dump_status(int); +extern struct t10_alua_lu_gp *core_alua_allocate_lu_gp(const char *, int); +extern int core_alua_set_lu_gp_id(struct t10_alua_lu_gp *, u16); +extern void core_alua_free_lu_gp(struct t10_alua_lu_gp *); +extern void core_alua_free_lu_gp_mem(struct se_device *); +extern struct t10_alua_lu_gp *core_alua_get_lu_gp_by_name(const char *); +extern void core_alua_put_lu_gp_from_name(struct t10_alua_lu_gp *); +extern void __core_alua_attach_lu_gp_mem(struct t10_alua_lu_gp_member *, + struct t10_alua_lu_gp *); +extern void __core_alua_drop_lu_gp_mem(struct t10_alua_lu_gp_member *, + struct t10_alua_lu_gp *); +extern void core_alua_drop_lu_gp_dev(struct se_device *); +extern struct t10_alua_tg_pt_gp *core_alua_allocate_tg_pt_gp( + struct se_subsystem_dev *, const char *, int); +extern int core_alua_set_tg_pt_gp_id(struct t10_alua_tg_pt_gp *, u16); +extern struct t10_alua_tg_pt_gp_member *core_alua_allocate_tg_pt_gp_mem( + struct se_port *); +extern void core_alua_free_tg_pt_gp(struct t10_alua_tg_pt_gp *); +extern void core_alua_free_tg_pt_gp_mem(struct se_port *); +extern void __core_alua_attach_tg_pt_gp_mem(struct t10_alua_tg_pt_gp_member *, + struct t10_alua_tg_pt_gp *); +extern ssize_t core_alua_show_tg_pt_gp_info(struct se_port *, char *); +extern ssize_t core_alua_store_tg_pt_gp_info(struct se_port *, const char *, + size_t); +extern ssize_t core_alua_show_access_type(struct t10_alua_tg_pt_gp *, char *); +extern ssize_t core_alua_store_access_type(struct t10_alua_tg_pt_gp *, + const char *, size_t); +extern ssize_t core_alua_show_nonop_delay_msecs(struct t10_alua_tg_pt_gp *, + char *); +extern ssize_t core_alua_store_nonop_delay_msecs(struct t10_alua_tg_pt_gp *, + const char *, size_t); +extern ssize_t core_alua_show_trans_delay_msecs(struct t10_alua_tg_pt_gp *, + char *); +extern ssize_t core_alua_store_trans_delay_msecs(struct t10_alua_tg_pt_gp *, + const char *, size_t); +extern ssize_t core_alua_show_preferred_bit(struct t10_alua_tg_pt_gp *, + char *); +extern ssize_t core_alua_store_preferred_bit(struct t10_alua_tg_pt_gp *, + const char *, size_t); +extern ssize_t core_alua_show_offline_bit(struct se_lun *, char *); +extern ssize_t core_alua_store_offline_bit(struct se_lun *, const char *, + size_t); +extern ssize_t core_alua_show_secondary_status(struct se_lun *, char *); +extern ssize_t core_alua_store_secondary_status(struct se_lun *, + const char *, size_t); +extern ssize_t core_alua_show_secondary_write_metadata(struct se_lun *, + char *); +extern ssize_t core_alua_store_secondary_write_metadata(struct se_lun *, + const char *, size_t); +extern int core_setup_alua(struct se_device *, int); + +#endif /* TARGET_CORE_ALUA_H */ diff --git a/drivers/target/target_core_cdb.c b/drivers/target/target_core_cdb.c new file mode 100644 index 000000000000..366080baf474 --- /dev/null +++ b/drivers/target/target_core_cdb.c @@ -0,0 +1,1131 @@ +/* + * CDB emulation for non-READ/WRITE commands. + * + * Copyright (c) 2002, 2003, 2004, 2005 PyX Technologies, Inc. + * Copyright (c) 2005, 2006, 2007 SBE, Inc. + * Copyright (c) 2007-2010 Rising Tide Systems + * Copyright (c) 2008-2010 Linux-iSCSI.org + * + * Nicholas A. Bellinger + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. + */ + +#include +#include + +#include +#include +#include +#include "target_core_ua.h" + +static void +target_fill_alua_data(struct se_port *port, unsigned char *buf) +{ + struct t10_alua_tg_pt_gp *tg_pt_gp; + struct t10_alua_tg_pt_gp_member *tg_pt_gp_mem; + + /* + * Set SCCS for MAINTENANCE_IN + REPORT_TARGET_PORT_GROUPS. + */ + buf[5] = 0x80; + + /* + * Set TPGS field for explict and/or implict ALUA access type + * and opteration. + * + * See spc4r17 section 6.4.2 Table 135 + */ + if (!port) + return; + tg_pt_gp_mem = port->sep_alua_tg_pt_gp_mem; + if (!tg_pt_gp_mem) + return; + + spin_lock(&tg_pt_gp_mem->tg_pt_gp_mem_lock); + tg_pt_gp = tg_pt_gp_mem->tg_pt_gp; + if (tg_pt_gp) + buf[5] |= tg_pt_gp->tg_pt_gp_alua_access_type; + spin_unlock(&tg_pt_gp_mem->tg_pt_gp_mem_lock); +} + +static int +target_emulate_inquiry_std(struct se_cmd *cmd) +{ + struct se_lun *lun = SE_LUN(cmd); + struct se_device *dev = SE_DEV(cmd); + unsigned char *buf = cmd->t_task->t_task_buf; + + /* + * Make sure we at least have 6 bytes of INQUIRY response + * payload going back for EVPD=0 + */ + if (cmd->data_length < 6) { + printk(KERN_ERR "SCSI Inquiry payload length: %u" + " too small for EVPD=0\n", cmd->data_length); + return -1; + } + + buf[0] = dev->transport->get_device_type(dev); + if (buf[0] == TYPE_TAPE) + buf[1] = 0x80; + buf[2] = dev->transport->get_device_rev(dev); + + /* + * Enable SCCS and TPGS fields for Emulated ALUA + */ + if (T10_ALUA(dev->se_sub_dev)->alua_type == SPC3_ALUA_EMULATED) + target_fill_alua_data(lun->lun_sep, buf); + + if (cmd->data_length < 8) { + buf[4] = 1; /* Set additional length to 1 */ + return 0; + } + + buf[7] = 0x32; /* Sync=1 and CmdQue=1 */ + + /* + * Do not include vendor, product, reversion info in INQUIRY + * response payload for cdbs with a small allocation length. + */ + if (cmd->data_length < 36) { + buf[4] = 3; /* Set additional length to 3 */ + return 0; + } + + snprintf((unsigned char *)&buf[8], 8, "LIO-ORG"); + snprintf((unsigned char *)&buf[16], 16, "%s", + &DEV_T10_WWN(dev)->model[0]); + snprintf((unsigned char *)&buf[32], 4, "%s", + &DEV_T10_WWN(dev)->revision[0]); + buf[4] = 31; /* Set additional length to 31 */ + return 0; +} + +/* supported vital product data pages */ +static int +target_emulate_evpd_00(struct se_cmd *cmd, unsigned char *buf) +{ + buf[1] = 0x00; + if (cmd->data_length < 8) + return 0; + + buf[4] = 0x0; + /* + * Only report the INQUIRY EVPD=1 pages after a valid NAA + * Registered Extended LUN WWN has been set via ConfigFS + * during device creation/restart. + */ + if (SE_DEV(cmd)->se_sub_dev->su_dev_flags & + SDF_EMULATED_VPD_UNIT_SERIAL) { + buf[3] = 3; + buf[5] = 0x80; + buf[6] = 0x83; + buf[7] = 0x86; + } + + return 0; +} + +/* unit serial number */ +static int +target_emulate_evpd_80(struct se_cmd *cmd, unsigned char *buf) +{ + struct se_device *dev = SE_DEV(cmd); + u16 len = 0; + + buf[1] = 0x80; + if (dev->se_sub_dev->su_dev_flags & + SDF_EMULATED_VPD_UNIT_SERIAL) { + u32 unit_serial_len; + + unit_serial_len = + strlen(&DEV_T10_WWN(dev)->unit_serial[0]); + unit_serial_len++; /* For NULL Terminator */ + + if (((len + 4) + unit_serial_len) > cmd->data_length) { + len += unit_serial_len; + buf[2] = ((len >> 8) & 0xff); + buf[3] = (len & 0xff); + return 0; + } + len += sprintf((unsigned char *)&buf[4], "%s", + &DEV_T10_WWN(dev)->unit_serial[0]); + len++; /* Extra Byte for NULL Terminator */ + buf[3] = len; + } + return 0; +} + +/* + * Device identification VPD, for a complete list of + * DESIGNATOR TYPEs see spc4r17 Table 459. + */ +static int +target_emulate_evpd_83(struct se_cmd *cmd, unsigned char *buf) +{ + struct se_device *dev = SE_DEV(cmd); + struct se_lun *lun = SE_LUN(cmd); + struct se_port *port = NULL; + struct se_portal_group *tpg = NULL; + struct t10_alua_lu_gp_member *lu_gp_mem; + struct t10_alua_tg_pt_gp *tg_pt_gp; + struct t10_alua_tg_pt_gp_member *tg_pt_gp_mem; + unsigned char binary, binary_new; + unsigned char *prod = &DEV_T10_WWN(dev)->model[0]; + u32 prod_len; + u32 unit_serial_len, off = 0; + int i; + u16 len = 0, id_len; + + buf[1] = 0x83; + off = 4; + + /* + * NAA IEEE Registered Extended Assigned designator format, see + * spc4r17 section 7.7.3.6.5 + * + * We depend upon a target_core_mod/ConfigFS provided + * /sys/kernel/config/target/core/$HBA/$DEV/wwn/vpd_unit_serial + * value in order to return the NAA id. + */ + if (!(dev->se_sub_dev->su_dev_flags & SDF_EMULATED_VPD_UNIT_SERIAL)) + goto check_t10_vend_desc; + + if (off + 20 > cmd->data_length) + goto check_t10_vend_desc; + + /* CODE SET == Binary */ + buf[off++] = 0x1; + + /* Set ASSOICATION == addressed logical unit: 0)b */ + buf[off] = 0x00; + + /* Identifier/Designator type == NAA identifier */ + buf[off++] = 0x3; + off++; + + /* Identifier/Designator length */ + buf[off++] = 0x10; + + /* + * Start NAA IEEE Registered Extended Identifier/Designator + */ + buf[off++] = (0x6 << 4); + + /* + * Use OpenFabrics IEEE Company ID: 00 14 05 + */ + buf[off++] = 0x01; + buf[off++] = 0x40; + buf[off] = (0x5 << 4); + + /* + * Return ConfigFS Unit Serial Number information for + * VENDOR_SPECIFIC_IDENTIFIER and + * VENDOR_SPECIFIC_IDENTIFIER_EXTENTION + */ + binary = transport_asciihex_to_binaryhex( + &DEV_T10_WWN(dev)->unit_serial[0]); + buf[off++] |= (binary & 0xf0) >> 4; + for (i = 0; i < 24; i += 2) { + binary_new = transport_asciihex_to_binaryhex( + &DEV_T10_WWN(dev)->unit_serial[i+2]); + buf[off] = (binary & 0x0f) << 4; + buf[off++] |= (binary_new & 0xf0) >> 4; + binary = binary_new; + } + len = 20; + off = (len + 4); + +check_t10_vend_desc: + /* + * T10 Vendor Identifier Page, see spc4r17 section 7.7.3.4 + */ + id_len = 8; /* For Vendor field */ + prod_len = 4; /* For VPD Header */ + prod_len += 8; /* For Vendor field */ + prod_len += strlen(prod); + prod_len++; /* For : */ + + if (dev->se_sub_dev->su_dev_flags & + SDF_EMULATED_VPD_UNIT_SERIAL) { + unit_serial_len = + strlen(&DEV_T10_WWN(dev)->unit_serial[0]); + unit_serial_len++; /* For NULL Terminator */ + + if ((len + (id_len + 4) + + (prod_len + unit_serial_len)) > + cmd->data_length) { + len += (prod_len + unit_serial_len); + goto check_port; + } + id_len += sprintf((unsigned char *)&buf[off+12], + "%s:%s", prod, + &DEV_T10_WWN(dev)->unit_serial[0]); + } + buf[off] = 0x2; /* ASCII */ + buf[off+1] = 0x1; /* T10 Vendor ID */ + buf[off+2] = 0x0; + memcpy((unsigned char *)&buf[off+4], "LIO-ORG", 8); + /* Extra Byte for NULL Terminator */ + id_len++; + /* Identifier Length */ + buf[off+3] = id_len; + /* Header size for Designation descriptor */ + len += (id_len + 4); + off += (id_len + 4); + /* + * struct se_port is only set for INQUIRY VPD=1 through $FABRIC_MOD + */ +check_port: + port = lun->lun_sep; + if (port) { + struct t10_alua_lu_gp *lu_gp; + u32 padding, scsi_name_len; + u16 lu_gp_id = 0; + u16 tg_pt_gp_id = 0; + u16 tpgt; + + tpg = port->sep_tpg; + /* + * Relative target port identifer, see spc4r17 + * section 7.7.3.7 + * + * Get the PROTOCOL IDENTIFIER as defined by spc4r17 + * section 7.5.1 Table 362 + */ + if (((len + 4) + 8) > cmd->data_length) { + len += 8; + goto check_tpgi; + } + buf[off] = + (TPG_TFO(tpg)->get_fabric_proto_ident(tpg) << 4); + buf[off++] |= 0x1; /* CODE SET == Binary */ + buf[off] = 0x80; /* Set PIV=1 */ + /* Set ASSOICATION == target port: 01b */ + buf[off] |= 0x10; + /* DESIGNATOR TYPE == Relative target port identifer */ + buf[off++] |= 0x4; + off++; /* Skip over Reserved */ + buf[off++] = 4; /* DESIGNATOR LENGTH */ + /* Skip over Obsolete field in RTPI payload + * in Table 472 */ + off += 2; + buf[off++] = ((port->sep_rtpi >> 8) & 0xff); + buf[off++] = (port->sep_rtpi & 0xff); + len += 8; /* Header size + Designation descriptor */ + /* + * Target port group identifier, see spc4r17 + * section 7.7.3.8 + * + * Get the PROTOCOL IDENTIFIER as defined by spc4r17 + * section 7.5.1 Table 362 + */ +check_tpgi: + if (T10_ALUA(dev->se_sub_dev)->alua_type != + SPC3_ALUA_EMULATED) + goto check_scsi_name; + + if (((len + 4) + 8) > cmd->data_length) { + len += 8; + goto check_lu_gp; + } + tg_pt_gp_mem = port->sep_alua_tg_pt_gp_mem; + if (!tg_pt_gp_mem) + goto check_lu_gp; + + spin_lock(&tg_pt_gp_mem->tg_pt_gp_mem_lock); + tg_pt_gp = tg_pt_gp_mem->tg_pt_gp; + if (!(tg_pt_gp)) { + spin_unlock(&tg_pt_gp_mem->tg_pt_gp_mem_lock); + goto check_lu_gp; + } + tg_pt_gp_id = tg_pt_gp->tg_pt_gp_id; + spin_unlock(&tg_pt_gp_mem->tg_pt_gp_mem_lock); + + buf[off] = + (TPG_TFO(tpg)->get_fabric_proto_ident(tpg) << 4); + buf[off++] |= 0x1; /* CODE SET == Binary */ + buf[off] = 0x80; /* Set PIV=1 */ + /* Set ASSOICATION == target port: 01b */ + buf[off] |= 0x10; + /* DESIGNATOR TYPE == Target port group identifier */ + buf[off++] |= 0x5; + off++; /* Skip over Reserved */ + buf[off++] = 4; /* DESIGNATOR LENGTH */ + off += 2; /* Skip over Reserved Field */ + buf[off++] = ((tg_pt_gp_id >> 8) & 0xff); + buf[off++] = (tg_pt_gp_id & 0xff); + len += 8; /* Header size + Designation descriptor */ + /* + * Logical Unit Group identifier, see spc4r17 + * section 7.7.3.8 + */ +check_lu_gp: + if (((len + 4) + 8) > cmd->data_length) { + len += 8; + goto check_scsi_name; + } + lu_gp_mem = dev->dev_alua_lu_gp_mem; + if (!(lu_gp_mem)) + goto check_scsi_name; + + spin_lock(&lu_gp_mem->lu_gp_mem_lock); + lu_gp = lu_gp_mem->lu_gp; + if (!(lu_gp)) { + spin_unlock(&lu_gp_mem->lu_gp_mem_lock); + goto check_scsi_name; + } + lu_gp_id = lu_gp->lu_gp_id; + spin_unlock(&lu_gp_mem->lu_gp_mem_lock); + + buf[off++] |= 0x1; /* CODE SET == Binary */ + /* DESIGNATOR TYPE == Logical Unit Group identifier */ + buf[off++] |= 0x6; + off++; /* Skip over Reserved */ + buf[off++] = 4; /* DESIGNATOR LENGTH */ + off += 2; /* Skip over Reserved Field */ + buf[off++] = ((lu_gp_id >> 8) & 0xff); + buf[off++] = (lu_gp_id & 0xff); + len += 8; /* Header size + Designation descriptor */ + /* + * SCSI name string designator, see spc4r17 + * section 7.7.3.11 + * + * Get the PROTOCOL IDENTIFIER as defined by spc4r17 + * section 7.5.1 Table 362 + */ +check_scsi_name: + scsi_name_len = strlen(TPG_TFO(tpg)->tpg_get_wwn(tpg)); + /* UTF-8 ",t,0x<16-bit TPGT>" + NULL Terminator */ + scsi_name_len += 10; + /* Check for 4-byte padding */ + padding = ((-scsi_name_len) & 3); + if (padding != 0) + scsi_name_len += padding; + /* Header size + Designation descriptor */ + scsi_name_len += 4; + + if (((len + 4) + scsi_name_len) > cmd->data_length) { + len += scsi_name_len; + goto set_len; + } + buf[off] = + (TPG_TFO(tpg)->get_fabric_proto_ident(tpg) << 4); + buf[off++] |= 0x3; /* CODE SET == UTF-8 */ + buf[off] = 0x80; /* Set PIV=1 */ + /* Set ASSOICATION == target port: 01b */ + buf[off] |= 0x10; + /* DESIGNATOR TYPE == SCSI name string */ + buf[off++] |= 0x8; + off += 2; /* Skip over Reserved and length */ + /* + * SCSI name string identifer containing, $FABRIC_MOD + * dependent information. For LIO-Target and iSCSI + * Target Port, this means ",t,0x in + * UTF-8 encoding. + */ + tpgt = TPG_TFO(tpg)->tpg_get_tag(tpg); + scsi_name_len = sprintf(&buf[off], "%s,t,0x%04x", + TPG_TFO(tpg)->tpg_get_wwn(tpg), tpgt); + scsi_name_len += 1 /* Include NULL terminator */; + /* + * The null-terminated, null-padded (see 4.4.2) SCSI + * NAME STRING field contains a UTF-8 format string. + * The number of bytes in the SCSI NAME STRING field + * (i.e., the value in the DESIGNATOR LENGTH field) + * shall be no larger than 256 and shall be a multiple + * of four. + */ + if (padding) + scsi_name_len += padding; + + buf[off-1] = scsi_name_len; + off += scsi_name_len; + /* Header size + Designation descriptor */ + len += (scsi_name_len + 4); + } +set_len: + buf[2] = ((len >> 8) & 0xff); + buf[3] = (len & 0xff); /* Page Length for VPD 0x83 */ + return 0; +} + +/* Extended INQUIRY Data VPD Page */ +static int +target_emulate_evpd_86(struct se_cmd *cmd, unsigned char *buf) +{ + if (cmd->data_length < 60) + return 0; + + buf[1] = 0x86; + buf[2] = 0x3c; + /* Set HEADSUP, ORDSUP, SIMPSUP */ + buf[5] = 0x07; + + /* If WriteCache emulation is enabled, set V_SUP */ + if (DEV_ATTRIB(SE_DEV(cmd))->emulate_write_cache > 0) + buf[6] = 0x01; + return 0; +} + +/* Block Limits VPD page */ +static int +target_emulate_evpd_b0(struct se_cmd *cmd, unsigned char *buf) +{ + struct se_device *dev = SE_DEV(cmd); + int have_tp = 0; + + /* + * Following sbc3r22 section 6.5.3 Block Limits VPD page, when + * emulate_tpu=1 or emulate_tpws=1 we will be expect a + * different page length for Thin Provisioning. + */ + if (DEV_ATTRIB(dev)->emulate_tpu || DEV_ATTRIB(dev)->emulate_tpws) + have_tp = 1; + + if (cmd->data_length < (0x10 + 4)) { + printk(KERN_INFO "Received data_length: %u" + " too small for EVPD 0xb0\n", + cmd->data_length); + return -1; + } + + if (have_tp && cmd->data_length < (0x3c + 4)) { + printk(KERN_INFO "Received data_length: %u" + " too small for TPE=1 EVPD 0xb0\n", + cmd->data_length); + have_tp = 0; + } + + buf[0] = dev->transport->get_device_type(dev); + buf[1] = 0xb0; + buf[3] = have_tp ? 0x3c : 0x10; + + /* + * Set OPTIMAL TRANSFER LENGTH GRANULARITY + */ + put_unaligned_be16(1, &buf[6]); + + /* + * Set MAXIMUM TRANSFER LENGTH + */ + put_unaligned_be32(DEV_ATTRIB(dev)->max_sectors, &buf[8]); + + /* + * Set OPTIMAL TRANSFER LENGTH + */ + put_unaligned_be32(DEV_ATTRIB(dev)->optimal_sectors, &buf[12]); + + /* + * Exit now if we don't support TP or the initiator sent a too + * short buffer. + */ + if (!have_tp || cmd->data_length < (0x3c + 4)) + return 0; + + /* + * Set MAXIMUM UNMAP LBA COUNT + */ + put_unaligned_be32(DEV_ATTRIB(dev)->max_unmap_lba_count, &buf[20]); + + /* + * Set MAXIMUM UNMAP BLOCK DESCRIPTOR COUNT + */ + put_unaligned_be32(DEV_ATTRIB(dev)->max_unmap_block_desc_count, + &buf[24]); + + /* + * Set OPTIMAL UNMAP GRANULARITY + */ + put_unaligned_be32(DEV_ATTRIB(dev)->unmap_granularity, &buf[28]); + + /* + * UNMAP GRANULARITY ALIGNMENT + */ + put_unaligned_be32(DEV_ATTRIB(dev)->unmap_granularity_alignment, + &buf[32]); + if (DEV_ATTRIB(dev)->unmap_granularity_alignment != 0) + buf[32] |= 0x80; /* Set the UGAVALID bit */ + + return 0; +} + +/* Thin Provisioning VPD */ +static int +target_emulate_evpd_b2(struct se_cmd *cmd, unsigned char *buf) +{ + struct se_device *dev = SE_DEV(cmd); + + /* + * From sbc3r22 section 6.5.4 Thin Provisioning VPD page: + * + * The PAGE LENGTH field is defined in SPC-4. If the DP bit is set to + * zero, then the page length shall be set to 0004h. If the DP bit + * is set to one, then the page length shall be set to the value + * defined in table 162. + */ + buf[0] = dev->transport->get_device_type(dev); + buf[1] = 0xb2; + + /* + * Set Hardcoded length mentioned above for DP=0 + */ + put_unaligned_be16(0x0004, &buf[2]); + + /* + * The THRESHOLD EXPONENT field indicates the threshold set size in + * LBAs as a power of 2 (i.e., the threshold set size is equal to + * 2(threshold exponent)). + * + * Note that this is currently set to 0x00 as mkp says it will be + * changing again. We can enable this once it has settled in T10 + * and is actually used by Linux/SCSI ML code. + */ + buf[4] = 0x00; + + /* + * A TPU bit set to one indicates that the device server supports + * the UNMAP command (see 5.25). A TPU bit set to zero indicates + * that the device server does not support the UNMAP command. + */ + if (DEV_ATTRIB(dev)->emulate_tpu != 0) + buf[5] = 0x80; + + /* + * A TPWS bit set to one indicates that the device server supports + * the use of the WRITE SAME (16) command (see 5.42) to unmap LBAs. + * A TPWS bit set to zero indicates that the device server does not + * support the use of the WRITE SAME (16) command to unmap LBAs. + */ + if (DEV_ATTRIB(dev)->emulate_tpws != 0) + buf[5] |= 0x40; + + return 0; +} + +static int +target_emulate_inquiry(struct se_cmd *cmd) +{ + struct se_device *dev = SE_DEV(cmd); + unsigned char *buf = cmd->t_task->t_task_buf; + unsigned char *cdb = cmd->t_task->t_task_cdb; + + if (!(cdb[1] & 0x1)) + return target_emulate_inquiry_std(cmd); + + /* + * Make sure we at least have 4 bytes of INQUIRY response + * payload for 0x00 going back for EVPD=1. Note that 0x80 + * and 0x83 will check for enough payload data length and + * jump to set_len: label when there is not enough inquiry EVPD + * payload length left for the next outgoing EVPD metadata + */ + if (cmd->data_length < 4) { + printk(KERN_ERR "SCSI Inquiry payload length: %u" + " too small for EVPD=1\n", cmd->data_length); + return -1; + } + buf[0] = dev->transport->get_device_type(dev); + + switch (cdb[2]) { + case 0x00: + return target_emulate_evpd_00(cmd, buf); + case 0x80: + return target_emulate_evpd_80(cmd, buf); + case 0x83: + return target_emulate_evpd_83(cmd, buf); + case 0x86: + return target_emulate_evpd_86(cmd, buf); + case 0xb0: + return target_emulate_evpd_b0(cmd, buf); + case 0xb2: + return target_emulate_evpd_b2(cmd, buf); + default: + printk(KERN_ERR "Unknown VPD Code: 0x%02x\n", cdb[2]); + return -1; + } + + return 0; +} + +static int +target_emulate_readcapacity(struct se_cmd *cmd) +{ + struct se_device *dev = SE_DEV(cmd); + unsigned char *buf = cmd->t_task->t_task_buf; + u32 blocks = dev->transport->get_blocks(dev); + + buf[0] = (blocks >> 24) & 0xff; + buf[1] = (blocks >> 16) & 0xff; + buf[2] = (blocks >> 8) & 0xff; + buf[3] = blocks & 0xff; + buf[4] = (DEV_ATTRIB(dev)->block_size >> 24) & 0xff; + buf[5] = (DEV_ATTRIB(dev)->block_size >> 16) & 0xff; + buf[6] = (DEV_ATTRIB(dev)->block_size >> 8) & 0xff; + buf[7] = DEV_ATTRIB(dev)->block_size & 0xff; + /* + * Set max 32-bit blocks to signal SERVICE ACTION READ_CAPACITY_16 + */ + if (DEV_ATTRIB(dev)->emulate_tpu || DEV_ATTRIB(dev)->emulate_tpws) + put_unaligned_be32(0xFFFFFFFF, &buf[0]); + + return 0; +} + +static int +target_emulate_readcapacity_16(struct se_cmd *cmd) +{ + struct se_device *dev = SE_DEV(cmd); + unsigned char *buf = cmd->t_task->t_task_buf; + unsigned long long blocks = dev->transport->get_blocks(dev); + + buf[0] = (blocks >> 56) & 0xff; + buf[1] = (blocks >> 48) & 0xff; + buf[2] = (blocks >> 40) & 0xff; + buf[3] = (blocks >> 32) & 0xff; + buf[4] = (blocks >> 24) & 0xff; + buf[5] = (blocks >> 16) & 0xff; + buf[6] = (blocks >> 8) & 0xff; + buf[7] = blocks & 0xff; + buf[8] = (DEV_ATTRIB(dev)->block_size >> 24) & 0xff; + buf[9] = (DEV_ATTRIB(dev)->block_size >> 16) & 0xff; + buf[10] = (DEV_ATTRIB(dev)->block_size >> 8) & 0xff; + buf[11] = DEV_ATTRIB(dev)->block_size & 0xff; + /* + * Set Thin Provisioning Enable bit following sbc3r22 in section + * READ CAPACITY (16) byte 14 if emulate_tpu or emulate_tpws is enabled. + */ + if (DEV_ATTRIB(dev)->emulate_tpu || DEV_ATTRIB(dev)->emulate_tpws) + buf[14] = 0x80; + + return 0; +} + +static int +target_modesense_rwrecovery(unsigned char *p) +{ + p[0] = 0x01; + p[1] = 0x0a; + + return 12; +} + +static int +target_modesense_control(struct se_device *dev, unsigned char *p) +{ + p[0] = 0x0a; + p[1] = 0x0a; + p[2] = 2; + /* + * From spc4r17, section 7.4.6 Control mode Page + * + * Unit Attention interlocks control (UN_INTLCK_CTRL) to code 00b + * + * 00b: The logical unit shall clear any unit attention condition + * reported in the same I_T_L_Q nexus transaction as a CHECK CONDITION + * status and shall not establish a unit attention condition when a com- + * mand is completed with BUSY, TASK SET FULL, or RESERVATION CONFLICT + * status. + * + * 10b: The logical unit shall not clear any unit attention condition + * reported in the same I_T_L_Q nexus transaction as a CHECK CONDITION + * status and shall not establish a unit attention condition when + * a command is completed with BUSY, TASK SET FULL, or RESERVATION + * CONFLICT status. + * + * 11b a The logical unit shall not clear any unit attention condition + * reported in the same I_T_L_Q nexus transaction as a CHECK CONDITION + * status and shall establish a unit attention condition for the + * initiator port associated with the I_T nexus on which the BUSY, + * TASK SET FULL, or RESERVATION CONFLICT status is being returned. + * Depending on the status, the additional sense code shall be set to + * PREVIOUS BUSY STATUS, PREVIOUS TASK SET FULL STATUS, or PREVIOUS + * RESERVATION CONFLICT STATUS. Until it is cleared by a REQUEST SENSE + * command, a unit attention condition shall be established only once + * for a BUSY, TASK SET FULL, or RESERVATION CONFLICT status regardless + * to the number of commands completed with one of those status codes. + */ + p[4] = (DEV_ATTRIB(dev)->emulate_ua_intlck_ctrl == 2) ? 0x30 : + (DEV_ATTRIB(dev)->emulate_ua_intlck_ctrl == 1) ? 0x20 : 0x00; + /* + * From spc4r17, section 7.4.6 Control mode Page + * + * Task Aborted Status (TAS) bit set to zero. + * + * A task aborted status (TAS) bit set to zero specifies that aborted + * tasks shall be terminated by the device server without any response + * to the application client. A TAS bit set to one specifies that tasks + * aborted by the actions of an I_T nexus other than the I_T nexus on + * which the command was received shall be completed with TASK ABORTED + * status (see SAM-4). + */ + p[5] = (DEV_ATTRIB(dev)->emulate_tas) ? 0x40 : 0x00; + p[8] = 0xff; + p[9] = 0xff; + p[11] = 30; + + return 12; +} + +static int +target_modesense_caching(struct se_device *dev, unsigned char *p) +{ + p[0] = 0x08; + p[1] = 0x12; + if (DEV_ATTRIB(dev)->emulate_write_cache > 0) + p[2] = 0x04; /* Write Cache Enable */ + p[12] = 0x20; /* Disabled Read Ahead */ + + return 20; +} + +static void +target_modesense_write_protect(unsigned char *buf, int type) +{ + /* + * I believe that the WP bit (bit 7) in the mode header is the same for + * all device types.. + */ + switch (type) { + case TYPE_DISK: + case TYPE_TAPE: + default: + buf[0] |= 0x80; /* WP bit */ + break; + } +} + +static void +target_modesense_dpofua(unsigned char *buf, int type) +{ + switch (type) { + case TYPE_DISK: + buf[0] |= 0x10; /* DPOFUA bit */ + break; + default: + break; + } +} + +static int +target_emulate_modesense(struct se_cmd *cmd, int ten) +{ + struct se_device *dev = SE_DEV(cmd); + char *cdb = cmd->t_task->t_task_cdb; + unsigned char *rbuf = cmd->t_task->t_task_buf; + int type = dev->transport->get_device_type(dev); + int offset = (ten) ? 8 : 4; + int length = 0; + unsigned char buf[SE_MODE_PAGE_BUF]; + + memset(buf, 0, SE_MODE_PAGE_BUF); + + switch (cdb[2] & 0x3f) { + case 0x01: + length = target_modesense_rwrecovery(&buf[offset]); + break; + case 0x08: + length = target_modesense_caching(dev, &buf[offset]); + break; + case 0x0a: + length = target_modesense_control(dev, &buf[offset]); + break; + case 0x3f: + length = target_modesense_rwrecovery(&buf[offset]); + length += target_modesense_caching(dev, &buf[offset+length]); + length += target_modesense_control(dev, &buf[offset+length]); + break; + default: + printk(KERN_ERR "Got Unknown Mode Page: 0x%02x\n", + cdb[2] & 0x3f); + return PYX_TRANSPORT_UNKNOWN_MODE_PAGE; + } + offset += length; + + if (ten) { + offset -= 2; + buf[0] = (offset >> 8) & 0xff; + buf[1] = offset & 0xff; + + if ((SE_LUN(cmd)->lun_access & TRANSPORT_LUNFLAGS_READ_ONLY) || + (cmd->se_deve && + (cmd->se_deve->lun_flags & TRANSPORT_LUNFLAGS_READ_ONLY))) + target_modesense_write_protect(&buf[3], type); + + if ((DEV_ATTRIB(dev)->emulate_write_cache > 0) && + (DEV_ATTRIB(dev)->emulate_fua_write > 0)) + target_modesense_dpofua(&buf[3], type); + + if ((offset + 2) > cmd->data_length) + offset = cmd->data_length; + + } else { + offset -= 1; + buf[0] = offset & 0xff; + + if ((SE_LUN(cmd)->lun_access & TRANSPORT_LUNFLAGS_READ_ONLY) || + (cmd->se_deve && + (cmd->se_deve->lun_flags & TRANSPORT_LUNFLAGS_READ_ONLY))) + target_modesense_write_protect(&buf[2], type); + + if ((DEV_ATTRIB(dev)->emulate_write_cache > 0) && + (DEV_ATTRIB(dev)->emulate_fua_write > 0)) + target_modesense_dpofua(&buf[2], type); + + if ((offset + 1) > cmd->data_length) + offset = cmd->data_length; + } + memcpy(rbuf, buf, offset); + + return 0; +} + +static int +target_emulate_request_sense(struct se_cmd *cmd) +{ + unsigned char *cdb = cmd->t_task->t_task_cdb; + unsigned char *buf = cmd->t_task->t_task_buf; + u8 ua_asc = 0, ua_ascq = 0; + + if (cdb[1] & 0x01) { + printk(KERN_ERR "REQUEST_SENSE description emulation not" + " supported\n"); + return PYX_TRANSPORT_INVALID_CDB_FIELD; + } + if (!(core_scsi3_ua_clear_for_request_sense(cmd, &ua_asc, &ua_ascq))) { + /* + * CURRENT ERROR, UNIT ATTENTION + */ + buf[0] = 0x70; + buf[SPC_SENSE_KEY_OFFSET] = UNIT_ATTENTION; + /* + * Make sure request data length is enough for additional + * sense data. + */ + if (cmd->data_length <= 18) { + buf[7] = 0x00; + return 0; + } + /* + * The Additional Sense Code (ASC) from the UNIT ATTENTION + */ + buf[SPC_ASC_KEY_OFFSET] = ua_asc; + buf[SPC_ASCQ_KEY_OFFSET] = ua_ascq; + buf[7] = 0x0A; + } else { + /* + * CURRENT ERROR, NO SENSE + */ + buf[0] = 0x70; + buf[SPC_SENSE_KEY_OFFSET] = NO_SENSE; + /* + * Make sure request data length is enough for additional + * sense data. + */ + if (cmd->data_length <= 18) { + buf[7] = 0x00; + return 0; + } + /* + * NO ADDITIONAL SENSE INFORMATION + */ + buf[SPC_ASC_KEY_OFFSET] = 0x00; + buf[7] = 0x0A; + } + + return 0; +} + +/* + * Used for TCM/IBLOCK and TCM/FILEIO for block/blk-lib.c level discard support. + * Note this is not used for TCM/pSCSI passthrough + */ +static int +target_emulate_unmap(struct se_task *task) +{ + struct se_cmd *cmd = TASK_CMD(task); + struct se_device *dev = SE_DEV(cmd); + unsigned char *buf = cmd->t_task->t_task_buf, *ptr = NULL; + unsigned char *cdb = &cmd->t_task->t_task_cdb[0]; + sector_t lba; + unsigned int size = cmd->data_length, range; + int ret, offset; + unsigned short dl, bd_dl; + + /* First UNMAP block descriptor starts at 8 byte offset */ + offset = 8; + size -= 8; + dl = get_unaligned_be16(&cdb[0]); + bd_dl = get_unaligned_be16(&cdb[2]); + ptr = &buf[offset]; + printk(KERN_INFO "UNMAP: Sub: %s Using dl: %hu bd_dl: %hu size: %hu" + " ptr: %p\n", dev->transport->name, dl, bd_dl, size, ptr); + + while (size) { + lba = get_unaligned_be64(&ptr[0]); + range = get_unaligned_be32(&ptr[8]); + printk(KERN_INFO "UNMAP: Using lba: %llu and range: %u\n", + (unsigned long long)lba, range); + + ret = dev->transport->do_discard(dev, lba, range); + if (ret < 0) { + printk(KERN_ERR "blkdev_issue_discard() failed: %d\n", + ret); + return -1; + } + + ptr += 16; + size -= 16; + } + + task->task_scsi_status = GOOD; + transport_complete_task(task, 1); + return 0; +} + +/* + * Used for TCM/IBLOCK and TCM/FILEIO for block/blk-lib.c level discard support. + * Note this is not used for TCM/pSCSI passthrough + */ +static int +target_emulate_write_same(struct se_task *task) +{ + struct se_cmd *cmd = TASK_CMD(task); + struct se_device *dev = SE_DEV(cmd); + sector_t lba = cmd->t_task->t_task_lba; + unsigned int range; + int ret; + + range = (cmd->data_length / DEV_ATTRIB(dev)->block_size); + + printk(KERN_INFO "WRITE_SAME UNMAP: LBA: %llu Range: %u\n", + (unsigned long long)lba, range); + + ret = dev->transport->do_discard(dev, lba, range); + if (ret < 0) { + printk(KERN_INFO "blkdev_issue_discard() failed for WRITE_SAME\n"); + return -1; + } + + task->task_scsi_status = GOOD; + transport_complete_task(task, 1); + return 0; +} + +int +transport_emulate_control_cdb(struct se_task *task) +{ + struct se_cmd *cmd = TASK_CMD(task); + struct se_device *dev = SE_DEV(cmd); + unsigned short service_action; + int ret = 0; + + switch (cmd->t_task->t_task_cdb[0]) { + case INQUIRY: + ret = target_emulate_inquiry(cmd); + break; + case READ_CAPACITY: + ret = target_emulate_readcapacity(cmd); + break; + case MODE_SENSE: + ret = target_emulate_modesense(cmd, 0); + break; + case MODE_SENSE_10: + ret = target_emulate_modesense(cmd, 1); + break; + case SERVICE_ACTION_IN: + switch (cmd->t_task->t_task_cdb[1] & 0x1f) { + case SAI_READ_CAPACITY_16: + ret = target_emulate_readcapacity_16(cmd); + break; + default: + printk(KERN_ERR "Unsupported SA: 0x%02x\n", + cmd->t_task->t_task_cdb[1] & 0x1f); + return PYX_TRANSPORT_UNKNOWN_SAM_OPCODE; + } + break; + case REQUEST_SENSE: + ret = target_emulate_request_sense(cmd); + break; + case UNMAP: + if (!dev->transport->do_discard) { + printk(KERN_ERR "UNMAP emulation not supported for: %s\n", + dev->transport->name); + return PYX_TRANSPORT_UNKNOWN_SAM_OPCODE; + } + ret = target_emulate_unmap(task); + break; + case WRITE_SAME_16: + if (!dev->transport->do_discard) { + printk(KERN_ERR "WRITE_SAME_16 emulation not supported" + " for: %s\n", dev->transport->name); + return PYX_TRANSPORT_UNKNOWN_SAM_OPCODE; + } + ret = target_emulate_write_same(task); + break; + case VARIABLE_LENGTH_CMD: + service_action = + get_unaligned_be16(&cmd->t_task->t_task_cdb[8]); + switch (service_action) { + case WRITE_SAME_32: + if (!dev->transport->do_discard) { + printk(KERN_ERR "WRITE_SAME_32 SA emulation not" + " supported for: %s\n", + dev->transport->name); + return PYX_TRANSPORT_UNKNOWN_SAM_OPCODE; + } + ret = target_emulate_write_same(task); + break; + default: + printk(KERN_ERR "Unsupported VARIABLE_LENGTH_CMD SA:" + " 0x%02x\n", service_action); + break; + } + break; + case SYNCHRONIZE_CACHE: + case 0x91: /* SYNCHRONIZE_CACHE_16: */ + if (!dev->transport->do_sync_cache) { + printk(KERN_ERR + "SYNCHRONIZE_CACHE emulation not supported" + " for: %s\n", dev->transport->name); + return PYX_TRANSPORT_UNKNOWN_SAM_OPCODE; + } + dev->transport->do_sync_cache(task); + break; + case ALLOW_MEDIUM_REMOVAL: + case ERASE: + case REZERO_UNIT: + case SEEK_10: + case SPACE: + case START_STOP: + case TEST_UNIT_READY: + case VERIFY: + case WRITE_FILEMARKS: + break; + default: + printk(KERN_ERR "Unsupported SCSI Opcode: 0x%02x for %s\n", + cmd->t_task->t_task_cdb[0], dev->transport->name); + return PYX_TRANSPORT_UNKNOWN_SAM_OPCODE; + } + + if (ret < 0) + return ret; + task->task_scsi_status = GOOD; + transport_complete_task(task, 1); + + return PYX_TRANSPORT_SENT_TO_TRANSPORT; +} diff --git a/drivers/target/target_core_configfs.c b/drivers/target/target_core_configfs.c new file mode 100644 index 000000000000..2764510798b0 --- /dev/null +++ b/drivers/target/target_core_configfs.c @@ -0,0 +1,3225 @@ +/******************************************************************************* + * Filename: target_core_configfs.c + * + * This file contains ConfigFS logic for the Generic Target Engine project. + * + * Copyright (c) 2008-2010 Rising Tide Systems + * Copyright (c) 2008-2010 Linux-iSCSI.org + * + * Nicholas A. Bellinger + * + * based on configfs Copyright (C) 2005 Oracle. All rights reserved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + ****************************************************************************/ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include + +#include "target_core_alua.h" +#include "target_core_hba.h" +#include "target_core_pr.h" +#include "target_core_rd.h" + +static struct list_head g_tf_list; +static struct mutex g_tf_lock; + +struct target_core_configfs_attribute { + struct configfs_attribute attr; + ssize_t (*show)(void *, char *); + ssize_t (*store)(void *, const char *, size_t); +}; + +static inline struct se_hba * +item_to_hba(struct config_item *item) +{ + return container_of(to_config_group(item), struct se_hba, hba_group); +} + +/* + * Attributes for /sys/kernel/config/target/ + */ +static ssize_t target_core_attr_show(struct config_item *item, + struct configfs_attribute *attr, + char *page) +{ + return sprintf(page, "Target Engine Core ConfigFS Infrastructure %s" + " on %s/%s on "UTS_RELEASE"\n", TARGET_CORE_CONFIGFS_VERSION, + utsname()->sysname, utsname()->machine); +} + +static struct configfs_item_operations target_core_fabric_item_ops = { + .show_attribute = target_core_attr_show, +}; + +static struct configfs_attribute target_core_item_attr_version = { + .ca_owner = THIS_MODULE, + .ca_name = "version", + .ca_mode = S_IRUGO, +}; + +static struct target_fabric_configfs *target_core_get_fabric( + const char *name) +{ + struct target_fabric_configfs *tf; + + if (!(name)) + return NULL; + + mutex_lock(&g_tf_lock); + list_for_each_entry(tf, &g_tf_list, tf_list) { + if (!(strcmp(tf->tf_name, name))) { + atomic_inc(&tf->tf_access_cnt); + mutex_unlock(&g_tf_lock); + return tf; + } + } + mutex_unlock(&g_tf_lock); + + return NULL; +} + +/* + * Called from struct target_core_group_ops->make_group() + */ +static struct config_group *target_core_register_fabric( + struct config_group *group, + const char *name) +{ + struct target_fabric_configfs *tf; + int ret; + + printk(KERN_INFO "Target_Core_ConfigFS: REGISTER -> group: %p name:" + " %s\n", group, name); + /* + * Ensure that TCM subsystem plugins are loaded at this point for + * using the RAMDISK_DR virtual LUN 0 and all other struct se_port + * LUN symlinks. + */ + if (transport_subsystem_check_init() < 0) + return ERR_PTR(-EINVAL); + + /* + * Below are some hardcoded request_module() calls to automatically + * local fabric modules when the following is called: + * + * mkdir -p /sys/kernel/config/target/$MODULE_NAME + * + * Note that this does not limit which TCM fabric module can be + * registered, but simply provids auto loading logic for modules with + * mkdir(2) system calls with known TCM fabric modules. + */ + if (!(strncmp(name, "iscsi", 5))) { + /* + * Automatically load the LIO Target fabric module when the + * following is called: + * + * mkdir -p $CONFIGFS/target/iscsi + */ + ret = request_module("iscsi_target_mod"); + if (ret < 0) { + printk(KERN_ERR "request_module() failed for" + " iscsi_target_mod.ko: %d\n", ret); + return ERR_PTR(-EINVAL); + } + } else if (!(strncmp(name, "loopback", 8))) { + /* + * Automatically load the tcm_loop fabric module when the + * following is called: + * + * mkdir -p $CONFIGFS/target/loopback + */ + ret = request_module("tcm_loop"); + if (ret < 0) { + printk(KERN_ERR "request_module() failed for" + " tcm_loop.ko: %d\n", ret); + return ERR_PTR(-EINVAL); + } + } + + tf = target_core_get_fabric(name); + if (!(tf)) { + printk(KERN_ERR "target_core_get_fabric() failed for %s\n", + name); + return ERR_PTR(-EINVAL); + } + printk(KERN_INFO "Target_Core_ConfigFS: REGISTER -> Located fabric:" + " %s\n", tf->tf_name); + /* + * On a successful target_core_get_fabric() look, the returned + * struct target_fabric_configfs *tf will contain a usage reference. + */ + printk(KERN_INFO "Target_Core_ConfigFS: REGISTER tfc_wwn_cit -> %p\n", + &TF_CIT_TMPL(tf)->tfc_wwn_cit); + + tf->tf_group.default_groups = tf->tf_default_groups; + tf->tf_group.default_groups[0] = &tf->tf_disc_group; + tf->tf_group.default_groups[1] = NULL; + + config_group_init_type_name(&tf->tf_group, name, + &TF_CIT_TMPL(tf)->tfc_wwn_cit); + config_group_init_type_name(&tf->tf_disc_group, "discovery_auth", + &TF_CIT_TMPL(tf)->tfc_discovery_cit); + + printk(KERN_INFO "Target_Core_ConfigFS: REGISTER -> Allocated Fabric:" + " %s\n", tf->tf_group.cg_item.ci_name); + /* + * Setup tf_ops.tf_subsys pointer for usage with configfs_depend_item() + */ + tf->tf_ops.tf_subsys = tf->tf_subsys; + tf->tf_fabric = &tf->tf_group.cg_item; + printk(KERN_INFO "Target_Core_ConfigFS: REGISTER -> Set tf->tf_fabric" + " for %s\n", name); + + return &tf->tf_group; +} + +/* + * Called from struct target_core_group_ops->drop_item() + */ +static void target_core_deregister_fabric( + struct config_group *group, + struct config_item *item) +{ + struct target_fabric_configfs *tf = container_of( + to_config_group(item), struct target_fabric_configfs, tf_group); + struct config_group *tf_group; + struct config_item *df_item; + int i; + + printk(KERN_INFO "Target_Core_ConfigFS: DEREGISTER -> Looking up %s in" + " tf list\n", config_item_name(item)); + + printk(KERN_INFO "Target_Core_ConfigFS: DEREGISTER -> located fabric:" + " %s\n", tf->tf_name); + atomic_dec(&tf->tf_access_cnt); + + printk(KERN_INFO "Target_Core_ConfigFS: DEREGISTER -> Releasing" + " tf->tf_fabric for %s\n", tf->tf_name); + tf->tf_fabric = NULL; + + printk(KERN_INFO "Target_Core_ConfigFS: DEREGISTER -> Releasing ci" + " %s\n", config_item_name(item)); + + tf_group = &tf->tf_group; + for (i = 0; tf_group->default_groups[i]; i++) { + df_item = &tf_group->default_groups[i]->cg_item; + tf_group->default_groups[i] = NULL; + config_item_put(df_item); + } + config_item_put(item); +} + +static struct configfs_group_operations target_core_fabric_group_ops = { + .make_group = &target_core_register_fabric, + .drop_item = &target_core_deregister_fabric, +}; + +/* + * All item attributes appearing in /sys/kernel/target/ appear here. + */ +static struct configfs_attribute *target_core_fabric_item_attrs[] = { + &target_core_item_attr_version, + NULL, +}; + +/* + * Provides Fabrics Groups and Item Attributes for /sys/kernel/config/target/ + */ +static struct config_item_type target_core_fabrics_item = { + .ct_item_ops = &target_core_fabric_item_ops, + .ct_group_ops = &target_core_fabric_group_ops, + .ct_attrs = target_core_fabric_item_attrs, + .ct_owner = THIS_MODULE, +}; + +static struct configfs_subsystem target_core_fabrics = { + .su_group = { + .cg_item = { + .ci_namebuf = "target", + .ci_type = &target_core_fabrics_item, + }, + }, +}; + +static struct configfs_subsystem *target_core_subsystem[] = { + &target_core_fabrics, + NULL, +}; + +/*############################################################################## +// Start functions called by external Target Fabrics Modules +//############################################################################*/ + +/* + * First function called by fabric modules to: + * + * 1) Allocate a struct target_fabric_configfs and save the *fabric_cit pointer. + * 2) Add struct target_fabric_configfs to g_tf_list + * 3) Return struct target_fabric_configfs to fabric module to be passed + * into target_fabric_configfs_register(). + */ +struct target_fabric_configfs *target_fabric_configfs_init( + struct module *fabric_mod, + const char *name) +{ + struct target_fabric_configfs *tf; + + if (!(fabric_mod)) { + printk(KERN_ERR "Missing struct module *fabric_mod pointer\n"); + return NULL; + } + if (!(name)) { + printk(KERN_ERR "Unable to locate passed fabric name\n"); + return NULL; + } + if (strlen(name) > TARGET_FABRIC_NAME_SIZE) { + printk(KERN_ERR "Passed name: %s exceeds TARGET_FABRIC" + "_NAME_SIZE\n", name); + return NULL; + } + + tf = kzalloc(sizeof(struct target_fabric_configfs), GFP_KERNEL); + if (!(tf)) + return ERR_PTR(-ENOMEM); + + INIT_LIST_HEAD(&tf->tf_list); + atomic_set(&tf->tf_access_cnt, 0); + /* + * Setup the default generic struct config_item_type's (cits) in + * struct target_fabric_configfs->tf_cit_tmpl + */ + tf->tf_module = fabric_mod; + target_fabric_setup_cits(tf); + + tf->tf_subsys = target_core_subsystem[0]; + snprintf(tf->tf_name, TARGET_FABRIC_NAME_SIZE, "%s", name); + + mutex_lock(&g_tf_lock); + list_add_tail(&tf->tf_list, &g_tf_list); + mutex_unlock(&g_tf_lock); + + printk(KERN_INFO "<<<<<<<<<<<<<<<<<<<<<< BEGIN FABRIC API >>>>>>>>" + ">>>>>>>>>>>>>>\n"); + printk(KERN_INFO "Initialized struct target_fabric_configfs: %p for" + " %s\n", tf, tf->tf_name); + return tf; +} +EXPORT_SYMBOL(target_fabric_configfs_init); + +/* + * Called by fabric plugins after FAILED target_fabric_configfs_register() call. + */ +void target_fabric_configfs_free( + struct target_fabric_configfs *tf) +{ + mutex_lock(&g_tf_lock); + list_del(&tf->tf_list); + mutex_unlock(&g_tf_lock); + + kfree(tf); +} +EXPORT_SYMBOL(target_fabric_configfs_free); + +/* + * Perform a sanity check of the passed tf->tf_ops before completing + * TCM fabric module registration. + */ +static int target_fabric_tf_ops_check( + struct target_fabric_configfs *tf) +{ + struct target_core_fabric_ops *tfo = &tf->tf_ops; + + if (!(tfo->get_fabric_name)) { + printk(KERN_ERR "Missing tfo->get_fabric_name()\n"); + return -EINVAL; + } + if (!(tfo->get_fabric_proto_ident)) { + printk(KERN_ERR "Missing tfo->get_fabric_proto_ident()\n"); + return -EINVAL; + } + if (!(tfo->tpg_get_wwn)) { + printk(KERN_ERR "Missing tfo->tpg_get_wwn()\n"); + return -EINVAL; + } + if (!(tfo->tpg_get_tag)) { + printk(KERN_ERR "Missing tfo->tpg_get_tag()\n"); + return -EINVAL; + } + if (!(tfo->tpg_get_default_depth)) { + printk(KERN_ERR "Missing tfo->tpg_get_default_depth()\n"); + return -EINVAL; + } + if (!(tfo->tpg_get_pr_transport_id)) { + printk(KERN_ERR "Missing tfo->tpg_get_pr_transport_id()\n"); + return -EINVAL; + } + if (!(tfo->tpg_get_pr_transport_id_len)) { + printk(KERN_ERR "Missing tfo->tpg_get_pr_transport_id_len()\n"); + return -EINVAL; + } + if (!(tfo->tpg_check_demo_mode)) { + printk(KERN_ERR "Missing tfo->tpg_check_demo_mode()\n"); + return -EINVAL; + } + if (!(tfo->tpg_check_demo_mode_cache)) { + printk(KERN_ERR "Missing tfo->tpg_check_demo_mode_cache()\n"); + return -EINVAL; + } + if (!(tfo->tpg_check_demo_mode_write_protect)) { + printk(KERN_ERR "Missing tfo->tpg_check_demo_mode_write_protect()\n"); + return -EINVAL; + } + if (!(tfo->tpg_check_prod_mode_write_protect)) { + printk(KERN_ERR "Missing tfo->tpg_check_prod_mode_write_protect()\n"); + return -EINVAL; + } + if (!(tfo->tpg_alloc_fabric_acl)) { + printk(KERN_ERR "Missing tfo->tpg_alloc_fabric_acl()\n"); + return -EINVAL; + } + if (!(tfo->tpg_release_fabric_acl)) { + printk(KERN_ERR "Missing tfo->tpg_release_fabric_acl()\n"); + return -EINVAL; + } + if (!(tfo->tpg_get_inst_index)) { + printk(KERN_ERR "Missing tfo->tpg_get_inst_index()\n"); + return -EINVAL; + } + if (!(tfo->release_cmd_to_pool)) { + printk(KERN_ERR "Missing tfo->release_cmd_to_pool()\n"); + return -EINVAL; + } + if (!(tfo->release_cmd_direct)) { + printk(KERN_ERR "Missing tfo->release_cmd_direct()\n"); + return -EINVAL; + } + if (!(tfo->shutdown_session)) { + printk(KERN_ERR "Missing tfo->shutdown_session()\n"); + return -EINVAL; + } + if (!(tfo->close_session)) { + printk(KERN_ERR "Missing tfo->close_session()\n"); + return -EINVAL; + } + if (!(tfo->stop_session)) { + printk(KERN_ERR "Missing tfo->stop_session()\n"); + return -EINVAL; + } + if (!(tfo->fall_back_to_erl0)) { + printk(KERN_ERR "Missing tfo->fall_back_to_erl0()\n"); + return -EINVAL; + } + if (!(tfo->sess_logged_in)) { + printk(KERN_ERR "Missing tfo->sess_logged_in()\n"); + return -EINVAL; + } + if (!(tfo->sess_get_index)) { + printk(KERN_ERR "Missing tfo->sess_get_index()\n"); + return -EINVAL; + } + if (!(tfo->write_pending)) { + printk(KERN_ERR "Missing tfo->write_pending()\n"); + return -EINVAL; + } + if (!(tfo->write_pending_status)) { + printk(KERN_ERR "Missing tfo->write_pending_status()\n"); + return -EINVAL; + } + if (!(tfo->set_default_node_attributes)) { + printk(KERN_ERR "Missing tfo->set_default_node_attributes()\n"); + return -EINVAL; + } + if (!(tfo->get_task_tag)) { + printk(KERN_ERR "Missing tfo->get_task_tag()\n"); + return -EINVAL; + } + if (!(tfo->get_cmd_state)) { + printk(KERN_ERR "Missing tfo->get_cmd_state()\n"); + return -EINVAL; + } + if (!(tfo->new_cmd_failure)) { + printk(KERN_ERR "Missing tfo->new_cmd_failure()\n"); + return -EINVAL; + } + if (!(tfo->queue_data_in)) { + printk(KERN_ERR "Missing tfo->queue_data_in()\n"); + return -EINVAL; + } + if (!(tfo->queue_status)) { + printk(KERN_ERR "Missing tfo->queue_status()\n"); + return -EINVAL; + } + if (!(tfo->queue_tm_rsp)) { + printk(KERN_ERR "Missing tfo->queue_tm_rsp()\n"); + return -EINVAL; + } + if (!(tfo->set_fabric_sense_len)) { + printk(KERN_ERR "Missing tfo->set_fabric_sense_len()\n"); + return -EINVAL; + } + if (!(tfo->get_fabric_sense_len)) { + printk(KERN_ERR "Missing tfo->get_fabric_sense_len()\n"); + return -EINVAL; + } + if (!(tfo->is_state_remove)) { + printk(KERN_ERR "Missing tfo->is_state_remove()\n"); + return -EINVAL; + } + if (!(tfo->pack_lun)) { + printk(KERN_ERR "Missing tfo->pack_lun()\n"); + return -EINVAL; + } + /* + * We at least require tfo->fabric_make_wwn(), tfo->fabric_drop_wwn() + * tfo->fabric_make_tpg() and tfo->fabric_drop_tpg() in + * target_core_fabric_configfs.c WWN+TPG group context code. + */ + if (!(tfo->fabric_make_wwn)) { + printk(KERN_ERR "Missing tfo->fabric_make_wwn()\n"); + return -EINVAL; + } + if (!(tfo->fabric_drop_wwn)) { + printk(KERN_ERR "Missing tfo->fabric_drop_wwn()\n"); + return -EINVAL; + } + if (!(tfo->fabric_make_tpg)) { + printk(KERN_ERR "Missing tfo->fabric_make_tpg()\n"); + return -EINVAL; + } + if (!(tfo->fabric_drop_tpg)) { + printk(KERN_ERR "Missing tfo->fabric_drop_tpg()\n"); + return -EINVAL; + } + + return 0; +} + +/* + * Called 2nd from fabric module with returned parameter of + * struct target_fabric_configfs * from target_fabric_configfs_init(). + * + * Upon a successful registration, the new fabric's struct config_item is + * return. Also, a pointer to this struct is set in the passed + * struct target_fabric_configfs. + */ +int target_fabric_configfs_register( + struct target_fabric_configfs *tf) +{ + struct config_group *su_group; + int ret; + + if (!(tf)) { + printk(KERN_ERR "Unable to locate target_fabric_configfs" + " pointer\n"); + return -EINVAL; + } + if (!(tf->tf_subsys)) { + printk(KERN_ERR "Unable to target struct config_subsystem" + " pointer\n"); + return -EINVAL; + } + su_group = &tf->tf_subsys->su_group; + if (!(su_group)) { + printk(KERN_ERR "Unable to locate target struct config_group" + " pointer\n"); + return -EINVAL; + } + ret = target_fabric_tf_ops_check(tf); + if (ret < 0) + return ret; + + printk(KERN_INFO "<<<<<<<<<<<<<<<<<<<<<< END FABRIC API >>>>>>>>>>>>" + ">>>>>>>>>>\n"); + return 0; +} +EXPORT_SYMBOL(target_fabric_configfs_register); + +void target_fabric_configfs_deregister( + struct target_fabric_configfs *tf) +{ + struct config_group *su_group; + struct configfs_subsystem *su; + + if (!(tf)) { + printk(KERN_ERR "Unable to locate passed target_fabric_" + "configfs\n"); + return; + } + su = tf->tf_subsys; + if (!(su)) { + printk(KERN_ERR "Unable to locate passed tf->tf_subsys" + " pointer\n"); + return; + } + su_group = &tf->tf_subsys->su_group; + if (!(su_group)) { + printk(KERN_ERR "Unable to locate target struct config_group" + " pointer\n"); + return; + } + + printk(KERN_INFO "<<<<<<<<<<<<<<<<<<<<<< BEGIN FABRIC API >>>>>>>>>>" + ">>>>>>>>>>>>\n"); + mutex_lock(&g_tf_lock); + if (atomic_read(&tf->tf_access_cnt)) { + mutex_unlock(&g_tf_lock); + printk(KERN_ERR "Non zero tf->tf_access_cnt for fabric %s\n", + tf->tf_name); + BUG(); + } + list_del(&tf->tf_list); + mutex_unlock(&g_tf_lock); + + printk(KERN_INFO "Target_Core_ConfigFS: DEREGISTER -> Releasing tf:" + " %s\n", tf->tf_name); + tf->tf_module = NULL; + tf->tf_subsys = NULL; + kfree(tf); + + printk("<<<<<<<<<<<<<<<<<<<<<< END FABRIC API >>>>>>>>>>>>>>>>>" + ">>>>>\n"); + return; +} +EXPORT_SYMBOL(target_fabric_configfs_deregister); + +/*############################################################################## +// Stop functions called by external Target Fabrics Modules +//############################################################################*/ + +/* Start functions for struct config_item_type target_core_dev_attrib_cit */ + +#define DEF_DEV_ATTRIB_SHOW(_name) \ +static ssize_t target_core_dev_show_attr_##_name( \ + struct se_dev_attrib *da, \ + char *page) \ +{ \ + struct se_device *dev; \ + struct se_subsystem_dev *se_dev = da->da_sub_dev; \ + ssize_t rb; \ + \ + spin_lock(&se_dev->se_dev_lock); \ + dev = se_dev->se_dev_ptr; \ + if (!(dev)) { \ + spin_unlock(&se_dev->se_dev_lock); \ + return -ENODEV; \ + } \ + rb = snprintf(page, PAGE_SIZE, "%u\n", (u32)DEV_ATTRIB(dev)->_name); \ + spin_unlock(&se_dev->se_dev_lock); \ + \ + return rb; \ +} + +#define DEF_DEV_ATTRIB_STORE(_name) \ +static ssize_t target_core_dev_store_attr_##_name( \ + struct se_dev_attrib *da, \ + const char *page, \ + size_t count) \ +{ \ + struct se_device *dev; \ + struct se_subsystem_dev *se_dev = da->da_sub_dev; \ + unsigned long val; \ + int ret; \ + \ + spin_lock(&se_dev->se_dev_lock); \ + dev = se_dev->se_dev_ptr; \ + if (!(dev)) { \ + spin_unlock(&se_dev->se_dev_lock); \ + return -ENODEV; \ + } \ + ret = strict_strtoul(page, 0, &val); \ + if (ret < 0) { \ + spin_unlock(&se_dev->se_dev_lock); \ + printk(KERN_ERR "strict_strtoul() failed with" \ + " ret: %d\n", ret); \ + return -EINVAL; \ + } \ + ret = se_dev_set_##_name(dev, (u32)val); \ + spin_unlock(&se_dev->se_dev_lock); \ + \ + return (!ret) ? count : -EINVAL; \ +} + +#define DEF_DEV_ATTRIB(_name) \ +DEF_DEV_ATTRIB_SHOW(_name); \ +DEF_DEV_ATTRIB_STORE(_name); + +#define DEF_DEV_ATTRIB_RO(_name) \ +DEF_DEV_ATTRIB_SHOW(_name); + +CONFIGFS_EATTR_STRUCT(target_core_dev_attrib, se_dev_attrib); +#define SE_DEV_ATTR(_name, _mode) \ +static struct target_core_dev_attrib_attribute \ + target_core_dev_attrib_##_name = \ + __CONFIGFS_EATTR(_name, _mode, \ + target_core_dev_show_attr_##_name, \ + target_core_dev_store_attr_##_name); + +#define SE_DEV_ATTR_RO(_name); \ +static struct target_core_dev_attrib_attribute \ + target_core_dev_attrib_##_name = \ + __CONFIGFS_EATTR_RO(_name, \ + target_core_dev_show_attr_##_name); + +DEF_DEV_ATTRIB(emulate_dpo); +SE_DEV_ATTR(emulate_dpo, S_IRUGO | S_IWUSR); + +DEF_DEV_ATTRIB(emulate_fua_write); +SE_DEV_ATTR(emulate_fua_write, S_IRUGO | S_IWUSR); + +DEF_DEV_ATTRIB(emulate_fua_read); +SE_DEV_ATTR(emulate_fua_read, S_IRUGO | S_IWUSR); + +DEF_DEV_ATTRIB(emulate_write_cache); +SE_DEV_ATTR(emulate_write_cache, S_IRUGO | S_IWUSR); + +DEF_DEV_ATTRIB(emulate_ua_intlck_ctrl); +SE_DEV_ATTR(emulate_ua_intlck_ctrl, S_IRUGO | S_IWUSR); + +DEF_DEV_ATTRIB(emulate_tas); +SE_DEV_ATTR(emulate_tas, S_IRUGO | S_IWUSR); + +DEF_DEV_ATTRIB(emulate_tpu); +SE_DEV_ATTR(emulate_tpu, S_IRUGO | S_IWUSR); + +DEF_DEV_ATTRIB(emulate_tpws); +SE_DEV_ATTR(emulate_tpws, S_IRUGO | S_IWUSR); + +DEF_DEV_ATTRIB(enforce_pr_isids); +SE_DEV_ATTR(enforce_pr_isids, S_IRUGO | S_IWUSR); + +DEF_DEV_ATTRIB_RO(hw_block_size); +SE_DEV_ATTR_RO(hw_block_size); + +DEF_DEV_ATTRIB(block_size); +SE_DEV_ATTR(block_size, S_IRUGO | S_IWUSR); + +DEF_DEV_ATTRIB_RO(hw_max_sectors); +SE_DEV_ATTR_RO(hw_max_sectors); + +DEF_DEV_ATTRIB(max_sectors); +SE_DEV_ATTR(max_sectors, S_IRUGO | S_IWUSR); + +DEF_DEV_ATTRIB(optimal_sectors); +SE_DEV_ATTR(optimal_sectors, S_IRUGO | S_IWUSR); + +DEF_DEV_ATTRIB_RO(hw_queue_depth); +SE_DEV_ATTR_RO(hw_queue_depth); + +DEF_DEV_ATTRIB(queue_depth); +SE_DEV_ATTR(queue_depth, S_IRUGO | S_IWUSR); + +DEF_DEV_ATTRIB(task_timeout); +SE_DEV_ATTR(task_timeout, S_IRUGO | S_IWUSR); + +DEF_DEV_ATTRIB(max_unmap_lba_count); +SE_DEV_ATTR(max_unmap_lba_count, S_IRUGO | S_IWUSR); + +DEF_DEV_ATTRIB(max_unmap_block_desc_count); +SE_DEV_ATTR(max_unmap_block_desc_count, S_IRUGO | S_IWUSR); + +DEF_DEV_ATTRIB(unmap_granularity); +SE_DEV_ATTR(unmap_granularity, S_IRUGO | S_IWUSR); + +DEF_DEV_ATTRIB(unmap_granularity_alignment); +SE_DEV_ATTR(unmap_granularity_alignment, S_IRUGO | S_IWUSR); + +CONFIGFS_EATTR_OPS(target_core_dev_attrib, se_dev_attrib, da_group); + +static struct configfs_attribute *target_core_dev_attrib_attrs[] = { + &target_core_dev_attrib_emulate_dpo.attr, + &target_core_dev_attrib_emulate_fua_write.attr, + &target_core_dev_attrib_emulate_fua_read.attr, + &target_core_dev_attrib_emulate_write_cache.attr, + &target_core_dev_attrib_emulate_ua_intlck_ctrl.attr, + &target_core_dev_attrib_emulate_tas.attr, + &target_core_dev_attrib_emulate_tpu.attr, + &target_core_dev_attrib_emulate_tpws.attr, + &target_core_dev_attrib_enforce_pr_isids.attr, + &target_core_dev_attrib_hw_block_size.attr, + &target_core_dev_attrib_block_size.attr, + &target_core_dev_attrib_hw_max_sectors.attr, + &target_core_dev_attrib_max_sectors.attr, + &target_core_dev_attrib_optimal_sectors.attr, + &target_core_dev_attrib_hw_queue_depth.attr, + &target_core_dev_attrib_queue_depth.attr, + &target_core_dev_attrib_task_timeout.attr, + &target_core_dev_attrib_max_unmap_lba_count.attr, + &target_core_dev_attrib_max_unmap_block_desc_count.attr, + &target_core_dev_attrib_unmap_granularity.attr, + &target_core_dev_attrib_unmap_granularity_alignment.attr, + NULL, +}; + +static struct configfs_item_operations target_core_dev_attrib_ops = { + .show_attribute = target_core_dev_attrib_attr_show, + .store_attribute = target_core_dev_attrib_attr_store, +}; + +static struct config_item_type target_core_dev_attrib_cit = { + .ct_item_ops = &target_core_dev_attrib_ops, + .ct_attrs = target_core_dev_attrib_attrs, + .ct_owner = THIS_MODULE, +}; + +/* End functions for struct config_item_type target_core_dev_attrib_cit */ + +/* Start functions for struct config_item_type target_core_dev_wwn_cit */ + +CONFIGFS_EATTR_STRUCT(target_core_dev_wwn, t10_wwn); +#define SE_DEV_WWN_ATTR(_name, _mode) \ +static struct target_core_dev_wwn_attribute target_core_dev_wwn_##_name = \ + __CONFIGFS_EATTR(_name, _mode, \ + target_core_dev_wwn_show_attr_##_name, \ + target_core_dev_wwn_store_attr_##_name); + +#define SE_DEV_WWN_ATTR_RO(_name); \ +do { \ + static struct target_core_dev_wwn_attribute \ + target_core_dev_wwn_##_name = \ + __CONFIGFS_EATTR_RO(_name, \ + target_core_dev_wwn_show_attr_##_name); \ +} while (0); + +/* + * VPD page 0x80 Unit serial + */ +static ssize_t target_core_dev_wwn_show_attr_vpd_unit_serial( + struct t10_wwn *t10_wwn, + char *page) +{ + struct se_subsystem_dev *se_dev = t10_wwn->t10_sub_dev; + struct se_device *dev; + + dev = se_dev->se_dev_ptr; + if (!(dev)) + return -ENODEV; + + return sprintf(page, "T10 VPD Unit Serial Number: %s\n", + &t10_wwn->unit_serial[0]); +} + +static ssize_t target_core_dev_wwn_store_attr_vpd_unit_serial( + struct t10_wwn *t10_wwn, + const char *page, + size_t count) +{ + struct se_subsystem_dev *su_dev = t10_wwn->t10_sub_dev; + struct se_device *dev; + unsigned char buf[INQUIRY_VPD_SERIAL_LEN]; + + /* + * If Linux/SCSI subsystem_api_t plugin got a VPD Unit Serial + * from the struct scsi_device level firmware, do not allow + * VPD Unit Serial to be emulated. + * + * Note this struct scsi_device could also be emulating VPD + * information from its drivers/scsi LLD. But for now we assume + * it is doing 'the right thing' wrt a world wide unique + * VPD Unit Serial Number that OS dependent multipath can depend on. + */ + if (su_dev->su_dev_flags & SDF_FIRMWARE_VPD_UNIT_SERIAL) { + printk(KERN_ERR "Underlying SCSI device firmware provided VPD" + " Unit Serial, ignoring request\n"); + return -EOPNOTSUPP; + } + + if ((strlen(page) + 1) > INQUIRY_VPD_SERIAL_LEN) { + printk(KERN_ERR "Emulated VPD Unit Serial exceeds" + " INQUIRY_VPD_SERIAL_LEN: %d\n", INQUIRY_VPD_SERIAL_LEN); + return -EOVERFLOW; + } + /* + * Check to see if any active $FABRIC_MOD exports exist. If they + * do exist, fail here as changing this information on the fly + * (underneath the initiator side OS dependent multipath code) + * could cause negative effects. + */ + dev = su_dev->se_dev_ptr; + if ((dev)) { + if (atomic_read(&dev->dev_export_obj.obj_access_count)) { + printk(KERN_ERR "Unable to set VPD Unit Serial while" + " active %d $FABRIC_MOD exports exist\n", + atomic_read(&dev->dev_export_obj.obj_access_count)); + return -EINVAL; + } + } + /* + * This currently assumes ASCII encoding for emulated VPD Unit Serial. + * + * Also, strip any newline added from the userspace + * echo $UUID > $TARGET/$HBA/$STORAGE_OBJECT/wwn/vpd_unit_serial + */ + memset(buf, 0, INQUIRY_VPD_SERIAL_LEN); + snprintf(buf, INQUIRY_VPD_SERIAL_LEN, "%s", page); + snprintf(su_dev->t10_wwn.unit_serial, INQUIRY_VPD_SERIAL_LEN, + "%s", strstrip(buf)); + su_dev->su_dev_flags |= SDF_EMULATED_VPD_UNIT_SERIAL; + + printk(KERN_INFO "Target_Core_ConfigFS: Set emulated VPD Unit Serial:" + " %s\n", su_dev->t10_wwn.unit_serial); + + return count; +} + +SE_DEV_WWN_ATTR(vpd_unit_serial, S_IRUGO | S_IWUSR); + +/* + * VPD page 0x83 Protocol Identifier + */ +static ssize_t target_core_dev_wwn_show_attr_vpd_protocol_identifier( + struct t10_wwn *t10_wwn, + char *page) +{ + struct se_subsystem_dev *se_dev = t10_wwn->t10_sub_dev; + struct se_device *dev; + struct t10_vpd *vpd; + unsigned char buf[VPD_TMP_BUF_SIZE]; + ssize_t len = 0; + + dev = se_dev->se_dev_ptr; + if (!(dev)) + return -ENODEV; + + memset(buf, 0, VPD_TMP_BUF_SIZE); + + spin_lock(&t10_wwn->t10_vpd_lock); + list_for_each_entry(vpd, &t10_wwn->t10_vpd_list, vpd_list) { + if (!(vpd->protocol_identifier_set)) + continue; + + transport_dump_vpd_proto_id(vpd, buf, VPD_TMP_BUF_SIZE); + + if ((len + strlen(buf) > PAGE_SIZE)) + break; + + len += sprintf(page+len, "%s", buf); + } + spin_unlock(&t10_wwn->t10_vpd_lock); + + return len; +} + +static ssize_t target_core_dev_wwn_store_attr_vpd_protocol_identifier( + struct t10_wwn *t10_wwn, + const char *page, + size_t count) +{ + return -ENOSYS; +} + +SE_DEV_WWN_ATTR(vpd_protocol_identifier, S_IRUGO | S_IWUSR); + +/* + * Generic wrapper for dumping VPD identifiers by association. + */ +#define DEF_DEV_WWN_ASSOC_SHOW(_name, _assoc) \ +static ssize_t target_core_dev_wwn_show_attr_##_name( \ + struct t10_wwn *t10_wwn, \ + char *page) \ +{ \ + struct se_subsystem_dev *se_dev = t10_wwn->t10_sub_dev; \ + struct se_device *dev; \ + struct t10_vpd *vpd; \ + unsigned char buf[VPD_TMP_BUF_SIZE]; \ + ssize_t len = 0; \ + \ + dev = se_dev->se_dev_ptr; \ + if (!(dev)) \ + return -ENODEV; \ + \ + spin_lock(&t10_wwn->t10_vpd_lock); \ + list_for_each_entry(vpd, &t10_wwn->t10_vpd_list, vpd_list) { \ + if (vpd->association != _assoc) \ + continue; \ + \ + memset(buf, 0, VPD_TMP_BUF_SIZE); \ + transport_dump_vpd_assoc(vpd, buf, VPD_TMP_BUF_SIZE); \ + if ((len + strlen(buf) > PAGE_SIZE)) \ + break; \ + len += sprintf(page+len, "%s", buf); \ + \ + memset(buf, 0, VPD_TMP_BUF_SIZE); \ + transport_dump_vpd_ident_type(vpd, buf, VPD_TMP_BUF_SIZE); \ + if ((len + strlen(buf) > PAGE_SIZE)) \ + break; \ + len += sprintf(page+len, "%s", buf); \ + \ + memset(buf, 0, VPD_TMP_BUF_SIZE); \ + transport_dump_vpd_ident(vpd, buf, VPD_TMP_BUF_SIZE); \ + if ((len + strlen(buf) > PAGE_SIZE)) \ + break; \ + len += sprintf(page+len, "%s", buf); \ + } \ + spin_unlock(&t10_wwn->t10_vpd_lock); \ + \ + return len; \ +} + +/* + * VPD page 0x83 Assoication: Logical Unit + */ +DEF_DEV_WWN_ASSOC_SHOW(vpd_assoc_logical_unit, 0x00); + +static ssize_t target_core_dev_wwn_store_attr_vpd_assoc_logical_unit( + struct t10_wwn *t10_wwn, + const char *page, + size_t count) +{ + return -ENOSYS; +} + +SE_DEV_WWN_ATTR(vpd_assoc_logical_unit, S_IRUGO | S_IWUSR); + +/* + * VPD page 0x83 Association: Target Port + */ +DEF_DEV_WWN_ASSOC_SHOW(vpd_assoc_target_port, 0x10); + +static ssize_t target_core_dev_wwn_store_attr_vpd_assoc_target_port( + struct t10_wwn *t10_wwn, + const char *page, + size_t count) +{ + return -ENOSYS; +} + +SE_DEV_WWN_ATTR(vpd_assoc_target_port, S_IRUGO | S_IWUSR); + +/* + * VPD page 0x83 Association: SCSI Target Device + */ +DEF_DEV_WWN_ASSOC_SHOW(vpd_assoc_scsi_target_device, 0x20); + +static ssize_t target_core_dev_wwn_store_attr_vpd_assoc_scsi_target_device( + struct t10_wwn *t10_wwn, + const char *page, + size_t count) +{ + return -ENOSYS; +} + +SE_DEV_WWN_ATTR(vpd_assoc_scsi_target_device, S_IRUGO | S_IWUSR); + +CONFIGFS_EATTR_OPS(target_core_dev_wwn, t10_wwn, t10_wwn_group); + +static struct configfs_attribute *target_core_dev_wwn_attrs[] = { + &target_core_dev_wwn_vpd_unit_serial.attr, + &target_core_dev_wwn_vpd_protocol_identifier.attr, + &target_core_dev_wwn_vpd_assoc_logical_unit.attr, + &target_core_dev_wwn_vpd_assoc_target_port.attr, + &target_core_dev_wwn_vpd_assoc_scsi_target_device.attr, + NULL, +}; + +static struct configfs_item_operations target_core_dev_wwn_ops = { + .show_attribute = target_core_dev_wwn_attr_show, + .store_attribute = target_core_dev_wwn_attr_store, +}; + +static struct config_item_type target_core_dev_wwn_cit = { + .ct_item_ops = &target_core_dev_wwn_ops, + .ct_attrs = target_core_dev_wwn_attrs, + .ct_owner = THIS_MODULE, +}; + +/* End functions for struct config_item_type target_core_dev_wwn_cit */ + +/* Start functions for struct config_item_type target_core_dev_pr_cit */ + +CONFIGFS_EATTR_STRUCT(target_core_dev_pr, se_subsystem_dev); +#define SE_DEV_PR_ATTR(_name, _mode) \ +static struct target_core_dev_pr_attribute target_core_dev_pr_##_name = \ + __CONFIGFS_EATTR(_name, _mode, \ + target_core_dev_pr_show_attr_##_name, \ + target_core_dev_pr_store_attr_##_name); + +#define SE_DEV_PR_ATTR_RO(_name); \ +static struct target_core_dev_pr_attribute target_core_dev_pr_##_name = \ + __CONFIGFS_EATTR_RO(_name, \ + target_core_dev_pr_show_attr_##_name); + +/* + * res_holder + */ +static ssize_t target_core_dev_pr_show_spc3_res( + struct se_device *dev, + char *page, + ssize_t *len) +{ + struct se_node_acl *se_nacl; + struct t10_pr_registration *pr_reg; + char i_buf[PR_REG_ISID_ID_LEN]; + int prf_isid; + + memset(i_buf, 0, PR_REG_ISID_ID_LEN); + + spin_lock(&dev->dev_reservation_lock); + pr_reg = dev->dev_pr_res_holder; + if (!(pr_reg)) { + *len += sprintf(page + *len, "No SPC-3 Reservation holder\n"); + spin_unlock(&dev->dev_reservation_lock); + return *len; + } + se_nacl = pr_reg->pr_reg_nacl; + prf_isid = core_pr_dump_initiator_port(pr_reg, &i_buf[0], + PR_REG_ISID_ID_LEN); + + *len += sprintf(page + *len, "SPC-3 Reservation: %s Initiator: %s%s\n", + TPG_TFO(se_nacl->se_tpg)->get_fabric_name(), + se_nacl->initiatorname, (prf_isid) ? &i_buf[0] : ""); + spin_unlock(&dev->dev_reservation_lock); + + return *len; +} + +static ssize_t target_core_dev_pr_show_spc2_res( + struct se_device *dev, + char *page, + ssize_t *len) +{ + struct se_node_acl *se_nacl; + + spin_lock(&dev->dev_reservation_lock); + se_nacl = dev->dev_reserved_node_acl; + if (!(se_nacl)) { + *len += sprintf(page + *len, "No SPC-2 Reservation holder\n"); + spin_unlock(&dev->dev_reservation_lock); + return *len; + } + *len += sprintf(page + *len, "SPC-2 Reservation: %s Initiator: %s\n", + TPG_TFO(se_nacl->se_tpg)->get_fabric_name(), + se_nacl->initiatorname); + spin_unlock(&dev->dev_reservation_lock); + + return *len; +} + +static ssize_t target_core_dev_pr_show_attr_res_holder( + struct se_subsystem_dev *su_dev, + char *page) +{ + ssize_t len = 0; + + if (!(su_dev->se_dev_ptr)) + return -ENODEV; + + switch (T10_RES(su_dev)->res_type) { + case SPC3_PERSISTENT_RESERVATIONS: + target_core_dev_pr_show_spc3_res(su_dev->se_dev_ptr, + page, &len); + break; + case SPC2_RESERVATIONS: + target_core_dev_pr_show_spc2_res(su_dev->se_dev_ptr, + page, &len); + break; + case SPC_PASSTHROUGH: + len += sprintf(page+len, "Passthrough\n"); + break; + default: + len += sprintf(page+len, "Unknown\n"); + break; + } + + return len; +} + +SE_DEV_PR_ATTR_RO(res_holder); + +/* + * res_pr_all_tgt_pts + */ +static ssize_t target_core_dev_pr_show_attr_res_pr_all_tgt_pts( + struct se_subsystem_dev *su_dev, + char *page) +{ + struct se_device *dev; + struct t10_pr_registration *pr_reg; + ssize_t len = 0; + + dev = su_dev->se_dev_ptr; + if (!(dev)) + return -ENODEV; + + if (T10_RES(su_dev)->res_type != SPC3_PERSISTENT_RESERVATIONS) + return len; + + spin_lock(&dev->dev_reservation_lock); + pr_reg = dev->dev_pr_res_holder; + if (!(pr_reg)) { + len = sprintf(page, "No SPC-3 Reservation holder\n"); + spin_unlock(&dev->dev_reservation_lock); + return len; + } + /* + * See All Target Ports (ALL_TG_PT) bit in spcr17, section 6.14.3 + * Basic PERSISTENT RESERVER OUT parameter list, page 290 + */ + if (pr_reg->pr_reg_all_tg_pt) + len = sprintf(page, "SPC-3 Reservation: All Target" + " Ports registration\n"); + else + len = sprintf(page, "SPC-3 Reservation: Single" + " Target Port registration\n"); + spin_unlock(&dev->dev_reservation_lock); + + return len; +} + +SE_DEV_PR_ATTR_RO(res_pr_all_tgt_pts); + +/* + * res_pr_generation + */ +static ssize_t target_core_dev_pr_show_attr_res_pr_generation( + struct se_subsystem_dev *su_dev, + char *page) +{ + if (!(su_dev->se_dev_ptr)) + return -ENODEV; + + if (T10_RES(su_dev)->res_type != SPC3_PERSISTENT_RESERVATIONS) + return 0; + + return sprintf(page, "0x%08x\n", T10_RES(su_dev)->pr_generation); +} + +SE_DEV_PR_ATTR_RO(res_pr_generation); + +/* + * res_pr_holder_tg_port + */ +static ssize_t target_core_dev_pr_show_attr_res_pr_holder_tg_port( + struct se_subsystem_dev *su_dev, + char *page) +{ + struct se_device *dev; + struct se_node_acl *se_nacl; + struct se_lun *lun; + struct se_portal_group *se_tpg; + struct t10_pr_registration *pr_reg; + struct target_core_fabric_ops *tfo; + ssize_t len = 0; + + dev = su_dev->se_dev_ptr; + if (!(dev)) + return -ENODEV; + + if (T10_RES(su_dev)->res_type != SPC3_PERSISTENT_RESERVATIONS) + return len; + + spin_lock(&dev->dev_reservation_lock); + pr_reg = dev->dev_pr_res_holder; + if (!(pr_reg)) { + len = sprintf(page, "No SPC-3 Reservation holder\n"); + spin_unlock(&dev->dev_reservation_lock); + return len; + } + se_nacl = pr_reg->pr_reg_nacl; + se_tpg = se_nacl->se_tpg; + lun = pr_reg->pr_reg_tg_pt_lun; + tfo = TPG_TFO(se_tpg); + + len += sprintf(page+len, "SPC-3 Reservation: %s" + " Target Node Endpoint: %s\n", tfo->get_fabric_name(), + tfo->tpg_get_wwn(se_tpg)); + len += sprintf(page+len, "SPC-3 Reservation: Relative Port" + " Identifer Tag: %hu %s Portal Group Tag: %hu" + " %s Logical Unit: %u\n", lun->lun_sep->sep_rtpi, + tfo->get_fabric_name(), tfo->tpg_get_tag(se_tpg), + tfo->get_fabric_name(), lun->unpacked_lun); + spin_unlock(&dev->dev_reservation_lock); + + return len; +} + +SE_DEV_PR_ATTR_RO(res_pr_holder_tg_port); + +/* + * res_pr_registered_i_pts + */ +static ssize_t target_core_dev_pr_show_attr_res_pr_registered_i_pts( + struct se_subsystem_dev *su_dev, + char *page) +{ + struct target_core_fabric_ops *tfo; + struct t10_pr_registration *pr_reg; + unsigned char buf[384]; + char i_buf[PR_REG_ISID_ID_LEN]; + ssize_t len = 0; + int reg_count = 0, prf_isid; + + if (!(su_dev->se_dev_ptr)) + return -ENODEV; + + if (T10_RES(su_dev)->res_type != SPC3_PERSISTENT_RESERVATIONS) + return len; + + len += sprintf(page+len, "SPC-3 PR Registrations:\n"); + + spin_lock(&T10_RES(su_dev)->registration_lock); + list_for_each_entry(pr_reg, &T10_RES(su_dev)->registration_list, + pr_reg_list) { + + memset(buf, 0, 384); + memset(i_buf, 0, PR_REG_ISID_ID_LEN); + tfo = pr_reg->pr_reg_nacl->se_tpg->se_tpg_tfo; + prf_isid = core_pr_dump_initiator_port(pr_reg, &i_buf[0], + PR_REG_ISID_ID_LEN); + sprintf(buf, "%s Node: %s%s Key: 0x%016Lx PRgen: 0x%08x\n", + tfo->get_fabric_name(), + pr_reg->pr_reg_nacl->initiatorname, (prf_isid) ? + &i_buf[0] : "", pr_reg->pr_res_key, + pr_reg->pr_res_generation); + + if ((len + strlen(buf) > PAGE_SIZE)) + break; + + len += sprintf(page+len, "%s", buf); + reg_count++; + } + spin_unlock(&T10_RES(su_dev)->registration_lock); + + if (!(reg_count)) + len += sprintf(page+len, "None\n"); + + return len; +} + +SE_DEV_PR_ATTR_RO(res_pr_registered_i_pts); + +/* + * res_pr_type + */ +static ssize_t target_core_dev_pr_show_attr_res_pr_type( + struct se_subsystem_dev *su_dev, + char *page) +{ + struct se_device *dev; + struct t10_pr_registration *pr_reg; + ssize_t len = 0; + + dev = su_dev->se_dev_ptr; + if (!(dev)) + return -ENODEV; + + if (T10_RES(su_dev)->res_type != SPC3_PERSISTENT_RESERVATIONS) + return len; + + spin_lock(&dev->dev_reservation_lock); + pr_reg = dev->dev_pr_res_holder; + if (!(pr_reg)) { + len = sprintf(page, "No SPC-3 Reservation holder\n"); + spin_unlock(&dev->dev_reservation_lock); + return len; + } + len = sprintf(page, "SPC-3 Reservation Type: %s\n", + core_scsi3_pr_dump_type(pr_reg->pr_res_type)); + spin_unlock(&dev->dev_reservation_lock); + + return len; +} + +SE_DEV_PR_ATTR_RO(res_pr_type); + +/* + * res_type + */ +static ssize_t target_core_dev_pr_show_attr_res_type( + struct se_subsystem_dev *su_dev, + char *page) +{ + ssize_t len = 0; + + if (!(su_dev->se_dev_ptr)) + return -ENODEV; + + switch (T10_RES(su_dev)->res_type) { + case SPC3_PERSISTENT_RESERVATIONS: + len = sprintf(page, "SPC3_PERSISTENT_RESERVATIONS\n"); + break; + case SPC2_RESERVATIONS: + len = sprintf(page, "SPC2_RESERVATIONS\n"); + break; + case SPC_PASSTHROUGH: + len = sprintf(page, "SPC_PASSTHROUGH\n"); + break; + default: + len = sprintf(page, "UNKNOWN\n"); + break; + } + + return len; +} + +SE_DEV_PR_ATTR_RO(res_type); + +/* + * res_aptpl_active + */ + +static ssize_t target_core_dev_pr_show_attr_res_aptpl_active( + struct se_subsystem_dev *su_dev, + char *page) +{ + if (!(su_dev->se_dev_ptr)) + return -ENODEV; + + if (T10_RES(su_dev)->res_type != SPC3_PERSISTENT_RESERVATIONS) + return 0; + + return sprintf(page, "APTPL Bit Status: %s\n", + (T10_RES(su_dev)->pr_aptpl_active) ? "Activated" : "Disabled"); +} + +SE_DEV_PR_ATTR_RO(res_aptpl_active); + +/* + * res_aptpl_metadata + */ +static ssize_t target_core_dev_pr_show_attr_res_aptpl_metadata( + struct se_subsystem_dev *su_dev, + char *page) +{ + if (!(su_dev->se_dev_ptr)) + return -ENODEV; + + if (T10_RES(su_dev)->res_type != SPC3_PERSISTENT_RESERVATIONS) + return 0; + + return sprintf(page, "Ready to process PR APTPL metadata..\n"); +} + +enum { + Opt_initiator_fabric, Opt_initiator_node, Opt_initiator_sid, + Opt_sa_res_key, Opt_res_holder, Opt_res_type, Opt_res_scope, + Opt_res_all_tg_pt, Opt_mapped_lun, Opt_target_fabric, + Opt_target_node, Opt_tpgt, Opt_port_rtpi, Opt_target_lun, Opt_err +}; + +static match_table_t tokens = { + {Opt_initiator_fabric, "initiator_fabric=%s"}, + {Opt_initiator_node, "initiator_node=%s"}, + {Opt_initiator_sid, "initiator_sid=%s"}, + {Opt_sa_res_key, "sa_res_key=%s"}, + {Opt_res_holder, "res_holder=%d"}, + {Opt_res_type, "res_type=%d"}, + {Opt_res_scope, "res_scope=%d"}, + {Opt_res_all_tg_pt, "res_all_tg_pt=%d"}, + {Opt_mapped_lun, "mapped_lun=%d"}, + {Opt_target_fabric, "target_fabric=%s"}, + {Opt_target_node, "target_node=%s"}, + {Opt_tpgt, "tpgt=%d"}, + {Opt_port_rtpi, "port_rtpi=%d"}, + {Opt_target_lun, "target_lun=%d"}, + {Opt_err, NULL} +}; + +static ssize_t target_core_dev_pr_store_attr_res_aptpl_metadata( + struct se_subsystem_dev *su_dev, + const char *page, + size_t count) +{ + struct se_device *dev; + unsigned char *i_fabric, *t_fabric, *i_port = NULL, *t_port = NULL; + unsigned char *isid = NULL; + char *orig, *ptr, *arg_p, *opts; + substring_t args[MAX_OPT_ARGS]; + unsigned long long tmp_ll; + u64 sa_res_key = 0; + u32 mapped_lun = 0, target_lun = 0; + int ret = -1, res_holder = 0, all_tg_pt = 0, arg, token; + u16 port_rpti = 0, tpgt = 0; + u8 type = 0, scope; + + dev = su_dev->se_dev_ptr; + if (!(dev)) + return -ENODEV; + + if (T10_RES(su_dev)->res_type != SPC3_PERSISTENT_RESERVATIONS) + return 0; + + if (atomic_read(&dev->dev_export_obj.obj_access_count)) { + printk(KERN_INFO "Unable to process APTPL metadata while" + " active fabric exports exist\n"); + return -EINVAL; + } + + opts = kstrdup(page, GFP_KERNEL); + if (!opts) + return -ENOMEM; + + orig = opts; + while ((ptr = strsep(&opts, ",")) != NULL) { + if (!*ptr) + continue; + + token = match_token(ptr, tokens, args); + switch (token) { + case Opt_initiator_fabric: + i_fabric = match_strdup(&args[0]); + break; + case Opt_initiator_node: + i_port = match_strdup(&args[0]); + if (strlen(i_port) > PR_APTPL_MAX_IPORT_LEN) { + printk(KERN_ERR "APTPL metadata initiator_node=" + " exceeds PR_APTPL_MAX_IPORT_LEN: %d\n", + PR_APTPL_MAX_IPORT_LEN); + ret = -EINVAL; + break; + } + break; + case Opt_initiator_sid: + isid = match_strdup(&args[0]); + if (strlen(isid) > PR_REG_ISID_LEN) { + printk(KERN_ERR "APTPL metadata initiator_isid" + "= exceeds PR_REG_ISID_LEN: %d\n", + PR_REG_ISID_LEN); + ret = -EINVAL; + break; + } + break; + case Opt_sa_res_key: + arg_p = match_strdup(&args[0]); + ret = strict_strtoull(arg_p, 0, &tmp_ll); + if (ret < 0) { + printk(KERN_ERR "strict_strtoull() failed for" + " sa_res_key=\n"); + goto out; + } + sa_res_key = (u64)tmp_ll; + break; + /* + * PR APTPL Metadata for Reservation + */ + case Opt_res_holder: + match_int(args, &arg); + res_holder = arg; + break; + case Opt_res_type: + match_int(args, &arg); + type = (u8)arg; + break; + case Opt_res_scope: + match_int(args, &arg); + scope = (u8)arg; + break; + case Opt_res_all_tg_pt: + match_int(args, &arg); + all_tg_pt = (int)arg; + break; + case Opt_mapped_lun: + match_int(args, &arg); + mapped_lun = (u32)arg; + break; + /* + * PR APTPL Metadata for Target Port + */ + case Opt_target_fabric: + t_fabric = match_strdup(&args[0]); + break; + case Opt_target_node: + t_port = match_strdup(&args[0]); + if (strlen(t_port) > PR_APTPL_MAX_TPORT_LEN) { + printk(KERN_ERR "APTPL metadata target_node=" + " exceeds PR_APTPL_MAX_TPORT_LEN: %d\n", + PR_APTPL_MAX_TPORT_LEN); + ret = -EINVAL; + break; + } + break; + case Opt_tpgt: + match_int(args, &arg); + tpgt = (u16)arg; + break; + case Opt_port_rtpi: + match_int(args, &arg); + port_rpti = (u16)arg; + break; + case Opt_target_lun: + match_int(args, &arg); + target_lun = (u32)arg; + break; + default: + break; + } + } + + if (!(i_port) || !(t_port) || !(sa_res_key)) { + printk(KERN_ERR "Illegal parameters for APTPL registration\n"); + ret = -EINVAL; + goto out; + } + + if (res_holder && !(type)) { + printk(KERN_ERR "Illegal PR type: 0x%02x for reservation" + " holder\n", type); + ret = -EINVAL; + goto out; + } + + ret = core_scsi3_alloc_aptpl_registration(T10_RES(su_dev), sa_res_key, + i_port, isid, mapped_lun, t_port, tpgt, target_lun, + res_holder, all_tg_pt, type); +out: + kfree(orig); + return (ret == 0) ? count : ret; +} + +SE_DEV_PR_ATTR(res_aptpl_metadata, S_IRUGO | S_IWUSR); + +CONFIGFS_EATTR_OPS(target_core_dev_pr, se_subsystem_dev, se_dev_pr_group); + +static struct configfs_attribute *target_core_dev_pr_attrs[] = { + &target_core_dev_pr_res_holder.attr, + &target_core_dev_pr_res_pr_all_tgt_pts.attr, + &target_core_dev_pr_res_pr_generation.attr, + &target_core_dev_pr_res_pr_holder_tg_port.attr, + &target_core_dev_pr_res_pr_registered_i_pts.attr, + &target_core_dev_pr_res_pr_type.attr, + &target_core_dev_pr_res_type.attr, + &target_core_dev_pr_res_aptpl_active.attr, + &target_core_dev_pr_res_aptpl_metadata.attr, + NULL, +}; + +static struct configfs_item_operations target_core_dev_pr_ops = { + .show_attribute = target_core_dev_pr_attr_show, + .store_attribute = target_core_dev_pr_attr_store, +}; + +static struct config_item_type target_core_dev_pr_cit = { + .ct_item_ops = &target_core_dev_pr_ops, + .ct_attrs = target_core_dev_pr_attrs, + .ct_owner = THIS_MODULE, +}; + +/* End functions for struct config_item_type target_core_dev_pr_cit */ + +/* Start functions for struct config_item_type target_core_dev_cit */ + +static ssize_t target_core_show_dev_info(void *p, char *page) +{ + struct se_subsystem_dev *se_dev = (struct se_subsystem_dev *)p; + struct se_hba *hba = se_dev->se_dev_hba; + struct se_subsystem_api *t = hba->transport; + int bl = 0; + ssize_t read_bytes = 0; + + if (!(se_dev->se_dev_ptr)) + return -ENODEV; + + transport_dump_dev_state(se_dev->se_dev_ptr, page, &bl); + read_bytes += bl; + read_bytes += t->show_configfs_dev_params(hba, se_dev, page+read_bytes); + return read_bytes; +} + +static struct target_core_configfs_attribute target_core_attr_dev_info = { + .attr = { .ca_owner = THIS_MODULE, + .ca_name = "info", + .ca_mode = S_IRUGO }, + .show = target_core_show_dev_info, + .store = NULL, +}; + +static ssize_t target_core_store_dev_control( + void *p, + const char *page, + size_t count) +{ + struct se_subsystem_dev *se_dev = (struct se_subsystem_dev *)p; + struct se_hba *hba = se_dev->se_dev_hba; + struct se_subsystem_api *t = hba->transport; + + if (!(se_dev->se_dev_su_ptr)) { + printk(KERN_ERR "Unable to locate struct se_subsystem_dev>se" + "_dev_su_ptr\n"); + return -EINVAL; + } + + return t->set_configfs_dev_params(hba, se_dev, page, count); +} + +static struct target_core_configfs_attribute target_core_attr_dev_control = { + .attr = { .ca_owner = THIS_MODULE, + .ca_name = "control", + .ca_mode = S_IWUSR }, + .show = NULL, + .store = target_core_store_dev_control, +}; + +static ssize_t target_core_show_dev_alias(void *p, char *page) +{ + struct se_subsystem_dev *se_dev = (struct se_subsystem_dev *)p; + + if (!(se_dev->su_dev_flags & SDF_USING_ALIAS)) + return 0; + + return snprintf(page, PAGE_SIZE, "%s\n", se_dev->se_dev_alias); +} + +static ssize_t target_core_store_dev_alias( + void *p, + const char *page, + size_t count) +{ + struct se_subsystem_dev *se_dev = (struct se_subsystem_dev *)p; + struct se_hba *hba = se_dev->se_dev_hba; + ssize_t read_bytes; + + if (count > (SE_DEV_ALIAS_LEN-1)) { + printk(KERN_ERR "alias count: %d exceeds" + " SE_DEV_ALIAS_LEN-1: %u\n", (int)count, + SE_DEV_ALIAS_LEN-1); + return -EINVAL; + } + + se_dev->su_dev_flags |= SDF_USING_ALIAS; + read_bytes = snprintf(&se_dev->se_dev_alias[0], SE_DEV_ALIAS_LEN, + "%s", page); + + printk(KERN_INFO "Target_Core_ConfigFS: %s/%s set alias: %s\n", + config_item_name(&hba->hba_group.cg_item), + config_item_name(&se_dev->se_dev_group.cg_item), + se_dev->se_dev_alias); + + return read_bytes; +} + +static struct target_core_configfs_attribute target_core_attr_dev_alias = { + .attr = { .ca_owner = THIS_MODULE, + .ca_name = "alias", + .ca_mode = S_IRUGO | S_IWUSR }, + .show = target_core_show_dev_alias, + .store = target_core_store_dev_alias, +}; + +static ssize_t target_core_show_dev_udev_path(void *p, char *page) +{ + struct se_subsystem_dev *se_dev = (struct se_subsystem_dev *)p; + + if (!(se_dev->su_dev_flags & SDF_USING_UDEV_PATH)) + return 0; + + return snprintf(page, PAGE_SIZE, "%s\n", se_dev->se_dev_udev_path); +} + +static ssize_t target_core_store_dev_udev_path( + void *p, + const char *page, + size_t count) +{ + struct se_subsystem_dev *se_dev = (struct se_subsystem_dev *)p; + struct se_hba *hba = se_dev->se_dev_hba; + ssize_t read_bytes; + + if (count > (SE_UDEV_PATH_LEN-1)) { + printk(KERN_ERR "udev_path count: %d exceeds" + " SE_UDEV_PATH_LEN-1: %u\n", (int)count, + SE_UDEV_PATH_LEN-1); + return -EINVAL; + } + + se_dev->su_dev_flags |= SDF_USING_UDEV_PATH; + read_bytes = snprintf(&se_dev->se_dev_udev_path[0], SE_UDEV_PATH_LEN, + "%s", page); + + printk(KERN_INFO "Target_Core_ConfigFS: %s/%s set udev_path: %s\n", + config_item_name(&hba->hba_group.cg_item), + config_item_name(&se_dev->se_dev_group.cg_item), + se_dev->se_dev_udev_path); + + return read_bytes; +} + +static struct target_core_configfs_attribute target_core_attr_dev_udev_path = { + .attr = { .ca_owner = THIS_MODULE, + .ca_name = "udev_path", + .ca_mode = S_IRUGO | S_IWUSR }, + .show = target_core_show_dev_udev_path, + .store = target_core_store_dev_udev_path, +}; + +static ssize_t target_core_store_dev_enable( + void *p, + const char *page, + size_t count) +{ + struct se_subsystem_dev *se_dev = (struct se_subsystem_dev *)p; + struct se_device *dev; + struct se_hba *hba = se_dev->se_dev_hba; + struct se_subsystem_api *t = hba->transport; + char *ptr; + + ptr = strstr(page, "1"); + if (!(ptr)) { + printk(KERN_ERR "For dev_enable ops, only valid value" + " is \"1\"\n"); + return -EINVAL; + } + if ((se_dev->se_dev_ptr)) { + printk(KERN_ERR "se_dev->se_dev_ptr already set for storage" + " object\n"); + return -EEXIST; + } + + if (t->check_configfs_dev_params(hba, se_dev) < 0) + return -EINVAL; + + dev = t->create_virtdevice(hba, se_dev, se_dev->se_dev_su_ptr); + if (!(dev) || IS_ERR(dev)) + return -EINVAL; + + se_dev->se_dev_ptr = dev; + printk(KERN_INFO "Target_Core_ConfigFS: Registered se_dev->se_dev_ptr:" + " %p\n", se_dev->se_dev_ptr); + + return count; +} + +static struct target_core_configfs_attribute target_core_attr_dev_enable = { + .attr = { .ca_owner = THIS_MODULE, + .ca_name = "enable", + .ca_mode = S_IWUSR }, + .show = NULL, + .store = target_core_store_dev_enable, +}; + +static ssize_t target_core_show_alua_lu_gp(void *p, char *page) +{ + struct se_device *dev; + struct se_subsystem_dev *su_dev = (struct se_subsystem_dev *)p; + struct config_item *lu_ci; + struct t10_alua_lu_gp *lu_gp; + struct t10_alua_lu_gp_member *lu_gp_mem; + ssize_t len = 0; + + dev = su_dev->se_dev_ptr; + if (!(dev)) + return -ENODEV; + + if (T10_ALUA(su_dev)->alua_type != SPC3_ALUA_EMULATED) + return len; + + lu_gp_mem = dev->dev_alua_lu_gp_mem; + if (!(lu_gp_mem)) { + printk(KERN_ERR "NULL struct se_device->dev_alua_lu_gp_mem" + " pointer\n"); + return -EINVAL; + } + + spin_lock(&lu_gp_mem->lu_gp_mem_lock); + lu_gp = lu_gp_mem->lu_gp; + if ((lu_gp)) { + lu_ci = &lu_gp->lu_gp_group.cg_item; + len += sprintf(page, "LU Group Alias: %s\nLU Group ID: %hu\n", + config_item_name(lu_ci), lu_gp->lu_gp_id); + } + spin_unlock(&lu_gp_mem->lu_gp_mem_lock); + + return len; +} + +static ssize_t target_core_store_alua_lu_gp( + void *p, + const char *page, + size_t count) +{ + struct se_device *dev; + struct se_subsystem_dev *su_dev = (struct se_subsystem_dev *)p; + struct se_hba *hba = su_dev->se_dev_hba; + struct t10_alua_lu_gp *lu_gp = NULL, *lu_gp_new = NULL; + struct t10_alua_lu_gp_member *lu_gp_mem; + unsigned char buf[LU_GROUP_NAME_BUF]; + int move = 0; + + dev = su_dev->se_dev_ptr; + if (!(dev)) + return -ENODEV; + + if (T10_ALUA(su_dev)->alua_type != SPC3_ALUA_EMULATED) { + printk(KERN_WARNING "SPC3_ALUA_EMULATED not enabled for %s/%s\n", + config_item_name(&hba->hba_group.cg_item), + config_item_name(&su_dev->se_dev_group.cg_item)); + return -EINVAL; + } + if (count > LU_GROUP_NAME_BUF) { + printk(KERN_ERR "ALUA LU Group Alias too large!\n"); + return -EINVAL; + } + memset(buf, 0, LU_GROUP_NAME_BUF); + memcpy(buf, page, count); + /* + * Any ALUA logical unit alias besides "NULL" means we will be + * making a new group association. + */ + if (strcmp(strstrip(buf), "NULL")) { + /* + * core_alua_get_lu_gp_by_name() will increment reference to + * struct t10_alua_lu_gp. This reference is released with + * core_alua_get_lu_gp_by_name below(). + */ + lu_gp_new = core_alua_get_lu_gp_by_name(strstrip(buf)); + if (!(lu_gp_new)) + return -ENODEV; + } + lu_gp_mem = dev->dev_alua_lu_gp_mem; + if (!(lu_gp_mem)) { + if (lu_gp_new) + core_alua_put_lu_gp_from_name(lu_gp_new); + printk(KERN_ERR "NULL struct se_device->dev_alua_lu_gp_mem" + " pointer\n"); + return -EINVAL; + } + + spin_lock(&lu_gp_mem->lu_gp_mem_lock); + lu_gp = lu_gp_mem->lu_gp; + if ((lu_gp)) { + /* + * Clearing an existing lu_gp association, and replacing + * with NULL + */ + if (!(lu_gp_new)) { + printk(KERN_INFO "Target_Core_ConfigFS: Releasing %s/%s" + " from ALUA LU Group: core/alua/lu_gps/%s, ID:" + " %hu\n", + config_item_name(&hba->hba_group.cg_item), + config_item_name(&su_dev->se_dev_group.cg_item), + config_item_name(&lu_gp->lu_gp_group.cg_item), + lu_gp->lu_gp_id); + + __core_alua_drop_lu_gp_mem(lu_gp_mem, lu_gp); + spin_unlock(&lu_gp_mem->lu_gp_mem_lock); + + return count; + } + /* + * Removing existing association of lu_gp_mem with lu_gp + */ + __core_alua_drop_lu_gp_mem(lu_gp_mem, lu_gp); + move = 1; + } + /* + * Associate lu_gp_mem with lu_gp_new. + */ + __core_alua_attach_lu_gp_mem(lu_gp_mem, lu_gp_new); + spin_unlock(&lu_gp_mem->lu_gp_mem_lock); + + printk(KERN_INFO "Target_Core_ConfigFS: %s %s/%s to ALUA LU Group:" + " core/alua/lu_gps/%s, ID: %hu\n", + (move) ? "Moving" : "Adding", + config_item_name(&hba->hba_group.cg_item), + config_item_name(&su_dev->se_dev_group.cg_item), + config_item_name(&lu_gp_new->lu_gp_group.cg_item), + lu_gp_new->lu_gp_id); + + core_alua_put_lu_gp_from_name(lu_gp_new); + return count; +} + +static struct target_core_configfs_attribute target_core_attr_dev_alua_lu_gp = { + .attr = { .ca_owner = THIS_MODULE, + .ca_name = "alua_lu_gp", + .ca_mode = S_IRUGO | S_IWUSR }, + .show = target_core_show_alua_lu_gp, + .store = target_core_store_alua_lu_gp, +}; + +static struct configfs_attribute *lio_core_dev_attrs[] = { + &target_core_attr_dev_info.attr, + &target_core_attr_dev_control.attr, + &target_core_attr_dev_alias.attr, + &target_core_attr_dev_udev_path.attr, + &target_core_attr_dev_enable.attr, + &target_core_attr_dev_alua_lu_gp.attr, + NULL, +}; + +static void target_core_dev_release(struct config_item *item) +{ + struct se_subsystem_dev *se_dev = container_of(to_config_group(item), + struct se_subsystem_dev, se_dev_group); + struct config_group *dev_cg; + + if (!(se_dev)) + return; + + dev_cg = &se_dev->se_dev_group; + kfree(dev_cg->default_groups); +} + +static ssize_t target_core_dev_show(struct config_item *item, + struct configfs_attribute *attr, + char *page) +{ + struct se_subsystem_dev *se_dev = container_of( + to_config_group(item), struct se_subsystem_dev, + se_dev_group); + struct target_core_configfs_attribute *tc_attr = container_of( + attr, struct target_core_configfs_attribute, attr); + + if (!(tc_attr->show)) + return -EINVAL; + + return tc_attr->show((void *)se_dev, page); +} + +static ssize_t target_core_dev_store(struct config_item *item, + struct configfs_attribute *attr, + const char *page, size_t count) +{ + struct se_subsystem_dev *se_dev = container_of( + to_config_group(item), struct se_subsystem_dev, + se_dev_group); + struct target_core_configfs_attribute *tc_attr = container_of( + attr, struct target_core_configfs_attribute, attr); + + if (!(tc_attr->store)) + return -EINVAL; + + return tc_attr->store((void *)se_dev, page, count); +} + +static struct configfs_item_operations target_core_dev_item_ops = { + .release = target_core_dev_release, + .show_attribute = target_core_dev_show, + .store_attribute = target_core_dev_store, +}; + +static struct config_item_type target_core_dev_cit = { + .ct_item_ops = &target_core_dev_item_ops, + .ct_attrs = lio_core_dev_attrs, + .ct_owner = THIS_MODULE, +}; + +/* End functions for struct config_item_type target_core_dev_cit */ + +/* Start functions for struct config_item_type target_core_alua_lu_gp_cit */ + +CONFIGFS_EATTR_STRUCT(target_core_alua_lu_gp, t10_alua_lu_gp); +#define SE_DEV_ALUA_LU_ATTR(_name, _mode) \ +static struct target_core_alua_lu_gp_attribute \ + target_core_alua_lu_gp_##_name = \ + __CONFIGFS_EATTR(_name, _mode, \ + target_core_alua_lu_gp_show_attr_##_name, \ + target_core_alua_lu_gp_store_attr_##_name); + +#define SE_DEV_ALUA_LU_ATTR_RO(_name) \ +static struct target_core_alua_lu_gp_attribute \ + target_core_alua_lu_gp_##_name = \ + __CONFIGFS_EATTR_RO(_name, \ + target_core_alua_lu_gp_show_attr_##_name); + +/* + * lu_gp_id + */ +static ssize_t target_core_alua_lu_gp_show_attr_lu_gp_id( + struct t10_alua_lu_gp *lu_gp, + char *page) +{ + if (!(lu_gp->lu_gp_valid_id)) + return 0; + + return sprintf(page, "%hu\n", lu_gp->lu_gp_id); +} + +static ssize_t target_core_alua_lu_gp_store_attr_lu_gp_id( + struct t10_alua_lu_gp *lu_gp, + const char *page, + size_t count) +{ + struct config_group *alua_lu_gp_cg = &lu_gp->lu_gp_group; + unsigned long lu_gp_id; + int ret; + + ret = strict_strtoul(page, 0, &lu_gp_id); + if (ret < 0) { + printk(KERN_ERR "strict_strtoul() returned %d for" + " lu_gp_id\n", ret); + return -EINVAL; + } + if (lu_gp_id > 0x0000ffff) { + printk(KERN_ERR "ALUA lu_gp_id: %lu exceeds maximum:" + " 0x0000ffff\n", lu_gp_id); + return -EINVAL; + } + + ret = core_alua_set_lu_gp_id(lu_gp, (u16)lu_gp_id); + if (ret < 0) + return -EINVAL; + + printk(KERN_INFO "Target_Core_ConfigFS: Set ALUA Logical Unit" + " Group: core/alua/lu_gps/%s to ID: %hu\n", + config_item_name(&alua_lu_gp_cg->cg_item), + lu_gp->lu_gp_id); + + return count; +} + +SE_DEV_ALUA_LU_ATTR(lu_gp_id, S_IRUGO | S_IWUSR); + +/* + * members + */ +static ssize_t target_core_alua_lu_gp_show_attr_members( + struct t10_alua_lu_gp *lu_gp, + char *page) +{ + struct se_device *dev; + struct se_hba *hba; + struct se_subsystem_dev *su_dev; + struct t10_alua_lu_gp_member *lu_gp_mem; + ssize_t len = 0, cur_len; + unsigned char buf[LU_GROUP_NAME_BUF]; + + memset(buf, 0, LU_GROUP_NAME_BUF); + + spin_lock(&lu_gp->lu_gp_lock); + list_for_each_entry(lu_gp_mem, &lu_gp->lu_gp_mem_list, lu_gp_mem_list) { + dev = lu_gp_mem->lu_gp_mem_dev; + su_dev = dev->se_sub_dev; + hba = su_dev->se_dev_hba; + + cur_len = snprintf(buf, LU_GROUP_NAME_BUF, "%s/%s\n", + config_item_name(&hba->hba_group.cg_item), + config_item_name(&su_dev->se_dev_group.cg_item)); + cur_len++; /* Extra byte for NULL terminator */ + + if ((cur_len + len) > PAGE_SIZE) { + printk(KERN_WARNING "Ran out of lu_gp_show_attr" + "_members buffer\n"); + break; + } + memcpy(page+len, buf, cur_len); + len += cur_len; + } + spin_unlock(&lu_gp->lu_gp_lock); + + return len; +} + +SE_DEV_ALUA_LU_ATTR_RO(members); + +CONFIGFS_EATTR_OPS(target_core_alua_lu_gp, t10_alua_lu_gp, lu_gp_group); + +static struct configfs_attribute *target_core_alua_lu_gp_attrs[] = { + &target_core_alua_lu_gp_lu_gp_id.attr, + &target_core_alua_lu_gp_members.attr, + NULL, +}; + +static struct configfs_item_operations target_core_alua_lu_gp_ops = { + .show_attribute = target_core_alua_lu_gp_attr_show, + .store_attribute = target_core_alua_lu_gp_attr_store, +}; + +static struct config_item_type target_core_alua_lu_gp_cit = { + .ct_item_ops = &target_core_alua_lu_gp_ops, + .ct_attrs = target_core_alua_lu_gp_attrs, + .ct_owner = THIS_MODULE, +}; + +/* End functions for struct config_item_type target_core_alua_lu_gp_cit */ + +/* Start functions for struct config_item_type target_core_alua_lu_gps_cit */ + +static struct config_group *target_core_alua_create_lu_gp( + struct config_group *group, + const char *name) +{ + struct t10_alua_lu_gp *lu_gp; + struct config_group *alua_lu_gp_cg = NULL; + struct config_item *alua_lu_gp_ci = NULL; + + lu_gp = core_alua_allocate_lu_gp(name, 0); + if (IS_ERR(lu_gp)) + return NULL; + + alua_lu_gp_cg = &lu_gp->lu_gp_group; + alua_lu_gp_ci = &alua_lu_gp_cg->cg_item; + + config_group_init_type_name(alua_lu_gp_cg, name, + &target_core_alua_lu_gp_cit); + + printk(KERN_INFO "Target_Core_ConfigFS: Allocated ALUA Logical Unit" + " Group: core/alua/lu_gps/%s\n", + config_item_name(alua_lu_gp_ci)); + + return alua_lu_gp_cg; + +} + +static void target_core_alua_drop_lu_gp( + struct config_group *group, + struct config_item *item) +{ + struct t10_alua_lu_gp *lu_gp = container_of(to_config_group(item), + struct t10_alua_lu_gp, lu_gp_group); + + printk(KERN_INFO "Target_Core_ConfigFS: Releasing ALUA Logical Unit" + " Group: core/alua/lu_gps/%s, ID: %hu\n", + config_item_name(item), lu_gp->lu_gp_id); + + config_item_put(item); + core_alua_free_lu_gp(lu_gp); +} + +static struct configfs_group_operations target_core_alua_lu_gps_group_ops = { + .make_group = &target_core_alua_create_lu_gp, + .drop_item = &target_core_alua_drop_lu_gp, +}; + +static struct config_item_type target_core_alua_lu_gps_cit = { + .ct_item_ops = NULL, + .ct_group_ops = &target_core_alua_lu_gps_group_ops, + .ct_owner = THIS_MODULE, +}; + +/* End functions for struct config_item_type target_core_alua_lu_gps_cit */ + +/* Start functions for struct config_item_type target_core_alua_tg_pt_gp_cit */ + +CONFIGFS_EATTR_STRUCT(target_core_alua_tg_pt_gp, t10_alua_tg_pt_gp); +#define SE_DEV_ALUA_TG_PT_ATTR(_name, _mode) \ +static struct target_core_alua_tg_pt_gp_attribute \ + target_core_alua_tg_pt_gp_##_name = \ + __CONFIGFS_EATTR(_name, _mode, \ + target_core_alua_tg_pt_gp_show_attr_##_name, \ + target_core_alua_tg_pt_gp_store_attr_##_name); + +#define SE_DEV_ALUA_TG_PT_ATTR_RO(_name) \ +static struct target_core_alua_tg_pt_gp_attribute \ + target_core_alua_tg_pt_gp_##_name = \ + __CONFIGFS_EATTR_RO(_name, \ + target_core_alua_tg_pt_gp_show_attr_##_name); + +/* + * alua_access_state + */ +static ssize_t target_core_alua_tg_pt_gp_show_attr_alua_access_state( + struct t10_alua_tg_pt_gp *tg_pt_gp, + char *page) +{ + return sprintf(page, "%d\n", + atomic_read(&tg_pt_gp->tg_pt_gp_alua_access_state)); +} + +static ssize_t target_core_alua_tg_pt_gp_store_attr_alua_access_state( + struct t10_alua_tg_pt_gp *tg_pt_gp, + const char *page, + size_t count) +{ + struct se_subsystem_dev *su_dev = tg_pt_gp->tg_pt_gp_su_dev; + unsigned long tmp; + int new_state, ret; + + if (!(tg_pt_gp->tg_pt_gp_valid_id)) { + printk(KERN_ERR "Unable to do implict ALUA on non valid" + " tg_pt_gp ID: %hu\n", tg_pt_gp->tg_pt_gp_valid_id); + return -EINVAL; + } + + ret = strict_strtoul(page, 0, &tmp); + if (ret < 0) { + printk("Unable to extract new ALUA access state from" + " %s\n", page); + return -EINVAL; + } + new_state = (int)tmp; + + if (!(tg_pt_gp->tg_pt_gp_alua_access_type & TPGS_IMPLICT_ALUA)) { + printk(KERN_ERR "Unable to process implict configfs ALUA" + " transition while TPGS_IMPLICT_ALUA is diabled\n"); + return -EINVAL; + } + + ret = core_alua_do_port_transition(tg_pt_gp, su_dev->se_dev_ptr, + NULL, NULL, new_state, 0); + return (!ret) ? count : -EINVAL; +} + +SE_DEV_ALUA_TG_PT_ATTR(alua_access_state, S_IRUGO | S_IWUSR); + +/* + * alua_access_status + */ +static ssize_t target_core_alua_tg_pt_gp_show_attr_alua_access_status( + struct t10_alua_tg_pt_gp *tg_pt_gp, + char *page) +{ + return sprintf(page, "%s\n", + core_alua_dump_status(tg_pt_gp->tg_pt_gp_alua_access_status)); +} + +static ssize_t target_core_alua_tg_pt_gp_store_attr_alua_access_status( + struct t10_alua_tg_pt_gp *tg_pt_gp, + const char *page, + size_t count) +{ + unsigned long tmp; + int new_status, ret; + + if (!(tg_pt_gp->tg_pt_gp_valid_id)) { + printk(KERN_ERR "Unable to do set ALUA access status on non" + " valid tg_pt_gp ID: %hu\n", + tg_pt_gp->tg_pt_gp_valid_id); + return -EINVAL; + } + + ret = strict_strtoul(page, 0, &tmp); + if (ret < 0) { + printk(KERN_ERR "Unable to extract new ALUA access status" + " from %s\n", page); + return -EINVAL; + } + new_status = (int)tmp; + + if ((new_status != ALUA_STATUS_NONE) && + (new_status != ALUA_STATUS_ALTERED_BY_EXPLICT_STPG) && + (new_status != ALUA_STATUS_ALTERED_BY_IMPLICT_ALUA)) { + printk(KERN_ERR "Illegal ALUA access status: 0x%02x\n", + new_status); + return -EINVAL; + } + + tg_pt_gp->tg_pt_gp_alua_access_status = new_status; + return count; +} + +SE_DEV_ALUA_TG_PT_ATTR(alua_access_status, S_IRUGO | S_IWUSR); + +/* + * alua_access_type + */ +static ssize_t target_core_alua_tg_pt_gp_show_attr_alua_access_type( + struct t10_alua_tg_pt_gp *tg_pt_gp, + char *page) +{ + return core_alua_show_access_type(tg_pt_gp, page); +} + +static ssize_t target_core_alua_tg_pt_gp_store_attr_alua_access_type( + struct t10_alua_tg_pt_gp *tg_pt_gp, + const char *page, + size_t count) +{ + return core_alua_store_access_type(tg_pt_gp, page, count); +} + +SE_DEV_ALUA_TG_PT_ATTR(alua_access_type, S_IRUGO | S_IWUSR); + +/* + * alua_write_metadata + */ +static ssize_t target_core_alua_tg_pt_gp_show_attr_alua_write_metadata( + struct t10_alua_tg_pt_gp *tg_pt_gp, + char *page) +{ + return sprintf(page, "%d\n", tg_pt_gp->tg_pt_gp_write_metadata); +} + +static ssize_t target_core_alua_tg_pt_gp_store_attr_alua_write_metadata( + struct t10_alua_tg_pt_gp *tg_pt_gp, + const char *page, + size_t count) +{ + unsigned long tmp; + int ret; + + ret = strict_strtoul(page, 0, &tmp); + if (ret < 0) { + printk(KERN_ERR "Unable to extract alua_write_metadata\n"); + return -EINVAL; + } + + if ((tmp != 0) && (tmp != 1)) { + printk(KERN_ERR "Illegal value for alua_write_metadata:" + " %lu\n", tmp); + return -EINVAL; + } + tg_pt_gp->tg_pt_gp_write_metadata = (int)tmp; + + return count; +} + +SE_DEV_ALUA_TG_PT_ATTR(alua_write_metadata, S_IRUGO | S_IWUSR); + + + +/* + * nonop_delay_msecs + */ +static ssize_t target_core_alua_tg_pt_gp_show_attr_nonop_delay_msecs( + struct t10_alua_tg_pt_gp *tg_pt_gp, + char *page) +{ + return core_alua_show_nonop_delay_msecs(tg_pt_gp, page); + +} + +static ssize_t target_core_alua_tg_pt_gp_store_attr_nonop_delay_msecs( + struct t10_alua_tg_pt_gp *tg_pt_gp, + const char *page, + size_t count) +{ + return core_alua_store_nonop_delay_msecs(tg_pt_gp, page, count); +} + +SE_DEV_ALUA_TG_PT_ATTR(nonop_delay_msecs, S_IRUGO | S_IWUSR); + +/* + * trans_delay_msecs + */ +static ssize_t target_core_alua_tg_pt_gp_show_attr_trans_delay_msecs( + struct t10_alua_tg_pt_gp *tg_pt_gp, + char *page) +{ + return core_alua_show_trans_delay_msecs(tg_pt_gp, page); +} + +static ssize_t target_core_alua_tg_pt_gp_store_attr_trans_delay_msecs( + struct t10_alua_tg_pt_gp *tg_pt_gp, + const char *page, + size_t count) +{ + return core_alua_store_trans_delay_msecs(tg_pt_gp, page, count); +} + +SE_DEV_ALUA_TG_PT_ATTR(trans_delay_msecs, S_IRUGO | S_IWUSR); + +/* + * preferred + */ + +static ssize_t target_core_alua_tg_pt_gp_show_attr_preferred( + struct t10_alua_tg_pt_gp *tg_pt_gp, + char *page) +{ + return core_alua_show_preferred_bit(tg_pt_gp, page); +} + +static ssize_t target_core_alua_tg_pt_gp_store_attr_preferred( + struct t10_alua_tg_pt_gp *tg_pt_gp, + const char *page, + size_t count) +{ + return core_alua_store_preferred_bit(tg_pt_gp, page, count); +} + +SE_DEV_ALUA_TG_PT_ATTR(preferred, S_IRUGO | S_IWUSR); + +/* + * tg_pt_gp_id + */ +static ssize_t target_core_alua_tg_pt_gp_show_attr_tg_pt_gp_id( + struct t10_alua_tg_pt_gp *tg_pt_gp, + char *page) +{ + if (!(tg_pt_gp->tg_pt_gp_valid_id)) + return 0; + + return sprintf(page, "%hu\n", tg_pt_gp->tg_pt_gp_id); +} + +static ssize_t target_core_alua_tg_pt_gp_store_attr_tg_pt_gp_id( + struct t10_alua_tg_pt_gp *tg_pt_gp, + const char *page, + size_t count) +{ + struct config_group *alua_tg_pt_gp_cg = &tg_pt_gp->tg_pt_gp_group; + unsigned long tg_pt_gp_id; + int ret; + + ret = strict_strtoul(page, 0, &tg_pt_gp_id); + if (ret < 0) { + printk(KERN_ERR "strict_strtoul() returned %d for" + " tg_pt_gp_id\n", ret); + return -EINVAL; + } + if (tg_pt_gp_id > 0x0000ffff) { + printk(KERN_ERR "ALUA tg_pt_gp_id: %lu exceeds maximum:" + " 0x0000ffff\n", tg_pt_gp_id); + return -EINVAL; + } + + ret = core_alua_set_tg_pt_gp_id(tg_pt_gp, (u16)tg_pt_gp_id); + if (ret < 0) + return -EINVAL; + + printk(KERN_INFO "Target_Core_ConfigFS: Set ALUA Target Port Group: " + "core/alua/tg_pt_gps/%s to ID: %hu\n", + config_item_name(&alua_tg_pt_gp_cg->cg_item), + tg_pt_gp->tg_pt_gp_id); + + return count; +} + +SE_DEV_ALUA_TG_PT_ATTR(tg_pt_gp_id, S_IRUGO | S_IWUSR); + +/* + * members + */ +static ssize_t target_core_alua_tg_pt_gp_show_attr_members( + struct t10_alua_tg_pt_gp *tg_pt_gp, + char *page) +{ + struct se_port *port; + struct se_portal_group *tpg; + struct se_lun *lun; + struct t10_alua_tg_pt_gp_member *tg_pt_gp_mem; + ssize_t len = 0, cur_len; + unsigned char buf[TG_PT_GROUP_NAME_BUF]; + + memset(buf, 0, TG_PT_GROUP_NAME_BUF); + + spin_lock(&tg_pt_gp->tg_pt_gp_lock); + list_for_each_entry(tg_pt_gp_mem, &tg_pt_gp->tg_pt_gp_mem_list, + tg_pt_gp_mem_list) { + port = tg_pt_gp_mem->tg_pt; + tpg = port->sep_tpg; + lun = port->sep_lun; + + cur_len = snprintf(buf, TG_PT_GROUP_NAME_BUF, "%s/%s/tpgt_%hu" + "/%s\n", TPG_TFO(tpg)->get_fabric_name(), + TPG_TFO(tpg)->tpg_get_wwn(tpg), + TPG_TFO(tpg)->tpg_get_tag(tpg), + config_item_name(&lun->lun_group.cg_item)); + cur_len++; /* Extra byte for NULL terminator */ + + if ((cur_len + len) > PAGE_SIZE) { + printk(KERN_WARNING "Ran out of lu_gp_show_attr" + "_members buffer\n"); + break; + } + memcpy(page+len, buf, cur_len); + len += cur_len; + } + spin_unlock(&tg_pt_gp->tg_pt_gp_lock); + + return len; +} + +SE_DEV_ALUA_TG_PT_ATTR_RO(members); + +CONFIGFS_EATTR_OPS(target_core_alua_tg_pt_gp, t10_alua_tg_pt_gp, + tg_pt_gp_group); + +static struct configfs_attribute *target_core_alua_tg_pt_gp_attrs[] = { + &target_core_alua_tg_pt_gp_alua_access_state.attr, + &target_core_alua_tg_pt_gp_alua_access_status.attr, + &target_core_alua_tg_pt_gp_alua_access_type.attr, + &target_core_alua_tg_pt_gp_alua_write_metadata.attr, + &target_core_alua_tg_pt_gp_nonop_delay_msecs.attr, + &target_core_alua_tg_pt_gp_trans_delay_msecs.attr, + &target_core_alua_tg_pt_gp_preferred.attr, + &target_core_alua_tg_pt_gp_tg_pt_gp_id.attr, + &target_core_alua_tg_pt_gp_members.attr, + NULL, +}; + +static struct configfs_item_operations target_core_alua_tg_pt_gp_ops = { + .show_attribute = target_core_alua_tg_pt_gp_attr_show, + .store_attribute = target_core_alua_tg_pt_gp_attr_store, +}; + +static struct config_item_type target_core_alua_tg_pt_gp_cit = { + .ct_item_ops = &target_core_alua_tg_pt_gp_ops, + .ct_attrs = target_core_alua_tg_pt_gp_attrs, + .ct_owner = THIS_MODULE, +}; + +/* End functions for struct config_item_type target_core_alua_tg_pt_gp_cit */ + +/* Start functions for struct config_item_type target_core_alua_tg_pt_gps_cit */ + +static struct config_group *target_core_alua_create_tg_pt_gp( + struct config_group *group, + const char *name) +{ + struct t10_alua *alua = container_of(group, struct t10_alua, + alua_tg_pt_gps_group); + struct t10_alua_tg_pt_gp *tg_pt_gp; + struct se_subsystem_dev *su_dev = alua->t10_sub_dev; + struct config_group *alua_tg_pt_gp_cg = NULL; + struct config_item *alua_tg_pt_gp_ci = NULL; + + tg_pt_gp = core_alua_allocate_tg_pt_gp(su_dev, name, 0); + if (!(tg_pt_gp)) + return NULL; + + alua_tg_pt_gp_cg = &tg_pt_gp->tg_pt_gp_group; + alua_tg_pt_gp_ci = &alua_tg_pt_gp_cg->cg_item; + + config_group_init_type_name(alua_tg_pt_gp_cg, name, + &target_core_alua_tg_pt_gp_cit); + + printk(KERN_INFO "Target_Core_ConfigFS: Allocated ALUA Target Port" + " Group: alua/tg_pt_gps/%s\n", + config_item_name(alua_tg_pt_gp_ci)); + + return alua_tg_pt_gp_cg; +} + +static void target_core_alua_drop_tg_pt_gp( + struct config_group *group, + struct config_item *item) +{ + struct t10_alua_tg_pt_gp *tg_pt_gp = container_of(to_config_group(item), + struct t10_alua_tg_pt_gp, tg_pt_gp_group); + + printk(KERN_INFO "Target_Core_ConfigFS: Releasing ALUA Target Port" + " Group: alua/tg_pt_gps/%s, ID: %hu\n", + config_item_name(item), tg_pt_gp->tg_pt_gp_id); + + config_item_put(item); + core_alua_free_tg_pt_gp(tg_pt_gp); +} + +static struct configfs_group_operations target_core_alua_tg_pt_gps_group_ops = { + .make_group = &target_core_alua_create_tg_pt_gp, + .drop_item = &target_core_alua_drop_tg_pt_gp, +}; + +static struct config_item_type target_core_alua_tg_pt_gps_cit = { + .ct_group_ops = &target_core_alua_tg_pt_gps_group_ops, + .ct_owner = THIS_MODULE, +}; + +/* End functions for struct config_item_type target_core_alua_tg_pt_gps_cit */ + +/* Start functions for struct config_item_type target_core_alua_cit */ + +/* + * target_core_alua_cit is a ConfigFS group that lives under + * /sys/kernel/config/target/core/alua. There are default groups + * core/alua/lu_gps and core/alua/tg_pt_gps that are attached to + * target_core_alua_cit in target_core_init_configfs() below. + */ +static struct config_item_type target_core_alua_cit = { + .ct_item_ops = NULL, + .ct_attrs = NULL, + .ct_owner = THIS_MODULE, +}; + +/* End functions for struct config_item_type target_core_alua_cit */ + +/* Start functions for struct config_item_type target_core_hba_cit */ + +static struct config_group *target_core_make_subdev( + struct config_group *group, + const char *name) +{ + struct t10_alua_tg_pt_gp *tg_pt_gp; + struct se_subsystem_dev *se_dev; + struct se_subsystem_api *t; + struct config_item *hba_ci = &group->cg_item; + struct se_hba *hba = item_to_hba(hba_ci); + struct config_group *dev_cg = NULL, *tg_pt_gp_cg = NULL; + + if (mutex_lock_interruptible(&hba->hba_access_mutex)) + return NULL; + + /* + * Locate the struct se_subsystem_api from parent's struct se_hba. + */ + t = hba->transport; + + se_dev = kzalloc(sizeof(struct se_subsystem_dev), GFP_KERNEL); + if (!se_dev) { + printk(KERN_ERR "Unable to allocate memory for" + " struct se_subsystem_dev\n"); + goto unlock; + } + INIT_LIST_HEAD(&se_dev->g_se_dev_list); + INIT_LIST_HEAD(&se_dev->t10_wwn.t10_vpd_list); + spin_lock_init(&se_dev->t10_wwn.t10_vpd_lock); + INIT_LIST_HEAD(&se_dev->t10_reservation.registration_list); + INIT_LIST_HEAD(&se_dev->t10_reservation.aptpl_reg_list); + spin_lock_init(&se_dev->t10_reservation.registration_lock); + spin_lock_init(&se_dev->t10_reservation.aptpl_reg_lock); + INIT_LIST_HEAD(&se_dev->t10_alua.tg_pt_gps_list); + spin_lock_init(&se_dev->t10_alua.tg_pt_gps_lock); + spin_lock_init(&se_dev->se_dev_lock); + se_dev->t10_reservation.pr_aptpl_buf_len = PR_APTPL_BUF_LEN; + se_dev->t10_wwn.t10_sub_dev = se_dev; + se_dev->t10_alua.t10_sub_dev = se_dev; + se_dev->se_dev_attrib.da_sub_dev = se_dev; + + se_dev->se_dev_hba = hba; + dev_cg = &se_dev->se_dev_group; + + dev_cg->default_groups = kzalloc(sizeof(struct config_group) * 6, + GFP_KERNEL); + if (!(dev_cg->default_groups)) + goto out; + /* + * Set se_dev_su_ptr from struct se_subsystem_api returned void ptr + * for ->allocate_virtdevice() + * + * se_dev->se_dev_ptr will be set after ->create_virtdev() + * has been called successfully in the next level up in the + * configfs tree for device object's struct config_group. + */ + se_dev->se_dev_su_ptr = t->allocate_virtdevice(hba, name); + if (!(se_dev->se_dev_su_ptr)) { + printk(KERN_ERR "Unable to locate subsystem dependent pointer" + " from allocate_virtdevice()\n"); + goto out; + } + spin_lock(&se_global->g_device_lock); + list_add_tail(&se_dev->g_se_dev_list, &se_global->g_se_dev_list); + spin_unlock(&se_global->g_device_lock); + + config_group_init_type_name(&se_dev->se_dev_group, name, + &target_core_dev_cit); + config_group_init_type_name(&se_dev->se_dev_attrib.da_group, "attrib", + &target_core_dev_attrib_cit); + config_group_init_type_name(&se_dev->se_dev_pr_group, "pr", + &target_core_dev_pr_cit); + config_group_init_type_name(&se_dev->t10_wwn.t10_wwn_group, "wwn", + &target_core_dev_wwn_cit); + config_group_init_type_name(&se_dev->t10_alua.alua_tg_pt_gps_group, + "alua", &target_core_alua_tg_pt_gps_cit); + dev_cg->default_groups[0] = &se_dev->se_dev_attrib.da_group; + dev_cg->default_groups[1] = &se_dev->se_dev_pr_group; + dev_cg->default_groups[2] = &se_dev->t10_wwn.t10_wwn_group; + dev_cg->default_groups[3] = &se_dev->t10_alua.alua_tg_pt_gps_group; + dev_cg->default_groups[4] = NULL; + /* + * Add core/$HBA/$DEV/alua/tg_pt_gps/default_tg_pt_gp + */ + tg_pt_gp = core_alua_allocate_tg_pt_gp(se_dev, "default_tg_pt_gp", 1); + if (!(tg_pt_gp)) + goto out; + + tg_pt_gp_cg = &T10_ALUA(se_dev)->alua_tg_pt_gps_group; + tg_pt_gp_cg->default_groups = kzalloc(sizeof(struct config_group) * 2, + GFP_KERNEL); + if (!(tg_pt_gp_cg->default_groups)) { + printk(KERN_ERR "Unable to allocate tg_pt_gp_cg->" + "default_groups\n"); + goto out; + } + + config_group_init_type_name(&tg_pt_gp->tg_pt_gp_group, + "default_tg_pt_gp", &target_core_alua_tg_pt_gp_cit); + tg_pt_gp_cg->default_groups[0] = &tg_pt_gp->tg_pt_gp_group; + tg_pt_gp_cg->default_groups[1] = NULL; + T10_ALUA(se_dev)->default_tg_pt_gp = tg_pt_gp; + + printk(KERN_INFO "Target_Core_ConfigFS: Allocated struct se_subsystem_dev:" + " %p se_dev_su_ptr: %p\n", se_dev, se_dev->se_dev_su_ptr); + + mutex_unlock(&hba->hba_access_mutex); + return &se_dev->se_dev_group; +out: + if (T10_ALUA(se_dev)->default_tg_pt_gp) { + core_alua_free_tg_pt_gp(T10_ALUA(se_dev)->default_tg_pt_gp); + T10_ALUA(se_dev)->default_tg_pt_gp = NULL; + } + if (tg_pt_gp_cg) + kfree(tg_pt_gp_cg->default_groups); + if (dev_cg) + kfree(dev_cg->default_groups); + if (se_dev->se_dev_su_ptr) + t->free_device(se_dev->se_dev_su_ptr); + kfree(se_dev); +unlock: + mutex_unlock(&hba->hba_access_mutex); + return NULL; +} + +static void target_core_drop_subdev( + struct config_group *group, + struct config_item *item) +{ + struct se_subsystem_dev *se_dev = container_of(to_config_group(item), + struct se_subsystem_dev, se_dev_group); + struct se_hba *hba; + struct se_subsystem_api *t; + struct config_item *df_item; + struct config_group *dev_cg, *tg_pt_gp_cg; + int i, ret; + + hba = item_to_hba(&se_dev->se_dev_hba->hba_group.cg_item); + + if (mutex_lock_interruptible(&hba->hba_access_mutex)) + goto out; + + t = hba->transport; + + spin_lock(&se_global->g_device_lock); + list_del(&se_dev->g_se_dev_list); + spin_unlock(&se_global->g_device_lock); + + tg_pt_gp_cg = &T10_ALUA(se_dev)->alua_tg_pt_gps_group; + for (i = 0; tg_pt_gp_cg->default_groups[i]; i++) { + df_item = &tg_pt_gp_cg->default_groups[i]->cg_item; + tg_pt_gp_cg->default_groups[i] = NULL; + config_item_put(df_item); + } + kfree(tg_pt_gp_cg->default_groups); + core_alua_free_tg_pt_gp(T10_ALUA(se_dev)->default_tg_pt_gp); + T10_ALUA(se_dev)->default_tg_pt_gp = NULL; + + dev_cg = &se_dev->se_dev_group; + for (i = 0; dev_cg->default_groups[i]; i++) { + df_item = &dev_cg->default_groups[i]->cg_item; + dev_cg->default_groups[i] = NULL; + config_item_put(df_item); + } + + config_item_put(item); + /* + * This pointer will set when the storage is enabled with: + * `echo 1 > $CONFIGFS/core/$HBA/$DEV/dev_enable` + */ + if (se_dev->se_dev_ptr) { + printk(KERN_INFO "Target_Core_ConfigFS: Calling se_free_" + "virtual_device() for se_dev_ptr: %p\n", + se_dev->se_dev_ptr); + + ret = se_free_virtual_device(se_dev->se_dev_ptr, hba); + if (ret < 0) + goto hba_out; + } else { + /* + * Release struct se_subsystem_dev->se_dev_su_ptr.. + */ + printk(KERN_INFO "Target_Core_ConfigFS: Calling t->free_" + "device() for se_dev_su_ptr: %p\n", + se_dev->se_dev_su_ptr); + + t->free_device(se_dev->se_dev_su_ptr); + } + + printk(KERN_INFO "Target_Core_ConfigFS: Deallocating se_subsystem" + "_dev_t: %p\n", se_dev); + +hba_out: + mutex_unlock(&hba->hba_access_mutex); +out: + kfree(se_dev); +} + +static struct configfs_group_operations target_core_hba_group_ops = { + .make_group = target_core_make_subdev, + .drop_item = target_core_drop_subdev, +}; + +CONFIGFS_EATTR_STRUCT(target_core_hba, se_hba); +#define SE_HBA_ATTR(_name, _mode) \ +static struct target_core_hba_attribute \ + target_core_hba_##_name = \ + __CONFIGFS_EATTR(_name, _mode, \ + target_core_hba_show_attr_##_name, \ + target_core_hba_store_attr_##_name); + +#define SE_HBA_ATTR_RO(_name) \ +static struct target_core_hba_attribute \ + target_core_hba_##_name = \ + __CONFIGFS_EATTR_RO(_name, \ + target_core_hba_show_attr_##_name); + +static ssize_t target_core_hba_show_attr_hba_info( + struct se_hba *hba, + char *page) +{ + return sprintf(page, "HBA Index: %d plugin: %s version: %s\n", + hba->hba_id, hba->transport->name, + TARGET_CORE_CONFIGFS_VERSION); +} + +SE_HBA_ATTR_RO(hba_info); + +static ssize_t target_core_hba_show_attr_hba_mode(struct se_hba *hba, + char *page) +{ + int hba_mode = 0; + + if (hba->hba_flags & HBA_FLAGS_PSCSI_MODE) + hba_mode = 1; + + return sprintf(page, "%d\n", hba_mode); +} + +static ssize_t target_core_hba_store_attr_hba_mode(struct se_hba *hba, + const char *page, size_t count) +{ + struct se_subsystem_api *transport = hba->transport; + unsigned long mode_flag; + int ret; + + if (transport->pmode_enable_hba == NULL) + return -EINVAL; + + ret = strict_strtoul(page, 0, &mode_flag); + if (ret < 0) { + printk(KERN_ERR "Unable to extract hba mode flag: %d\n", ret); + return -EINVAL; + } + + spin_lock(&hba->device_lock); + if (!(list_empty(&hba->hba_dev_list))) { + printk(KERN_ERR "Unable to set hba_mode with active devices\n"); + spin_unlock(&hba->device_lock); + return -EINVAL; + } + spin_unlock(&hba->device_lock); + + ret = transport->pmode_enable_hba(hba, mode_flag); + if (ret < 0) + return -EINVAL; + if (ret > 0) + hba->hba_flags |= HBA_FLAGS_PSCSI_MODE; + else if (ret == 0) + hba->hba_flags &= ~HBA_FLAGS_PSCSI_MODE; + + return count; +} + +SE_HBA_ATTR(hba_mode, S_IRUGO | S_IWUSR); + +CONFIGFS_EATTR_OPS(target_core_hba, se_hba, hba_group); + +static struct configfs_attribute *target_core_hba_attrs[] = { + &target_core_hba_hba_info.attr, + &target_core_hba_hba_mode.attr, + NULL, +}; + +static struct configfs_item_operations target_core_hba_item_ops = { + .show_attribute = target_core_hba_attr_show, + .store_attribute = target_core_hba_attr_store, +}; + +static struct config_item_type target_core_hba_cit = { + .ct_item_ops = &target_core_hba_item_ops, + .ct_group_ops = &target_core_hba_group_ops, + .ct_attrs = target_core_hba_attrs, + .ct_owner = THIS_MODULE, +}; + +static struct config_group *target_core_call_addhbatotarget( + struct config_group *group, + const char *name) +{ + char *se_plugin_str, *str, *str2; + struct se_hba *hba; + char buf[TARGET_CORE_NAME_MAX_LEN]; + unsigned long plugin_dep_id = 0; + int ret; + + memset(buf, 0, TARGET_CORE_NAME_MAX_LEN); + if (strlen(name) > TARGET_CORE_NAME_MAX_LEN) { + printk(KERN_ERR "Passed *name strlen(): %d exceeds" + " TARGET_CORE_NAME_MAX_LEN: %d\n", (int)strlen(name), + TARGET_CORE_NAME_MAX_LEN); + return ERR_PTR(-ENAMETOOLONG); + } + snprintf(buf, TARGET_CORE_NAME_MAX_LEN, "%s", name); + + str = strstr(buf, "_"); + if (!(str)) { + printk(KERN_ERR "Unable to locate \"_\" for $SUBSYSTEM_PLUGIN_$HOST_ID\n"); + return ERR_PTR(-EINVAL); + } + se_plugin_str = buf; + /* + * Special case for subsystem plugins that have "_" in their names. + * Namely rd_direct and rd_mcp.. + */ + str2 = strstr(str+1, "_"); + if ((str2)) { + *str2 = '\0'; /* Terminate for *se_plugin_str */ + str2++; /* Skip to start of plugin dependent ID */ + str = str2; + } else { + *str = '\0'; /* Terminate for *se_plugin_str */ + str++; /* Skip to start of plugin dependent ID */ + } + + ret = strict_strtoul(str, 0, &plugin_dep_id); + if (ret < 0) { + printk(KERN_ERR "strict_strtoul() returned %d for" + " plugin_dep_id\n", ret); + return ERR_PTR(-EINVAL); + } + /* + * Load up TCM subsystem plugins if they have not already been loaded. + */ + if (transport_subsystem_check_init() < 0) + return ERR_PTR(-EINVAL); + + hba = core_alloc_hba(se_plugin_str, plugin_dep_id, 0); + if (IS_ERR(hba)) + return ERR_CAST(hba); + + config_group_init_type_name(&hba->hba_group, name, + &target_core_hba_cit); + + return &hba->hba_group; +} + +static void target_core_call_delhbafromtarget( + struct config_group *group, + struct config_item *item) +{ + struct se_hba *hba = item_to_hba(item); + + config_item_put(item); + core_delete_hba(hba); +} + +static struct configfs_group_operations target_core_group_ops = { + .make_group = target_core_call_addhbatotarget, + .drop_item = target_core_call_delhbafromtarget, +}; + +static struct config_item_type target_core_cit = { + .ct_item_ops = NULL, + .ct_group_ops = &target_core_group_ops, + .ct_attrs = NULL, + .ct_owner = THIS_MODULE, +}; + +/* Stop functions for struct config_item_type target_core_hba_cit */ + +static int target_core_init_configfs(void) +{ + struct config_group *target_cg, *hba_cg = NULL, *alua_cg = NULL; + struct config_group *lu_gp_cg = NULL; + struct configfs_subsystem *subsys; + struct proc_dir_entry *scsi_target_proc = NULL; + struct t10_alua_lu_gp *lu_gp; + int ret; + + printk(KERN_INFO "TARGET_CORE[0]: Loading Generic Kernel Storage" + " Engine: %s on %s/%s on "UTS_RELEASE"\n", + TARGET_CORE_VERSION, utsname()->sysname, utsname()->machine); + + subsys = target_core_subsystem[0]; + config_group_init(&subsys->su_group); + mutex_init(&subsys->su_mutex); + + INIT_LIST_HEAD(&g_tf_list); + mutex_init(&g_tf_lock); + init_scsi_index_table(); + ret = init_se_global(); + if (ret < 0) + return -1; + /* + * Create $CONFIGFS/target/core default group for HBA <-> Storage Object + * and ALUA Logical Unit Group and Target Port Group infrastructure. + */ + target_cg = &subsys->su_group; + target_cg->default_groups = kzalloc(sizeof(struct config_group) * 2, + GFP_KERNEL); + if (!(target_cg->default_groups)) { + printk(KERN_ERR "Unable to allocate target_cg->default_groups\n"); + goto out_global; + } + + config_group_init_type_name(&se_global->target_core_hbagroup, + "core", &target_core_cit); + target_cg->default_groups[0] = &se_global->target_core_hbagroup; + target_cg->default_groups[1] = NULL; + /* + * Create ALUA infrastructure under /sys/kernel/config/target/core/alua/ + */ + hba_cg = &se_global->target_core_hbagroup; + hba_cg->default_groups = kzalloc(sizeof(struct config_group) * 2, + GFP_KERNEL); + if (!(hba_cg->default_groups)) { + printk(KERN_ERR "Unable to allocate hba_cg->default_groups\n"); + goto out_global; + } + config_group_init_type_name(&se_global->alua_group, + "alua", &target_core_alua_cit); + hba_cg->default_groups[0] = &se_global->alua_group; + hba_cg->default_groups[1] = NULL; + /* + * Add ALUA Logical Unit Group and Target Port Group ConfigFS + * groups under /sys/kernel/config/target/core/alua/ + */ + alua_cg = &se_global->alua_group; + alua_cg->default_groups = kzalloc(sizeof(struct config_group) * 2, + GFP_KERNEL); + if (!(alua_cg->default_groups)) { + printk(KERN_ERR "Unable to allocate alua_cg->default_groups\n"); + goto out_global; + } + + config_group_init_type_name(&se_global->alua_lu_gps_group, + "lu_gps", &target_core_alua_lu_gps_cit); + alua_cg->default_groups[0] = &se_global->alua_lu_gps_group; + alua_cg->default_groups[1] = NULL; + /* + * Add core/alua/lu_gps/default_lu_gp + */ + lu_gp = core_alua_allocate_lu_gp("default_lu_gp", 1); + if (IS_ERR(lu_gp)) + goto out_global; + + lu_gp_cg = &se_global->alua_lu_gps_group; + lu_gp_cg->default_groups = kzalloc(sizeof(struct config_group) * 2, + GFP_KERNEL); + if (!(lu_gp_cg->default_groups)) { + printk(KERN_ERR "Unable to allocate lu_gp_cg->default_groups\n"); + goto out_global; + } + + config_group_init_type_name(&lu_gp->lu_gp_group, "default_lu_gp", + &target_core_alua_lu_gp_cit); + lu_gp_cg->default_groups[0] = &lu_gp->lu_gp_group; + lu_gp_cg->default_groups[1] = NULL; + se_global->default_lu_gp = lu_gp; + /* + * Register the target_core_mod subsystem with configfs. + */ + ret = configfs_register_subsystem(subsys); + if (ret < 0) { + printk(KERN_ERR "Error %d while registering subsystem %s\n", + ret, subsys->su_group.cg_item.ci_namebuf); + goto out_global; + } + printk(KERN_INFO "TARGET_CORE[0]: Initialized ConfigFS Fabric" + " Infrastructure: "TARGET_CORE_CONFIGFS_VERSION" on %s/%s" + " on "UTS_RELEASE"\n", utsname()->sysname, utsname()->machine); + /* + * Register built-in RAMDISK subsystem logic for virtual LUN 0 + */ + ret = rd_module_init(); + if (ret < 0) + goto out; + + if (core_dev_setup_virtual_lun0() < 0) + goto out; + + scsi_target_proc = proc_mkdir("scsi_target", 0); + if (!(scsi_target_proc)) { + printk(KERN_ERR "proc_mkdir(scsi_target, 0) failed\n"); + goto out; + } + ret = init_scsi_target_mib(); + if (ret < 0) + goto out; + + return 0; + +out: + configfs_unregister_subsystem(subsys); + if (scsi_target_proc) + remove_proc_entry("scsi_target", 0); + core_dev_release_virtual_lun0(); + rd_module_exit(); +out_global: + if (se_global->default_lu_gp) { + core_alua_free_lu_gp(se_global->default_lu_gp); + se_global->default_lu_gp = NULL; + } + if (lu_gp_cg) + kfree(lu_gp_cg->default_groups); + if (alua_cg) + kfree(alua_cg->default_groups); + if (hba_cg) + kfree(hba_cg->default_groups); + kfree(target_cg->default_groups); + release_se_global(); + return -1; +} + +static void target_core_exit_configfs(void) +{ + struct configfs_subsystem *subsys; + struct config_group *hba_cg, *alua_cg, *lu_gp_cg; + struct config_item *item; + int i; + + se_global->in_shutdown = 1; + subsys = target_core_subsystem[0]; + + lu_gp_cg = &se_global->alua_lu_gps_group; + for (i = 0; lu_gp_cg->default_groups[i]; i++) { + item = &lu_gp_cg->default_groups[i]->cg_item; + lu_gp_cg->default_groups[i] = NULL; + config_item_put(item); + } + kfree(lu_gp_cg->default_groups); + core_alua_free_lu_gp(se_global->default_lu_gp); + se_global->default_lu_gp = NULL; + + alua_cg = &se_global->alua_group; + for (i = 0; alua_cg->default_groups[i]; i++) { + item = &alua_cg->default_groups[i]->cg_item; + alua_cg->default_groups[i] = NULL; + config_item_put(item); + } + kfree(alua_cg->default_groups); + + hba_cg = &se_global->target_core_hbagroup; + for (i = 0; hba_cg->default_groups[i]; i++) { + item = &hba_cg->default_groups[i]->cg_item; + hba_cg->default_groups[i] = NULL; + config_item_put(item); + } + kfree(hba_cg->default_groups); + + for (i = 0; subsys->su_group.default_groups[i]; i++) { + item = &subsys->su_group.default_groups[i]->cg_item; + subsys->su_group.default_groups[i] = NULL; + config_item_put(item); + } + kfree(subsys->su_group.default_groups); + + configfs_unregister_subsystem(subsys); + printk(KERN_INFO "TARGET_CORE[0]: Released ConfigFS Fabric" + " Infrastructure\n"); + + remove_scsi_target_mib(); + remove_proc_entry("scsi_target", 0); + core_dev_release_virtual_lun0(); + rd_module_exit(); + release_se_global(); + + return; +} + +MODULE_DESCRIPTION("Target_Core_Mod/ConfigFS"); +MODULE_AUTHOR("nab@Linux-iSCSI.org"); +MODULE_LICENSE("GPL"); + +module_init(target_core_init_configfs); +module_exit(target_core_exit_configfs); diff --git a/drivers/target/target_core_device.c b/drivers/target/target_core_device.c new file mode 100644 index 000000000000..317ce58d426d --- /dev/null +++ b/drivers/target/target_core_device.c @@ -0,0 +1,1694 @@ +/******************************************************************************* + * Filename: target_core_device.c (based on iscsi_target_device.c) + * + * This file contains the iSCSI Virtual Device and Disk Transport + * agnostic related functions. + * + * Copyright (c) 2003, 2004, 2005 PyX Technologies, Inc. + * Copyright (c) 2005-2006 SBE, Inc. All Rights Reserved. + * Copyright (c) 2007-2010 Rising Tide Systems + * Copyright (c) 2008-2010 Linux-iSCSI.org + * + * Nicholas A. Bellinger + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. + * + ******************************************************************************/ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include + +#include "target_core_alua.h" +#include "target_core_hba.h" +#include "target_core_pr.h" +#include "target_core_ua.h" + +static void se_dev_start(struct se_device *dev); +static void se_dev_stop(struct se_device *dev); + +int transport_get_lun_for_cmd( + struct se_cmd *se_cmd, + unsigned char *cdb, + u32 unpacked_lun) +{ + struct se_dev_entry *deve; + struct se_lun *se_lun = NULL; + struct se_session *se_sess = SE_SESS(se_cmd); + unsigned long flags; + int read_only = 0; + + spin_lock_irq(&SE_NODE_ACL(se_sess)->device_list_lock); + deve = se_cmd->se_deve = + &SE_NODE_ACL(se_sess)->device_list[unpacked_lun]; + if (deve->lun_flags & TRANSPORT_LUNFLAGS_INITIATOR_ACCESS) { + if (se_cmd) { + deve->total_cmds++; + deve->total_bytes += se_cmd->data_length; + + if (se_cmd->data_direction == DMA_TO_DEVICE) { + if (deve->lun_flags & + TRANSPORT_LUNFLAGS_READ_ONLY) { + read_only = 1; + goto out; + } + deve->write_bytes += se_cmd->data_length; + } else if (se_cmd->data_direction == + DMA_FROM_DEVICE) { + deve->read_bytes += se_cmd->data_length; + } + } + deve->deve_cmds++; + + se_lun = se_cmd->se_lun = deve->se_lun; + se_cmd->pr_res_key = deve->pr_res_key; + se_cmd->orig_fe_lun = unpacked_lun; + se_cmd->se_orig_obj_ptr = SE_LUN(se_cmd)->lun_se_dev; + se_cmd->se_cmd_flags |= SCF_SE_LUN_CMD; + } +out: + spin_unlock_irq(&SE_NODE_ACL(se_sess)->device_list_lock); + + if (!se_lun) { + if (read_only) { + se_cmd->scsi_sense_reason = TCM_WRITE_PROTECTED; + se_cmd->se_cmd_flags |= SCF_SCSI_CDB_EXCEPTION; + printk("TARGET_CORE[%s]: Detected WRITE_PROTECTED LUN" + " Access for 0x%08x\n", + CMD_TFO(se_cmd)->get_fabric_name(), + unpacked_lun); + return -1; + } else { + /* + * Use the se_portal_group->tpg_virt_lun0 to allow for + * REPORT_LUNS, et al to be returned when no active + * MappedLUN=0 exists for this Initiator Port. + */ + if (unpacked_lun != 0) { + se_cmd->scsi_sense_reason = TCM_NON_EXISTENT_LUN; + se_cmd->se_cmd_flags |= SCF_SCSI_CDB_EXCEPTION; + printk("TARGET_CORE[%s]: Detected NON_EXISTENT_LUN" + " Access for 0x%08x\n", + CMD_TFO(se_cmd)->get_fabric_name(), + unpacked_lun); + return -1; + } + /* + * Force WRITE PROTECT for virtual LUN 0 + */ + if ((se_cmd->data_direction != DMA_FROM_DEVICE) && + (se_cmd->data_direction != DMA_NONE)) { + se_cmd->scsi_sense_reason = TCM_WRITE_PROTECTED; + se_cmd->se_cmd_flags |= SCF_SCSI_CDB_EXCEPTION; + return -1; + } +#if 0 + printk("TARGET_CORE[%s]: Using virtual LUN0! :-)\n", + CMD_TFO(se_cmd)->get_fabric_name()); +#endif + se_lun = se_cmd->se_lun = &se_sess->se_tpg->tpg_virt_lun0; + se_cmd->orig_fe_lun = 0; + se_cmd->se_orig_obj_ptr = SE_LUN(se_cmd)->lun_se_dev; + se_cmd->se_cmd_flags |= SCF_SE_LUN_CMD; + } + } + /* + * Determine if the struct se_lun is online. + */ +/* #warning FIXME: Check for LUN_RESET + UNIT Attention */ + if (se_dev_check_online(se_lun->lun_se_dev) != 0) { + se_cmd->scsi_sense_reason = TCM_NON_EXISTENT_LUN; + se_cmd->se_cmd_flags |= SCF_SCSI_CDB_EXCEPTION; + return -1; + } + + { + struct se_device *dev = se_lun->lun_se_dev; + spin_lock(&dev->stats_lock); + dev->num_cmds++; + if (se_cmd->data_direction == DMA_TO_DEVICE) + dev->write_bytes += se_cmd->data_length; + else if (se_cmd->data_direction == DMA_FROM_DEVICE) + dev->read_bytes += se_cmd->data_length; + spin_unlock(&dev->stats_lock); + } + + /* + * Add the iscsi_cmd_t to the struct se_lun's cmd list. This list is used + * for tracking state of struct se_cmds during LUN shutdown events. + */ + spin_lock_irqsave(&se_lun->lun_cmd_lock, flags); + list_add_tail(&se_cmd->se_lun_list, &se_lun->lun_cmd_list); + atomic_set(&T_TASK(se_cmd)->transport_lun_active, 1); +#if 0 + printk(KERN_INFO "Adding ITT: 0x%08x to LUN LIST[%d]\n", + CMD_TFO(se_cmd)->get_task_tag(se_cmd), se_lun->unpacked_lun); +#endif + spin_unlock_irqrestore(&se_lun->lun_cmd_lock, flags); + + return 0; +} +EXPORT_SYMBOL(transport_get_lun_for_cmd); + +int transport_get_lun_for_tmr( + struct se_cmd *se_cmd, + u32 unpacked_lun) +{ + struct se_device *dev = NULL; + struct se_dev_entry *deve; + struct se_lun *se_lun = NULL; + struct se_session *se_sess = SE_SESS(se_cmd); + struct se_tmr_req *se_tmr = se_cmd->se_tmr_req; + + spin_lock_irq(&SE_NODE_ACL(se_sess)->device_list_lock); + deve = se_cmd->se_deve = + &SE_NODE_ACL(se_sess)->device_list[unpacked_lun]; + if (deve->lun_flags & TRANSPORT_LUNFLAGS_INITIATOR_ACCESS) { + se_lun = se_cmd->se_lun = se_tmr->tmr_lun = deve->se_lun; + dev = se_tmr->tmr_dev = se_lun->lun_se_dev; + se_cmd->pr_res_key = deve->pr_res_key; + se_cmd->orig_fe_lun = unpacked_lun; + se_cmd->se_orig_obj_ptr = SE_LUN(se_cmd)->lun_se_dev; +/* se_cmd->se_cmd_flags |= SCF_SE_LUN_CMD; */ + } + spin_unlock_irq(&SE_NODE_ACL(se_sess)->device_list_lock); + + if (!se_lun) { + printk(KERN_INFO "TARGET_CORE[%s]: Detected NON_EXISTENT_LUN" + " Access for 0x%08x\n", + CMD_TFO(se_cmd)->get_fabric_name(), + unpacked_lun); + se_cmd->se_cmd_flags |= SCF_SCSI_CDB_EXCEPTION; + return -1; + } + /* + * Determine if the struct se_lun is online. + */ +/* #warning FIXME: Check for LUN_RESET + UNIT Attention */ + if (se_dev_check_online(se_lun->lun_se_dev) != 0) { + se_cmd->se_cmd_flags |= SCF_SCSI_CDB_EXCEPTION; + return -1; + } + + spin_lock(&dev->se_tmr_lock); + list_add_tail(&se_tmr->tmr_list, &dev->dev_tmr_list); + spin_unlock(&dev->se_tmr_lock); + + return 0; +} +EXPORT_SYMBOL(transport_get_lun_for_tmr); + +/* + * This function is called from core_scsi3_emulate_pro_register_and_move() + * and core_scsi3_decode_spec_i_port(), and will increment &deve->pr_ref_count + * when a matching rtpi is found. + */ +struct se_dev_entry *core_get_se_deve_from_rtpi( + struct se_node_acl *nacl, + u16 rtpi) +{ + struct se_dev_entry *deve; + struct se_lun *lun; + struct se_port *port; + struct se_portal_group *tpg = nacl->se_tpg; + u32 i; + + spin_lock_irq(&nacl->device_list_lock); + for (i = 0; i < TRANSPORT_MAX_LUNS_PER_TPG; i++) { + deve = &nacl->device_list[i]; + + if (!(deve->lun_flags & TRANSPORT_LUNFLAGS_INITIATOR_ACCESS)) + continue; + + lun = deve->se_lun; + if (!(lun)) { + printk(KERN_ERR "%s device entries device pointer is" + " NULL, but Initiator has access.\n", + TPG_TFO(tpg)->get_fabric_name()); + continue; + } + port = lun->lun_sep; + if (!(port)) { + printk(KERN_ERR "%s device entries device pointer is" + " NULL, but Initiator has access.\n", + TPG_TFO(tpg)->get_fabric_name()); + continue; + } + if (port->sep_rtpi != rtpi) + continue; + + atomic_inc(&deve->pr_ref_count); + smp_mb__after_atomic_inc(); + spin_unlock_irq(&nacl->device_list_lock); + + return deve; + } + spin_unlock_irq(&nacl->device_list_lock); + + return NULL; +} + +int core_free_device_list_for_node( + struct se_node_acl *nacl, + struct se_portal_group *tpg) +{ + struct se_dev_entry *deve; + struct se_lun *lun; + u32 i; + + if (!nacl->device_list) + return 0; + + spin_lock_irq(&nacl->device_list_lock); + for (i = 0; i < TRANSPORT_MAX_LUNS_PER_TPG; i++) { + deve = &nacl->device_list[i]; + + if (!(deve->lun_flags & TRANSPORT_LUNFLAGS_INITIATOR_ACCESS)) + continue; + + if (!deve->se_lun) { + printk(KERN_ERR "%s device entries device pointer is" + " NULL, but Initiator has access.\n", + TPG_TFO(tpg)->get_fabric_name()); + continue; + } + lun = deve->se_lun; + + spin_unlock_irq(&nacl->device_list_lock); + core_update_device_list_for_node(lun, NULL, deve->mapped_lun, + TRANSPORT_LUNFLAGS_NO_ACCESS, nacl, tpg, 0); + spin_lock_irq(&nacl->device_list_lock); + } + spin_unlock_irq(&nacl->device_list_lock); + + kfree(nacl->device_list); + nacl->device_list = NULL; + + return 0; +} + +void core_dec_lacl_count(struct se_node_acl *se_nacl, struct se_cmd *se_cmd) +{ + struct se_dev_entry *deve; + + spin_lock_irq(&se_nacl->device_list_lock); + deve = &se_nacl->device_list[se_cmd->orig_fe_lun]; + deve->deve_cmds--; + spin_unlock_irq(&se_nacl->device_list_lock); + + return; +} + +void core_update_device_list_access( + u32 mapped_lun, + u32 lun_access, + struct se_node_acl *nacl) +{ + struct se_dev_entry *deve; + + spin_lock_irq(&nacl->device_list_lock); + deve = &nacl->device_list[mapped_lun]; + if (lun_access & TRANSPORT_LUNFLAGS_READ_WRITE) { + deve->lun_flags &= ~TRANSPORT_LUNFLAGS_READ_ONLY; + deve->lun_flags |= TRANSPORT_LUNFLAGS_READ_WRITE; + } else { + deve->lun_flags &= ~TRANSPORT_LUNFLAGS_READ_WRITE; + deve->lun_flags |= TRANSPORT_LUNFLAGS_READ_ONLY; + } + spin_unlock_irq(&nacl->device_list_lock); + + return; +} + +/* core_update_device_list_for_node(): + * + * + */ +int core_update_device_list_for_node( + struct se_lun *lun, + struct se_lun_acl *lun_acl, + u32 mapped_lun, + u32 lun_access, + struct se_node_acl *nacl, + struct se_portal_group *tpg, + int enable) +{ + struct se_port *port = lun->lun_sep; + struct se_dev_entry *deve = &nacl->device_list[mapped_lun]; + int trans = 0; + /* + * If the MappedLUN entry is being disabled, the entry in + * port->sep_alua_list must be removed now before clearing the + * struct se_dev_entry pointers below as logic in + * core_alua_do_transition_tg_pt() depends on these being present. + */ + if (!(enable)) { + /* + * deve->se_lun_acl will be NULL for demo-mode created LUNs + * that have not been explictly concerted to MappedLUNs -> + * struct se_lun_acl. + */ + if (!(deve->se_lun_acl)) + return 0; + + spin_lock_bh(&port->sep_alua_lock); + list_del(&deve->alua_port_list); + spin_unlock_bh(&port->sep_alua_lock); + } + + spin_lock_irq(&nacl->device_list_lock); + if (enable) { + /* + * Check if the call is handling demo mode -> explict LUN ACL + * transition. This transition must be for the same struct se_lun + * + mapped_lun that was setup in demo mode.. + */ + if (deve->lun_flags & TRANSPORT_LUNFLAGS_INITIATOR_ACCESS) { + if (deve->se_lun_acl != NULL) { + printk(KERN_ERR "struct se_dev_entry->se_lun_acl" + " already set for demo mode -> explict" + " LUN ACL transition\n"); + return -1; + } + if (deve->se_lun != lun) { + printk(KERN_ERR "struct se_dev_entry->se_lun does" + " match passed struct se_lun for demo mode" + " -> explict LUN ACL transition\n"); + return -1; + } + deve->se_lun_acl = lun_acl; + trans = 1; + } else { + deve->se_lun = lun; + deve->se_lun_acl = lun_acl; + deve->mapped_lun = mapped_lun; + deve->lun_flags |= TRANSPORT_LUNFLAGS_INITIATOR_ACCESS; + } + + if (lun_access & TRANSPORT_LUNFLAGS_READ_WRITE) { + deve->lun_flags &= ~TRANSPORT_LUNFLAGS_READ_ONLY; + deve->lun_flags |= TRANSPORT_LUNFLAGS_READ_WRITE; + } else { + deve->lun_flags &= ~TRANSPORT_LUNFLAGS_READ_WRITE; + deve->lun_flags |= TRANSPORT_LUNFLAGS_READ_ONLY; + } + + if (trans) { + spin_unlock_irq(&nacl->device_list_lock); + return 0; + } + deve->creation_time = get_jiffies_64(); + deve->attach_count++; + spin_unlock_irq(&nacl->device_list_lock); + + spin_lock_bh(&port->sep_alua_lock); + list_add_tail(&deve->alua_port_list, &port->sep_alua_list); + spin_unlock_bh(&port->sep_alua_lock); + + return 0; + } + /* + * Wait for any in process SPEC_I_PT=1 or REGISTER_AND_MOVE + * PR operation to complete. + */ + spin_unlock_irq(&nacl->device_list_lock); + while (atomic_read(&deve->pr_ref_count) != 0) + cpu_relax(); + spin_lock_irq(&nacl->device_list_lock); + /* + * Disable struct se_dev_entry LUN ACL mapping + */ + core_scsi3_ua_release_all(deve); + deve->se_lun = NULL; + deve->se_lun_acl = NULL; + deve->lun_flags = 0; + deve->creation_time = 0; + deve->attach_count--; + spin_unlock_irq(&nacl->device_list_lock); + + core_scsi3_free_pr_reg_from_nacl(lun->lun_se_dev, nacl); + return 0; +} + +/* core_clear_lun_from_tpg(): + * + * + */ +void core_clear_lun_from_tpg(struct se_lun *lun, struct se_portal_group *tpg) +{ + struct se_node_acl *nacl; + struct se_dev_entry *deve; + u32 i; + + spin_lock_bh(&tpg->acl_node_lock); + list_for_each_entry(nacl, &tpg->acl_node_list, acl_list) { + spin_unlock_bh(&tpg->acl_node_lock); + + spin_lock_irq(&nacl->device_list_lock); + for (i = 0; i < TRANSPORT_MAX_LUNS_PER_TPG; i++) { + deve = &nacl->device_list[i]; + if (lun != deve->se_lun) + continue; + spin_unlock_irq(&nacl->device_list_lock); + + core_update_device_list_for_node(lun, NULL, + deve->mapped_lun, TRANSPORT_LUNFLAGS_NO_ACCESS, + nacl, tpg, 0); + + spin_lock_irq(&nacl->device_list_lock); + } + spin_unlock_irq(&nacl->device_list_lock); + + spin_lock_bh(&tpg->acl_node_lock); + } + spin_unlock_bh(&tpg->acl_node_lock); + + return; +} + +static struct se_port *core_alloc_port(struct se_device *dev) +{ + struct se_port *port, *port_tmp; + + port = kzalloc(sizeof(struct se_port), GFP_KERNEL); + if (!(port)) { + printk(KERN_ERR "Unable to allocate struct se_port\n"); + return NULL; + } + INIT_LIST_HEAD(&port->sep_alua_list); + INIT_LIST_HEAD(&port->sep_list); + atomic_set(&port->sep_tg_pt_secondary_offline, 0); + spin_lock_init(&port->sep_alua_lock); + mutex_init(&port->sep_tg_pt_md_mutex); + + spin_lock(&dev->se_port_lock); + if (dev->dev_port_count == 0x0000ffff) { + printk(KERN_WARNING "Reached dev->dev_port_count ==" + " 0x0000ffff\n"); + spin_unlock(&dev->se_port_lock); + return NULL; + } +again: + /* + * Allocate the next RELATIVE TARGET PORT IDENTIFER for this struct se_device + * Here is the table from spc4r17 section 7.7.3.8. + * + * Table 473 -- RELATIVE TARGET PORT IDENTIFIER field + * + * Code Description + * 0h Reserved + * 1h Relative port 1, historically known as port A + * 2h Relative port 2, historically known as port B + * 3h to FFFFh Relative port 3 through 65 535 + */ + port->sep_rtpi = dev->dev_rpti_counter++; + if (!(port->sep_rtpi)) + goto again; + + list_for_each_entry(port_tmp, &dev->dev_sep_list, sep_list) { + /* + * Make sure RELATIVE TARGET PORT IDENTIFER is unique + * for 16-bit wrap.. + */ + if (port->sep_rtpi == port_tmp->sep_rtpi) + goto again; + } + spin_unlock(&dev->se_port_lock); + + return port; +} + +static void core_export_port( + struct se_device *dev, + struct se_portal_group *tpg, + struct se_port *port, + struct se_lun *lun) +{ + struct se_subsystem_dev *su_dev = SU_DEV(dev); + struct t10_alua_tg_pt_gp_member *tg_pt_gp_mem = NULL; + + spin_lock(&dev->se_port_lock); + spin_lock(&lun->lun_sep_lock); + port->sep_tpg = tpg; + port->sep_lun = lun; + lun->lun_sep = port; + spin_unlock(&lun->lun_sep_lock); + + list_add_tail(&port->sep_list, &dev->dev_sep_list); + spin_unlock(&dev->se_port_lock); + + if (T10_ALUA(su_dev)->alua_type == SPC3_ALUA_EMULATED) { + tg_pt_gp_mem = core_alua_allocate_tg_pt_gp_mem(port); + if (IS_ERR(tg_pt_gp_mem) || !tg_pt_gp_mem) { + printk(KERN_ERR "Unable to allocate t10_alua_tg_pt" + "_gp_member_t\n"); + return; + } + spin_lock(&tg_pt_gp_mem->tg_pt_gp_mem_lock); + __core_alua_attach_tg_pt_gp_mem(tg_pt_gp_mem, + T10_ALUA(su_dev)->default_tg_pt_gp); + spin_unlock(&tg_pt_gp_mem->tg_pt_gp_mem_lock); + printk(KERN_INFO "%s/%s: Adding to default ALUA Target Port" + " Group: alua/default_tg_pt_gp\n", + TRANSPORT(dev)->name, TPG_TFO(tpg)->get_fabric_name()); + } + + dev->dev_port_count++; + port->sep_index = port->sep_rtpi; /* RELATIVE TARGET PORT IDENTIFER */ +} + +/* + * Called with struct se_device->se_port_lock spinlock held. + */ +static void core_release_port(struct se_device *dev, struct se_port *port) +{ + /* + * Wait for any port reference for PR ALL_TG_PT=1 operation + * to complete in __core_scsi3_alloc_registration() + */ + spin_unlock(&dev->se_port_lock); + if (atomic_read(&port->sep_tg_pt_ref_cnt)) + cpu_relax(); + spin_lock(&dev->se_port_lock); + + core_alua_free_tg_pt_gp_mem(port); + + list_del(&port->sep_list); + dev->dev_port_count--; + kfree(port); + + return; +} + +int core_dev_export( + struct se_device *dev, + struct se_portal_group *tpg, + struct se_lun *lun) +{ + struct se_port *port; + + port = core_alloc_port(dev); + if (!(port)) + return -1; + + lun->lun_se_dev = dev; + se_dev_start(dev); + + atomic_inc(&dev->dev_export_obj.obj_access_count); + core_export_port(dev, tpg, port, lun); + return 0; +} + +void core_dev_unexport( + struct se_device *dev, + struct se_portal_group *tpg, + struct se_lun *lun) +{ + struct se_port *port = lun->lun_sep; + + spin_lock(&lun->lun_sep_lock); + if (lun->lun_se_dev == NULL) { + spin_unlock(&lun->lun_sep_lock); + return; + } + spin_unlock(&lun->lun_sep_lock); + + spin_lock(&dev->se_port_lock); + atomic_dec(&dev->dev_export_obj.obj_access_count); + core_release_port(dev, port); + spin_unlock(&dev->se_port_lock); + + se_dev_stop(dev); + lun->lun_se_dev = NULL; +} + +int transport_core_report_lun_response(struct se_cmd *se_cmd) +{ + struct se_dev_entry *deve; + struct se_lun *se_lun; + struct se_session *se_sess = SE_SESS(se_cmd); + struct se_task *se_task; + unsigned char *buf = (unsigned char *)T_TASK(se_cmd)->t_task_buf; + u32 cdb_offset = 0, lun_count = 0, offset = 8; + u64 i, lun; + + list_for_each_entry(se_task, &T_TASK(se_cmd)->t_task_list, t_list) + break; + + if (!(se_task)) { + printk(KERN_ERR "Unable to locate struct se_task for struct se_cmd\n"); + return PYX_TRANSPORT_LU_COMM_FAILURE; + } + + /* + * If no struct se_session pointer is present, this struct se_cmd is + * coming via a target_core_mod PASSTHROUGH op, and not through + * a $FABRIC_MOD. In that case, report LUN=0 only. + */ + if (!(se_sess)) { + lun = 0; + buf[offset++] = ((lun >> 56) & 0xff); + buf[offset++] = ((lun >> 48) & 0xff); + buf[offset++] = ((lun >> 40) & 0xff); + buf[offset++] = ((lun >> 32) & 0xff); + buf[offset++] = ((lun >> 24) & 0xff); + buf[offset++] = ((lun >> 16) & 0xff); + buf[offset++] = ((lun >> 8) & 0xff); + buf[offset++] = (lun & 0xff); + lun_count = 1; + goto done; + } + + spin_lock_irq(&SE_NODE_ACL(se_sess)->device_list_lock); + for (i = 0; i < TRANSPORT_MAX_LUNS_PER_TPG; i++) { + deve = &SE_NODE_ACL(se_sess)->device_list[i]; + if (!(deve->lun_flags & TRANSPORT_LUNFLAGS_INITIATOR_ACCESS)) + continue; + se_lun = deve->se_lun; + /* + * We determine the correct LUN LIST LENGTH even once we + * have reached the initial allocation length. + * See SPC2-R20 7.19. + */ + lun_count++; + if ((cdb_offset + 8) >= se_cmd->data_length) + continue; + + lun = cpu_to_be64(CMD_TFO(se_cmd)->pack_lun(deve->mapped_lun)); + buf[offset++] = ((lun >> 56) & 0xff); + buf[offset++] = ((lun >> 48) & 0xff); + buf[offset++] = ((lun >> 40) & 0xff); + buf[offset++] = ((lun >> 32) & 0xff); + buf[offset++] = ((lun >> 24) & 0xff); + buf[offset++] = ((lun >> 16) & 0xff); + buf[offset++] = ((lun >> 8) & 0xff); + buf[offset++] = (lun & 0xff); + cdb_offset += 8; + } + spin_unlock_irq(&SE_NODE_ACL(se_sess)->device_list_lock); + + /* + * See SPC3 r07, page 159. + */ +done: + lun_count *= 8; + buf[0] = ((lun_count >> 24) & 0xff); + buf[1] = ((lun_count >> 16) & 0xff); + buf[2] = ((lun_count >> 8) & 0xff); + buf[3] = (lun_count & 0xff); + + return PYX_TRANSPORT_SENT_TO_TRANSPORT; +} + +/* se_release_device_for_hba(): + * + * + */ +void se_release_device_for_hba(struct se_device *dev) +{ + struct se_hba *hba = dev->se_hba; + + if ((dev->dev_status & TRANSPORT_DEVICE_ACTIVATED) || + (dev->dev_status & TRANSPORT_DEVICE_DEACTIVATED) || + (dev->dev_status & TRANSPORT_DEVICE_SHUTDOWN) || + (dev->dev_status & TRANSPORT_DEVICE_OFFLINE_ACTIVATED) || + (dev->dev_status & TRANSPORT_DEVICE_OFFLINE_DEACTIVATED)) + se_dev_stop(dev); + + if (dev->dev_ptr) { + kthread_stop(dev->process_thread); + if (dev->transport->free_device) + dev->transport->free_device(dev->dev_ptr); + } + + spin_lock(&hba->device_lock); + list_del(&dev->dev_list); + hba->dev_count--; + spin_unlock(&hba->device_lock); + + core_scsi3_free_all_registrations(dev); + se_release_vpd_for_dev(dev); + + kfree(dev->dev_status_queue_obj); + kfree(dev->dev_queue_obj); + kfree(dev); + + return; +} + +void se_release_vpd_for_dev(struct se_device *dev) +{ + struct t10_vpd *vpd, *vpd_tmp; + + spin_lock(&DEV_T10_WWN(dev)->t10_vpd_lock); + list_for_each_entry_safe(vpd, vpd_tmp, + &DEV_T10_WWN(dev)->t10_vpd_list, vpd_list) { + list_del(&vpd->vpd_list); + kfree(vpd); + } + spin_unlock(&DEV_T10_WWN(dev)->t10_vpd_lock); + + return; +} + +/* + * Called with struct se_hba->device_lock held. + */ +void se_clear_dev_ports(struct se_device *dev) +{ + struct se_hba *hba = dev->se_hba; + struct se_lun *lun; + struct se_portal_group *tpg; + struct se_port *sep, *sep_tmp; + + spin_lock(&dev->se_port_lock); + list_for_each_entry_safe(sep, sep_tmp, &dev->dev_sep_list, sep_list) { + spin_unlock(&dev->se_port_lock); + spin_unlock(&hba->device_lock); + + lun = sep->sep_lun; + tpg = sep->sep_tpg; + spin_lock(&lun->lun_sep_lock); + if (lun->lun_se_dev == NULL) { + spin_unlock(&lun->lun_sep_lock); + continue; + } + spin_unlock(&lun->lun_sep_lock); + + core_dev_del_lun(tpg, lun->unpacked_lun); + + spin_lock(&hba->device_lock); + spin_lock(&dev->se_port_lock); + } + spin_unlock(&dev->se_port_lock); + + return; +} + +/* se_free_virtual_device(): + * + * Used for IBLOCK, RAMDISK, and FILEIO Transport Drivers. + */ +int se_free_virtual_device(struct se_device *dev, struct se_hba *hba) +{ + spin_lock(&hba->device_lock); + se_clear_dev_ports(dev); + spin_unlock(&hba->device_lock); + + core_alua_free_lu_gp_mem(dev); + se_release_device_for_hba(dev); + + return 0; +} + +static void se_dev_start(struct se_device *dev) +{ + struct se_hba *hba = dev->se_hba; + + spin_lock(&hba->device_lock); + atomic_inc(&dev->dev_obj.obj_access_count); + if (atomic_read(&dev->dev_obj.obj_access_count) == 1) { + if (dev->dev_status & TRANSPORT_DEVICE_DEACTIVATED) { + dev->dev_status &= ~TRANSPORT_DEVICE_DEACTIVATED; + dev->dev_status |= TRANSPORT_DEVICE_ACTIVATED; + } else if (dev->dev_status & + TRANSPORT_DEVICE_OFFLINE_DEACTIVATED) { + dev->dev_status &= + ~TRANSPORT_DEVICE_OFFLINE_DEACTIVATED; + dev->dev_status |= TRANSPORT_DEVICE_OFFLINE_ACTIVATED; + } + } + spin_unlock(&hba->device_lock); +} + +static void se_dev_stop(struct se_device *dev) +{ + struct se_hba *hba = dev->se_hba; + + spin_lock(&hba->device_lock); + atomic_dec(&dev->dev_obj.obj_access_count); + if (atomic_read(&dev->dev_obj.obj_access_count) == 0) { + if (dev->dev_status & TRANSPORT_DEVICE_ACTIVATED) { + dev->dev_status &= ~TRANSPORT_DEVICE_ACTIVATED; + dev->dev_status |= TRANSPORT_DEVICE_DEACTIVATED; + } else if (dev->dev_status & + TRANSPORT_DEVICE_OFFLINE_ACTIVATED) { + dev->dev_status &= ~TRANSPORT_DEVICE_OFFLINE_ACTIVATED; + dev->dev_status |= TRANSPORT_DEVICE_OFFLINE_DEACTIVATED; + } + } + spin_unlock(&hba->device_lock); + + while (atomic_read(&hba->dev_mib_access_count)) + cpu_relax(); +} + +int se_dev_check_online(struct se_device *dev) +{ + int ret; + + spin_lock_irq(&dev->dev_status_lock); + ret = ((dev->dev_status & TRANSPORT_DEVICE_ACTIVATED) || + (dev->dev_status & TRANSPORT_DEVICE_DEACTIVATED)) ? 0 : 1; + spin_unlock_irq(&dev->dev_status_lock); + + return ret; +} + +int se_dev_check_shutdown(struct se_device *dev) +{ + int ret; + + spin_lock_irq(&dev->dev_status_lock); + ret = (dev->dev_status & TRANSPORT_DEVICE_SHUTDOWN); + spin_unlock_irq(&dev->dev_status_lock); + + return ret; +} + +void se_dev_set_default_attribs( + struct se_device *dev, + struct se_dev_limits *dev_limits) +{ + struct queue_limits *limits = &dev_limits->limits; + + DEV_ATTRIB(dev)->emulate_dpo = DA_EMULATE_DPO; + DEV_ATTRIB(dev)->emulate_fua_write = DA_EMULATE_FUA_WRITE; + DEV_ATTRIB(dev)->emulate_fua_read = DA_EMULATE_FUA_READ; + DEV_ATTRIB(dev)->emulate_write_cache = DA_EMULATE_WRITE_CACHE; + DEV_ATTRIB(dev)->emulate_ua_intlck_ctrl = DA_EMULATE_UA_INTLLCK_CTRL; + DEV_ATTRIB(dev)->emulate_tas = DA_EMULATE_TAS; + DEV_ATTRIB(dev)->emulate_tpu = DA_EMULATE_TPU; + DEV_ATTRIB(dev)->emulate_tpws = DA_EMULATE_TPWS; + DEV_ATTRIB(dev)->emulate_reservations = DA_EMULATE_RESERVATIONS; + DEV_ATTRIB(dev)->emulate_alua = DA_EMULATE_ALUA; + DEV_ATTRIB(dev)->enforce_pr_isids = DA_ENFORCE_PR_ISIDS; + /* + * The TPU=1 and TPWS=1 settings will be set in TCM/IBLOCK + * iblock_create_virtdevice() from struct queue_limits values + * if blk_queue_discard()==1 + */ + DEV_ATTRIB(dev)->max_unmap_lba_count = DA_MAX_UNMAP_LBA_COUNT; + DEV_ATTRIB(dev)->max_unmap_block_desc_count = + DA_MAX_UNMAP_BLOCK_DESC_COUNT; + DEV_ATTRIB(dev)->unmap_granularity = DA_UNMAP_GRANULARITY_DEFAULT; + DEV_ATTRIB(dev)->unmap_granularity_alignment = + DA_UNMAP_GRANULARITY_ALIGNMENT_DEFAULT; + /* + * block_size is based on subsystem plugin dependent requirements. + */ + DEV_ATTRIB(dev)->hw_block_size = limits->logical_block_size; + DEV_ATTRIB(dev)->block_size = limits->logical_block_size; + /* + * max_sectors is based on subsystem plugin dependent requirements. + */ + DEV_ATTRIB(dev)->hw_max_sectors = limits->max_hw_sectors; + DEV_ATTRIB(dev)->max_sectors = limits->max_sectors; + /* + * Set optimal_sectors from max_sectors, which can be lowered via + * configfs. + */ + DEV_ATTRIB(dev)->optimal_sectors = limits->max_sectors; + /* + * queue_depth is based on subsystem plugin dependent requirements. + */ + DEV_ATTRIB(dev)->hw_queue_depth = dev_limits->hw_queue_depth; + DEV_ATTRIB(dev)->queue_depth = dev_limits->queue_depth; +} + +int se_dev_set_task_timeout(struct se_device *dev, u32 task_timeout) +{ + if (task_timeout > DA_TASK_TIMEOUT_MAX) { + printk(KERN_ERR "dev[%p]: Passed task_timeout: %u larger then" + " DA_TASK_TIMEOUT_MAX\n", dev, task_timeout); + return -1; + } else { + DEV_ATTRIB(dev)->task_timeout = task_timeout; + printk(KERN_INFO "dev[%p]: Set SE Device task_timeout: %u\n", + dev, task_timeout); + } + + return 0; +} + +int se_dev_set_max_unmap_lba_count( + struct se_device *dev, + u32 max_unmap_lba_count) +{ + DEV_ATTRIB(dev)->max_unmap_lba_count = max_unmap_lba_count; + printk(KERN_INFO "dev[%p]: Set max_unmap_lba_count: %u\n", + dev, DEV_ATTRIB(dev)->max_unmap_lba_count); + return 0; +} + +int se_dev_set_max_unmap_block_desc_count( + struct se_device *dev, + u32 max_unmap_block_desc_count) +{ + DEV_ATTRIB(dev)->max_unmap_block_desc_count = max_unmap_block_desc_count; + printk(KERN_INFO "dev[%p]: Set max_unmap_block_desc_count: %u\n", + dev, DEV_ATTRIB(dev)->max_unmap_block_desc_count); + return 0; +} + +int se_dev_set_unmap_granularity( + struct se_device *dev, + u32 unmap_granularity) +{ + DEV_ATTRIB(dev)->unmap_granularity = unmap_granularity; + printk(KERN_INFO "dev[%p]: Set unmap_granularity: %u\n", + dev, DEV_ATTRIB(dev)->unmap_granularity); + return 0; +} + +int se_dev_set_unmap_granularity_alignment( + struct se_device *dev, + u32 unmap_granularity_alignment) +{ + DEV_ATTRIB(dev)->unmap_granularity_alignment = unmap_granularity_alignment; + printk(KERN_INFO "dev[%p]: Set unmap_granularity_alignment: %u\n", + dev, DEV_ATTRIB(dev)->unmap_granularity_alignment); + return 0; +} + +int se_dev_set_emulate_dpo(struct se_device *dev, int flag) +{ + if ((flag != 0) && (flag != 1)) { + printk(KERN_ERR "Illegal value %d\n", flag); + return -1; + } + if (TRANSPORT(dev)->dpo_emulated == NULL) { + printk(KERN_ERR "TRANSPORT(dev)->dpo_emulated is NULL\n"); + return -1; + } + if (TRANSPORT(dev)->dpo_emulated(dev) == 0) { + printk(KERN_ERR "TRANSPORT(dev)->dpo_emulated not supported\n"); + return -1; + } + DEV_ATTRIB(dev)->emulate_dpo = flag; + printk(KERN_INFO "dev[%p]: SE Device Page Out (DPO) Emulation" + " bit: %d\n", dev, DEV_ATTRIB(dev)->emulate_dpo); + return 0; +} + +int se_dev_set_emulate_fua_write(struct se_device *dev, int flag) +{ + if ((flag != 0) && (flag != 1)) { + printk(KERN_ERR "Illegal value %d\n", flag); + return -1; + } + if (TRANSPORT(dev)->fua_write_emulated == NULL) { + printk(KERN_ERR "TRANSPORT(dev)->fua_write_emulated is NULL\n"); + return -1; + } + if (TRANSPORT(dev)->fua_write_emulated(dev) == 0) { + printk(KERN_ERR "TRANSPORT(dev)->fua_write_emulated not supported\n"); + return -1; + } + DEV_ATTRIB(dev)->emulate_fua_write = flag; + printk(KERN_INFO "dev[%p]: SE Device Forced Unit Access WRITEs: %d\n", + dev, DEV_ATTRIB(dev)->emulate_fua_write); + return 0; +} + +int se_dev_set_emulate_fua_read(struct se_device *dev, int flag) +{ + if ((flag != 0) && (flag != 1)) { + printk(KERN_ERR "Illegal value %d\n", flag); + return -1; + } + if (TRANSPORT(dev)->fua_read_emulated == NULL) { + printk(KERN_ERR "TRANSPORT(dev)->fua_read_emulated is NULL\n"); + return -1; + } + if (TRANSPORT(dev)->fua_read_emulated(dev) == 0) { + printk(KERN_ERR "TRANSPORT(dev)->fua_read_emulated not supported\n"); + return -1; + } + DEV_ATTRIB(dev)->emulate_fua_read = flag; + printk(KERN_INFO "dev[%p]: SE Device Forced Unit Access READs: %d\n", + dev, DEV_ATTRIB(dev)->emulate_fua_read); + return 0; +} + +int se_dev_set_emulate_write_cache(struct se_device *dev, int flag) +{ + if ((flag != 0) && (flag != 1)) { + printk(KERN_ERR "Illegal value %d\n", flag); + return -1; + } + if (TRANSPORT(dev)->write_cache_emulated == NULL) { + printk(KERN_ERR "TRANSPORT(dev)->write_cache_emulated is NULL\n"); + return -1; + } + if (TRANSPORT(dev)->write_cache_emulated(dev) == 0) { + printk(KERN_ERR "TRANSPORT(dev)->write_cache_emulated not supported\n"); + return -1; + } + DEV_ATTRIB(dev)->emulate_write_cache = flag; + printk(KERN_INFO "dev[%p]: SE Device WRITE_CACHE_EMULATION flag: %d\n", + dev, DEV_ATTRIB(dev)->emulate_write_cache); + return 0; +} + +int se_dev_set_emulate_ua_intlck_ctrl(struct se_device *dev, int flag) +{ + if ((flag != 0) && (flag != 1) && (flag != 2)) { + printk(KERN_ERR "Illegal value %d\n", flag); + return -1; + } + + if (atomic_read(&dev->dev_export_obj.obj_access_count)) { + printk(KERN_ERR "dev[%p]: Unable to change SE Device" + " UA_INTRLCK_CTRL while dev_export_obj: %d count" + " exists\n", dev, + atomic_read(&dev->dev_export_obj.obj_access_count)); + return -1; + } + DEV_ATTRIB(dev)->emulate_ua_intlck_ctrl = flag; + printk(KERN_INFO "dev[%p]: SE Device UA_INTRLCK_CTRL flag: %d\n", + dev, DEV_ATTRIB(dev)->emulate_ua_intlck_ctrl); + + return 0; +} + +int se_dev_set_emulate_tas(struct se_device *dev, int flag) +{ + if ((flag != 0) && (flag != 1)) { + printk(KERN_ERR "Illegal value %d\n", flag); + return -1; + } + + if (atomic_read(&dev->dev_export_obj.obj_access_count)) { + printk(KERN_ERR "dev[%p]: Unable to change SE Device TAS while" + " dev_export_obj: %d count exists\n", dev, + atomic_read(&dev->dev_export_obj.obj_access_count)); + return -1; + } + DEV_ATTRIB(dev)->emulate_tas = flag; + printk(KERN_INFO "dev[%p]: SE Device TASK_ABORTED status bit: %s\n", + dev, (DEV_ATTRIB(dev)->emulate_tas) ? "Enabled" : "Disabled"); + + return 0; +} + +int se_dev_set_emulate_tpu(struct se_device *dev, int flag) +{ + if ((flag != 0) && (flag != 1)) { + printk(KERN_ERR "Illegal value %d\n", flag); + return -1; + } + /* + * We expect this value to be non-zero when generic Block Layer + * Discard supported is detected iblock_create_virtdevice(). + */ + if (!(DEV_ATTRIB(dev)->max_unmap_block_desc_count)) { + printk(KERN_ERR "Generic Block Discard not supported\n"); + return -ENOSYS; + } + + DEV_ATTRIB(dev)->emulate_tpu = flag; + printk(KERN_INFO "dev[%p]: SE Device Thin Provisioning UNMAP bit: %d\n", + dev, flag); + return 0; +} + +int se_dev_set_emulate_tpws(struct se_device *dev, int flag) +{ + if ((flag != 0) && (flag != 1)) { + printk(KERN_ERR "Illegal value %d\n", flag); + return -1; + } + /* + * We expect this value to be non-zero when generic Block Layer + * Discard supported is detected iblock_create_virtdevice(). + */ + if (!(DEV_ATTRIB(dev)->max_unmap_block_desc_count)) { + printk(KERN_ERR "Generic Block Discard not supported\n"); + return -ENOSYS; + } + + DEV_ATTRIB(dev)->emulate_tpws = flag; + printk(KERN_INFO "dev[%p]: SE Device Thin Provisioning WRITE_SAME: %d\n", + dev, flag); + return 0; +} + +int se_dev_set_enforce_pr_isids(struct se_device *dev, int flag) +{ + if ((flag != 0) && (flag != 1)) { + printk(KERN_ERR "Illegal value %d\n", flag); + return -1; + } + DEV_ATTRIB(dev)->enforce_pr_isids = flag; + printk(KERN_INFO "dev[%p]: SE Device enforce_pr_isids bit: %s\n", dev, + (DEV_ATTRIB(dev)->enforce_pr_isids) ? "Enabled" : "Disabled"); + return 0; +} + +/* + * Note, this can only be called on unexported SE Device Object. + */ +int se_dev_set_queue_depth(struct se_device *dev, u32 queue_depth) +{ + u32 orig_queue_depth = dev->queue_depth; + + if (atomic_read(&dev->dev_export_obj.obj_access_count)) { + printk(KERN_ERR "dev[%p]: Unable to change SE Device TCQ while" + " dev_export_obj: %d count exists\n", dev, + atomic_read(&dev->dev_export_obj.obj_access_count)); + return -1; + } + if (!(queue_depth)) { + printk(KERN_ERR "dev[%p]: Illegal ZERO value for queue" + "_depth\n", dev); + return -1; + } + + if (TRANSPORT(dev)->transport_type == TRANSPORT_PLUGIN_PHBA_PDEV) { + if (queue_depth > DEV_ATTRIB(dev)->hw_queue_depth) { + printk(KERN_ERR "dev[%p]: Passed queue_depth: %u" + " exceeds TCM/SE_Device TCQ: %u\n", + dev, queue_depth, + DEV_ATTRIB(dev)->hw_queue_depth); + return -1; + } + } else { + if (queue_depth > DEV_ATTRIB(dev)->queue_depth) { + if (queue_depth > DEV_ATTRIB(dev)->hw_queue_depth) { + printk(KERN_ERR "dev[%p]: Passed queue_depth:" + " %u exceeds TCM/SE_Device MAX" + " TCQ: %u\n", dev, queue_depth, + DEV_ATTRIB(dev)->hw_queue_depth); + return -1; + } + } + } + + DEV_ATTRIB(dev)->queue_depth = dev->queue_depth = queue_depth; + if (queue_depth > orig_queue_depth) + atomic_add(queue_depth - orig_queue_depth, &dev->depth_left); + else if (queue_depth < orig_queue_depth) + atomic_sub(orig_queue_depth - queue_depth, &dev->depth_left); + + printk(KERN_INFO "dev[%p]: SE Device TCQ Depth changed to: %u\n", + dev, queue_depth); + return 0; +} + +int se_dev_set_max_sectors(struct se_device *dev, u32 max_sectors) +{ + int force = 0; /* Force setting for VDEVS */ + + if (atomic_read(&dev->dev_export_obj.obj_access_count)) { + printk(KERN_ERR "dev[%p]: Unable to change SE Device" + " max_sectors while dev_export_obj: %d count exists\n", + dev, atomic_read(&dev->dev_export_obj.obj_access_count)); + return -1; + } + if (!(max_sectors)) { + printk(KERN_ERR "dev[%p]: Illegal ZERO value for" + " max_sectors\n", dev); + return -1; + } + if (max_sectors < DA_STATUS_MAX_SECTORS_MIN) { + printk(KERN_ERR "dev[%p]: Passed max_sectors: %u less than" + " DA_STATUS_MAX_SECTORS_MIN: %u\n", dev, max_sectors, + DA_STATUS_MAX_SECTORS_MIN); + return -1; + } + if (TRANSPORT(dev)->transport_type == TRANSPORT_PLUGIN_PHBA_PDEV) { + if (max_sectors > DEV_ATTRIB(dev)->hw_max_sectors) { + printk(KERN_ERR "dev[%p]: Passed max_sectors: %u" + " greater than TCM/SE_Device max_sectors:" + " %u\n", dev, max_sectors, + DEV_ATTRIB(dev)->hw_max_sectors); + return -1; + } + } else { + if (!(force) && (max_sectors > + DEV_ATTRIB(dev)->hw_max_sectors)) { + printk(KERN_ERR "dev[%p]: Passed max_sectors: %u" + " greater than TCM/SE_Device max_sectors" + ": %u, use force=1 to override.\n", dev, + max_sectors, DEV_ATTRIB(dev)->hw_max_sectors); + return -1; + } + if (max_sectors > DA_STATUS_MAX_SECTORS_MAX) { + printk(KERN_ERR "dev[%p]: Passed max_sectors: %u" + " greater than DA_STATUS_MAX_SECTORS_MAX:" + " %u\n", dev, max_sectors, + DA_STATUS_MAX_SECTORS_MAX); + return -1; + } + } + + DEV_ATTRIB(dev)->max_sectors = max_sectors; + printk("dev[%p]: SE Device max_sectors changed to %u\n", + dev, max_sectors); + return 0; +} + +int se_dev_set_optimal_sectors(struct se_device *dev, u32 optimal_sectors) +{ + if (atomic_read(&dev->dev_export_obj.obj_access_count)) { + printk(KERN_ERR "dev[%p]: Unable to change SE Device" + " optimal_sectors while dev_export_obj: %d count exists\n", + dev, atomic_read(&dev->dev_export_obj.obj_access_count)); + return -EINVAL; + } + if (TRANSPORT(dev)->transport_type == TRANSPORT_PLUGIN_PHBA_PDEV) { + printk(KERN_ERR "dev[%p]: Passed optimal_sectors cannot be" + " changed for TCM/pSCSI\n", dev); + return -EINVAL; + } + if (optimal_sectors > DEV_ATTRIB(dev)->max_sectors) { + printk(KERN_ERR "dev[%p]: Passed optimal_sectors %u cannot be" + " greater than max_sectors: %u\n", dev, + optimal_sectors, DEV_ATTRIB(dev)->max_sectors); + return -EINVAL; + } + + DEV_ATTRIB(dev)->optimal_sectors = optimal_sectors; + printk(KERN_INFO "dev[%p]: SE Device optimal_sectors changed to %u\n", + dev, optimal_sectors); + return 0; +} + +int se_dev_set_block_size(struct se_device *dev, u32 block_size) +{ + if (atomic_read(&dev->dev_export_obj.obj_access_count)) { + printk(KERN_ERR "dev[%p]: Unable to change SE Device block_size" + " while dev_export_obj: %d count exists\n", dev, + atomic_read(&dev->dev_export_obj.obj_access_count)); + return -1; + } + + if ((block_size != 512) && + (block_size != 1024) && + (block_size != 2048) && + (block_size != 4096)) { + printk(KERN_ERR "dev[%p]: Illegal value for block_device: %u" + " for SE device, must be 512, 1024, 2048 or 4096\n", + dev, block_size); + return -1; + } + + if (TRANSPORT(dev)->transport_type == TRANSPORT_PLUGIN_PHBA_PDEV) { + printk(KERN_ERR "dev[%p]: Not allowed to change block_size for" + " Physical Device, use for Linux/SCSI to change" + " block_size for underlying hardware\n", dev); + return -1; + } + + DEV_ATTRIB(dev)->block_size = block_size; + printk(KERN_INFO "dev[%p]: SE Device block_size changed to %u\n", + dev, block_size); + return 0; +} + +struct se_lun *core_dev_add_lun( + struct se_portal_group *tpg, + struct se_hba *hba, + struct se_device *dev, + u32 lun) +{ + struct se_lun *lun_p; + u32 lun_access = 0; + + if (atomic_read(&dev->dev_access_obj.obj_access_count) != 0) { + printk(KERN_ERR "Unable to export struct se_device while dev_access_obj: %d\n", + atomic_read(&dev->dev_access_obj.obj_access_count)); + return NULL; + } + + lun_p = core_tpg_pre_addlun(tpg, lun); + if ((IS_ERR(lun_p)) || !(lun_p)) + return NULL; + + if (dev->dev_flags & DF_READ_ONLY) + lun_access = TRANSPORT_LUNFLAGS_READ_ONLY; + else + lun_access = TRANSPORT_LUNFLAGS_READ_WRITE; + + if (core_tpg_post_addlun(tpg, lun_p, lun_access, dev) < 0) + return NULL; + + printk(KERN_INFO "%s_TPG[%u]_LUN[%u] - Activated %s Logical Unit from" + " CORE HBA: %u\n", TPG_TFO(tpg)->get_fabric_name(), + TPG_TFO(tpg)->tpg_get_tag(tpg), lun_p->unpacked_lun, + TPG_TFO(tpg)->get_fabric_name(), hba->hba_id); + /* + * Update LUN maps for dynamically added initiators when + * generate_node_acl is enabled. + */ + if (TPG_TFO(tpg)->tpg_check_demo_mode(tpg)) { + struct se_node_acl *acl; + spin_lock_bh(&tpg->acl_node_lock); + list_for_each_entry(acl, &tpg->acl_node_list, acl_list) { + if (acl->dynamic_node_acl) { + spin_unlock_bh(&tpg->acl_node_lock); + core_tpg_add_node_to_devs(acl, tpg); + spin_lock_bh(&tpg->acl_node_lock); + } + } + spin_unlock_bh(&tpg->acl_node_lock); + } + + return lun_p; +} + +/* core_dev_del_lun(): + * + * + */ +int core_dev_del_lun( + struct se_portal_group *tpg, + u32 unpacked_lun) +{ + struct se_lun *lun; + int ret = 0; + + lun = core_tpg_pre_dellun(tpg, unpacked_lun, &ret); + if (!(lun)) + return ret; + + core_tpg_post_dellun(tpg, lun); + + printk(KERN_INFO "%s_TPG[%u]_LUN[%u] - Deactivated %s Logical Unit from" + " device object\n", TPG_TFO(tpg)->get_fabric_name(), + TPG_TFO(tpg)->tpg_get_tag(tpg), unpacked_lun, + TPG_TFO(tpg)->get_fabric_name()); + + return 0; +} + +struct se_lun *core_get_lun_from_tpg(struct se_portal_group *tpg, u32 unpacked_lun) +{ + struct se_lun *lun; + + spin_lock(&tpg->tpg_lun_lock); + if (unpacked_lun > (TRANSPORT_MAX_LUNS_PER_TPG-1)) { + printk(KERN_ERR "%s LUN: %u exceeds TRANSPORT_MAX_LUNS" + "_PER_TPG-1: %u for Target Portal Group: %hu\n", + TPG_TFO(tpg)->get_fabric_name(), unpacked_lun, + TRANSPORT_MAX_LUNS_PER_TPG-1, + TPG_TFO(tpg)->tpg_get_tag(tpg)); + spin_unlock(&tpg->tpg_lun_lock); + return NULL; + } + lun = &tpg->tpg_lun_list[unpacked_lun]; + + if (lun->lun_status != TRANSPORT_LUN_STATUS_FREE) { + printk(KERN_ERR "%s Logical Unit Number: %u is not free on" + " Target Portal Group: %hu, ignoring request.\n", + TPG_TFO(tpg)->get_fabric_name(), unpacked_lun, + TPG_TFO(tpg)->tpg_get_tag(tpg)); + spin_unlock(&tpg->tpg_lun_lock); + return NULL; + } + spin_unlock(&tpg->tpg_lun_lock); + + return lun; +} + +/* core_dev_get_lun(): + * + * + */ +static struct se_lun *core_dev_get_lun(struct se_portal_group *tpg, u32 unpacked_lun) +{ + struct se_lun *lun; + + spin_lock(&tpg->tpg_lun_lock); + if (unpacked_lun > (TRANSPORT_MAX_LUNS_PER_TPG-1)) { + printk(KERN_ERR "%s LUN: %u exceeds TRANSPORT_MAX_LUNS_PER" + "_TPG-1: %u for Target Portal Group: %hu\n", + TPG_TFO(tpg)->get_fabric_name(), unpacked_lun, + TRANSPORT_MAX_LUNS_PER_TPG-1, + TPG_TFO(tpg)->tpg_get_tag(tpg)); + spin_unlock(&tpg->tpg_lun_lock); + return NULL; + } + lun = &tpg->tpg_lun_list[unpacked_lun]; + + if (lun->lun_status != TRANSPORT_LUN_STATUS_ACTIVE) { + printk(KERN_ERR "%s Logical Unit Number: %u is not active on" + " Target Portal Group: %hu, ignoring request.\n", + TPG_TFO(tpg)->get_fabric_name(), unpacked_lun, + TPG_TFO(tpg)->tpg_get_tag(tpg)); + spin_unlock(&tpg->tpg_lun_lock); + return NULL; + } + spin_unlock(&tpg->tpg_lun_lock); + + return lun; +} + +struct se_lun_acl *core_dev_init_initiator_node_lun_acl( + struct se_portal_group *tpg, + u32 mapped_lun, + char *initiatorname, + int *ret) +{ + struct se_lun_acl *lacl; + struct se_node_acl *nacl; + + if (strlen(initiatorname) > TRANSPORT_IQN_LEN) { + printk(KERN_ERR "%s InitiatorName exceeds maximum size.\n", + TPG_TFO(tpg)->get_fabric_name()); + *ret = -EOVERFLOW; + return NULL; + } + nacl = core_tpg_get_initiator_node_acl(tpg, initiatorname); + if (!(nacl)) { + *ret = -EINVAL; + return NULL; + } + lacl = kzalloc(sizeof(struct se_lun_acl), GFP_KERNEL); + if (!(lacl)) { + printk(KERN_ERR "Unable to allocate memory for struct se_lun_acl.\n"); + *ret = -ENOMEM; + return NULL; + } + + INIT_LIST_HEAD(&lacl->lacl_list); + lacl->mapped_lun = mapped_lun; + lacl->se_lun_nacl = nacl; + snprintf(lacl->initiatorname, TRANSPORT_IQN_LEN, "%s", initiatorname); + + return lacl; +} + +int core_dev_add_initiator_node_lun_acl( + struct se_portal_group *tpg, + struct se_lun_acl *lacl, + u32 unpacked_lun, + u32 lun_access) +{ + struct se_lun *lun; + struct se_node_acl *nacl; + + lun = core_dev_get_lun(tpg, unpacked_lun); + if (!(lun)) { + printk(KERN_ERR "%s Logical Unit Number: %u is not active on" + " Target Portal Group: %hu, ignoring request.\n", + TPG_TFO(tpg)->get_fabric_name(), unpacked_lun, + TPG_TFO(tpg)->tpg_get_tag(tpg)); + return -EINVAL; + } + + nacl = lacl->se_lun_nacl; + if (!(nacl)) + return -EINVAL; + + if ((lun->lun_access & TRANSPORT_LUNFLAGS_READ_ONLY) && + (lun_access & TRANSPORT_LUNFLAGS_READ_WRITE)) + lun_access = TRANSPORT_LUNFLAGS_READ_ONLY; + + lacl->se_lun = lun; + + if (core_update_device_list_for_node(lun, lacl, lacl->mapped_lun, + lun_access, nacl, tpg, 1) < 0) + return -EINVAL; + + spin_lock(&lun->lun_acl_lock); + list_add_tail(&lacl->lacl_list, &lun->lun_acl_list); + atomic_inc(&lun->lun_acl_count); + smp_mb__after_atomic_inc(); + spin_unlock(&lun->lun_acl_lock); + + printk(KERN_INFO "%s_TPG[%hu]_LUN[%u->%u] - Added %s ACL for " + " InitiatorNode: %s\n", TPG_TFO(tpg)->get_fabric_name(), + TPG_TFO(tpg)->tpg_get_tag(tpg), unpacked_lun, lacl->mapped_lun, + (lun_access & TRANSPORT_LUNFLAGS_READ_WRITE) ? "RW" : "RO", + lacl->initiatorname); + /* + * Check to see if there are any existing persistent reservation APTPL + * pre-registrations that need to be enabled for this LUN ACL.. + */ + core_scsi3_check_aptpl_registration(lun->lun_se_dev, tpg, lun, lacl); + return 0; +} + +/* core_dev_del_initiator_node_lun_acl(): + * + * + */ +int core_dev_del_initiator_node_lun_acl( + struct se_portal_group *tpg, + struct se_lun *lun, + struct se_lun_acl *lacl) +{ + struct se_node_acl *nacl; + + nacl = lacl->se_lun_nacl; + if (!(nacl)) + return -EINVAL; + + spin_lock(&lun->lun_acl_lock); + list_del(&lacl->lacl_list); + atomic_dec(&lun->lun_acl_count); + smp_mb__after_atomic_dec(); + spin_unlock(&lun->lun_acl_lock); + + core_update_device_list_for_node(lun, NULL, lacl->mapped_lun, + TRANSPORT_LUNFLAGS_NO_ACCESS, nacl, tpg, 0); + + lacl->se_lun = NULL; + + printk(KERN_INFO "%s_TPG[%hu]_LUN[%u] - Removed ACL for" + " InitiatorNode: %s Mapped LUN: %u\n", + TPG_TFO(tpg)->get_fabric_name(), + TPG_TFO(tpg)->tpg_get_tag(tpg), lun->unpacked_lun, + lacl->initiatorname, lacl->mapped_lun); + + return 0; +} + +void core_dev_free_initiator_node_lun_acl( + struct se_portal_group *tpg, + struct se_lun_acl *lacl) +{ + printk("%s_TPG[%hu] - Freeing ACL for %s InitiatorNode: %s" + " Mapped LUN: %u\n", TPG_TFO(tpg)->get_fabric_name(), + TPG_TFO(tpg)->tpg_get_tag(tpg), + TPG_TFO(tpg)->get_fabric_name(), + lacl->initiatorname, lacl->mapped_lun); + + kfree(lacl); +} + +int core_dev_setup_virtual_lun0(void) +{ + struct se_hba *hba; + struct se_device *dev; + struct se_subsystem_dev *se_dev = NULL; + struct se_subsystem_api *t; + char buf[16]; + int ret; + + hba = core_alloc_hba("rd_dr", 0, HBA_FLAGS_INTERNAL_USE); + if (IS_ERR(hba)) + return PTR_ERR(hba); + + se_global->g_lun0_hba = hba; + t = hba->transport; + + se_dev = kzalloc(sizeof(struct se_subsystem_dev), GFP_KERNEL); + if (!(se_dev)) { + printk(KERN_ERR "Unable to allocate memory for" + " struct se_subsystem_dev\n"); + ret = -ENOMEM; + goto out; + } + INIT_LIST_HEAD(&se_dev->g_se_dev_list); + INIT_LIST_HEAD(&se_dev->t10_wwn.t10_vpd_list); + spin_lock_init(&se_dev->t10_wwn.t10_vpd_lock); + INIT_LIST_HEAD(&se_dev->t10_reservation.registration_list); + INIT_LIST_HEAD(&se_dev->t10_reservation.aptpl_reg_list); + spin_lock_init(&se_dev->t10_reservation.registration_lock); + spin_lock_init(&se_dev->t10_reservation.aptpl_reg_lock); + INIT_LIST_HEAD(&se_dev->t10_alua.tg_pt_gps_list); + spin_lock_init(&se_dev->t10_alua.tg_pt_gps_lock); + spin_lock_init(&se_dev->se_dev_lock); + se_dev->t10_reservation.pr_aptpl_buf_len = PR_APTPL_BUF_LEN; + se_dev->t10_wwn.t10_sub_dev = se_dev; + se_dev->t10_alua.t10_sub_dev = se_dev; + se_dev->se_dev_attrib.da_sub_dev = se_dev; + se_dev->se_dev_hba = hba; + + se_dev->se_dev_su_ptr = t->allocate_virtdevice(hba, "virt_lun0"); + if (!(se_dev->se_dev_su_ptr)) { + printk(KERN_ERR "Unable to locate subsystem dependent pointer" + " from allocate_virtdevice()\n"); + ret = -ENOMEM; + goto out; + } + se_global->g_lun0_su_dev = se_dev; + + memset(buf, 0, 16); + sprintf(buf, "rd_pages=8"); + t->set_configfs_dev_params(hba, se_dev, buf, sizeof(buf)); + + dev = t->create_virtdevice(hba, se_dev, se_dev->se_dev_su_ptr); + if (!(dev) || IS_ERR(dev)) { + ret = -ENOMEM; + goto out; + } + se_dev->se_dev_ptr = dev; + se_global->g_lun0_dev = dev; + + return 0; +out: + se_global->g_lun0_su_dev = NULL; + kfree(se_dev); + if (se_global->g_lun0_hba) { + core_delete_hba(se_global->g_lun0_hba); + se_global->g_lun0_hba = NULL; + } + return ret; +} + + +void core_dev_release_virtual_lun0(void) +{ + struct se_hba *hba = se_global->g_lun0_hba; + struct se_subsystem_dev *su_dev = se_global->g_lun0_su_dev; + + if (!(hba)) + return; + + if (se_global->g_lun0_dev) + se_free_virtual_device(se_global->g_lun0_dev, hba); + + kfree(su_dev); + core_delete_hba(hba); +} diff --git a/drivers/target/target_core_fabric_configfs.c b/drivers/target/target_core_fabric_configfs.c new file mode 100644 index 000000000000..32b148d7e261 --- /dev/null +++ b/drivers/target/target_core_fabric_configfs.c @@ -0,0 +1,996 @@ +/******************************************************************************* +* Filename: target_core_fabric_configfs.c + * + * This file contains generic fabric module configfs infrastructure for + * TCM v4.x code + * + * Copyright (c) 2010 Rising Tide Systems + * Copyright (c) 2010 Linux-iSCSI.org + * + * Copyright (c) 2010 Nicholas A. Bellinger +* + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + ****************************************************************************/ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include + +#include "target_core_alua.h" +#include "target_core_hba.h" +#include "target_core_pr.h" + +#define TF_CIT_SETUP(_name, _item_ops, _group_ops, _attrs) \ +static void target_fabric_setup_##_name##_cit(struct target_fabric_configfs *tf) \ +{ \ + struct target_fabric_configfs_template *tfc = &tf->tf_cit_tmpl; \ + struct config_item_type *cit = &tfc->tfc_##_name##_cit; \ + \ + cit->ct_item_ops = _item_ops; \ + cit->ct_group_ops = _group_ops; \ + cit->ct_attrs = _attrs; \ + cit->ct_owner = tf->tf_module; \ + printk("Setup generic %s\n", __stringify(_name)); \ +} + +/* Start of tfc_tpg_mappedlun_cit */ + +static int target_fabric_mappedlun_link( + struct config_item *lun_acl_ci, + struct config_item *lun_ci) +{ + struct se_dev_entry *deve; + struct se_lun *lun = container_of(to_config_group(lun_ci), + struct se_lun, lun_group); + struct se_lun_acl *lacl = container_of(to_config_group(lun_acl_ci), + struct se_lun_acl, se_lun_group); + struct se_portal_group *se_tpg; + struct config_item *nacl_ci, *tpg_ci, *tpg_ci_s, *wwn_ci, *wwn_ci_s; + int ret = 0, lun_access; + /* + * Ensure that the source port exists + */ + if (!(lun->lun_sep) || !(lun->lun_sep->sep_tpg)) { + printk(KERN_ERR "Source se_lun->lun_sep or lun->lun_sep->sep" + "_tpg does not exist\n"); + return -EINVAL; + } + se_tpg = lun->lun_sep->sep_tpg; + + nacl_ci = &lun_acl_ci->ci_parent->ci_group->cg_item; + tpg_ci = &nacl_ci->ci_group->cg_item; + wwn_ci = &tpg_ci->ci_group->cg_item; + tpg_ci_s = &lun_ci->ci_parent->ci_group->cg_item; + wwn_ci_s = &tpg_ci_s->ci_group->cg_item; + /* + * Make sure the SymLink is going to the same $FABRIC/$WWN/tpgt_$TPGT + */ + if (strcmp(config_item_name(wwn_ci), config_item_name(wwn_ci_s))) { + printk(KERN_ERR "Illegal Initiator ACL SymLink outside of %s\n", + config_item_name(wwn_ci)); + return -EINVAL; + } + if (strcmp(config_item_name(tpg_ci), config_item_name(tpg_ci_s))) { + printk(KERN_ERR "Illegal Initiator ACL Symlink outside of %s" + " TPGT: %s\n", config_item_name(wwn_ci), + config_item_name(tpg_ci)); + return -EINVAL; + } + /* + * If this struct se_node_acl was dynamically generated with + * tpg_1/attrib/generate_node_acls=1, use the existing deve->lun_flags, + * which be will write protected (READ-ONLY) when + * tpg_1/attrib/demo_mode_write_protect=1 + */ + spin_lock_irq(&lacl->se_lun_nacl->device_list_lock); + deve = &lacl->se_lun_nacl->device_list[lacl->mapped_lun]; + if (deve->lun_flags & TRANSPORT_LUNFLAGS_INITIATOR_ACCESS) + lun_access = deve->lun_flags; + else + lun_access = + (TPG_TFO(se_tpg)->tpg_check_prod_mode_write_protect( + se_tpg)) ? TRANSPORT_LUNFLAGS_READ_ONLY : + TRANSPORT_LUNFLAGS_READ_WRITE; + spin_unlock_irq(&lacl->se_lun_nacl->device_list_lock); + /* + * Determine the actual mapped LUN value user wants.. + * + * This value is what the SCSI Initiator actually sees the + * iscsi/$IQN/$TPGT/lun/lun_* as on their SCSI Initiator Ports. + */ + ret = core_dev_add_initiator_node_lun_acl(se_tpg, lacl, + lun->unpacked_lun, lun_access); + + return (ret < 0) ? -EINVAL : 0; +} + +static int target_fabric_mappedlun_unlink( + struct config_item *lun_acl_ci, + struct config_item *lun_ci) +{ + struct se_lun *lun; + struct se_lun_acl *lacl = container_of(to_config_group(lun_acl_ci), + struct se_lun_acl, se_lun_group); + struct se_node_acl *nacl = lacl->se_lun_nacl; + struct se_dev_entry *deve = &nacl->device_list[lacl->mapped_lun]; + struct se_portal_group *se_tpg; + /* + * Determine if the underlying MappedLUN has already been released.. + */ + if (!(deve->se_lun)) + return 0; + + lun = container_of(to_config_group(lun_ci), struct se_lun, lun_group); + se_tpg = lun->lun_sep->sep_tpg; + + core_dev_del_initiator_node_lun_acl(se_tpg, lun, lacl); + return 0; +} + +CONFIGFS_EATTR_STRUCT(target_fabric_mappedlun, se_lun_acl); +#define TCM_MAPPEDLUN_ATTR(_name, _mode) \ +static struct target_fabric_mappedlun_attribute target_fabric_mappedlun_##_name = \ + __CONFIGFS_EATTR(_name, _mode, \ + target_fabric_mappedlun_show_##_name, \ + target_fabric_mappedlun_store_##_name); + +static ssize_t target_fabric_mappedlun_show_write_protect( + struct se_lun_acl *lacl, + char *page) +{ + struct se_node_acl *se_nacl = lacl->se_lun_nacl; + struct se_dev_entry *deve; + ssize_t len; + + spin_lock_irq(&se_nacl->device_list_lock); + deve = &se_nacl->device_list[lacl->mapped_lun]; + len = sprintf(page, "%d\n", + (deve->lun_flags & TRANSPORT_LUNFLAGS_READ_ONLY) ? + 1 : 0); + spin_unlock_irq(&se_nacl->device_list_lock); + + return len; +} + +static ssize_t target_fabric_mappedlun_store_write_protect( + struct se_lun_acl *lacl, + const char *page, + size_t count) +{ + struct se_node_acl *se_nacl = lacl->se_lun_nacl; + struct se_portal_group *se_tpg = se_nacl->se_tpg; + unsigned long op; + + if (strict_strtoul(page, 0, &op)) + return -EINVAL; + + if ((op != 1) && (op != 0)) + return -EINVAL; + + core_update_device_list_access(lacl->mapped_lun, (op) ? + TRANSPORT_LUNFLAGS_READ_ONLY : + TRANSPORT_LUNFLAGS_READ_WRITE, + lacl->se_lun_nacl); + + printk(KERN_INFO "%s_ConfigFS: Changed Initiator ACL: %s" + " Mapped LUN: %u Write Protect bit to %s\n", + TPG_TFO(se_tpg)->get_fabric_name(), + lacl->initiatorname, lacl->mapped_lun, (op) ? "ON" : "OFF"); + + return count; + +} + +TCM_MAPPEDLUN_ATTR(write_protect, S_IRUGO | S_IWUSR); + +CONFIGFS_EATTR_OPS(target_fabric_mappedlun, se_lun_acl, se_lun_group); + +static struct configfs_attribute *target_fabric_mappedlun_attrs[] = { + &target_fabric_mappedlun_write_protect.attr, + NULL, +}; + +static struct configfs_item_operations target_fabric_mappedlun_item_ops = { + .show_attribute = target_fabric_mappedlun_attr_show, + .store_attribute = target_fabric_mappedlun_attr_store, + .allow_link = target_fabric_mappedlun_link, + .drop_link = target_fabric_mappedlun_unlink, +}; + +TF_CIT_SETUP(tpg_mappedlun, &target_fabric_mappedlun_item_ops, NULL, + target_fabric_mappedlun_attrs); + +/* End of tfc_tpg_mappedlun_cit */ + +/* Start of tfc_tpg_nacl_attrib_cit */ + +CONFIGFS_EATTR_OPS(target_fabric_nacl_attrib, se_node_acl, acl_attrib_group); + +static struct configfs_item_operations target_fabric_nacl_attrib_item_ops = { + .show_attribute = target_fabric_nacl_attrib_attr_show, + .store_attribute = target_fabric_nacl_attrib_attr_store, +}; + +TF_CIT_SETUP(tpg_nacl_attrib, &target_fabric_nacl_attrib_item_ops, NULL, NULL); + +/* End of tfc_tpg_nacl_attrib_cit */ + +/* Start of tfc_tpg_nacl_auth_cit */ + +CONFIGFS_EATTR_OPS(target_fabric_nacl_auth, se_node_acl, acl_auth_group); + +static struct configfs_item_operations target_fabric_nacl_auth_item_ops = { + .show_attribute = target_fabric_nacl_auth_attr_show, + .store_attribute = target_fabric_nacl_auth_attr_store, +}; + +TF_CIT_SETUP(tpg_nacl_auth, &target_fabric_nacl_auth_item_ops, NULL, NULL); + +/* End of tfc_tpg_nacl_auth_cit */ + +/* Start of tfc_tpg_nacl_param_cit */ + +CONFIGFS_EATTR_OPS(target_fabric_nacl_param, se_node_acl, acl_param_group); + +static struct configfs_item_operations target_fabric_nacl_param_item_ops = { + .show_attribute = target_fabric_nacl_param_attr_show, + .store_attribute = target_fabric_nacl_param_attr_store, +}; + +TF_CIT_SETUP(tpg_nacl_param, &target_fabric_nacl_param_item_ops, NULL, NULL); + +/* End of tfc_tpg_nacl_param_cit */ + +/* Start of tfc_tpg_nacl_base_cit */ + +CONFIGFS_EATTR_OPS(target_fabric_nacl_base, se_node_acl, acl_group); + +static struct config_group *target_fabric_make_mappedlun( + struct config_group *group, + const char *name) +{ + struct se_node_acl *se_nacl = container_of(group, + struct se_node_acl, acl_group); + struct se_portal_group *se_tpg = se_nacl->se_tpg; + struct target_fabric_configfs *tf = se_tpg->se_tpg_wwn->wwn_tf; + struct se_lun_acl *lacl; + struct config_item *acl_ci; + char *buf; + unsigned long mapped_lun; + int ret = 0; + + acl_ci = &group->cg_item; + if (!(acl_ci)) { + printk(KERN_ERR "Unable to locatel acl_ci\n"); + return NULL; + } + + buf = kzalloc(strlen(name) + 1, GFP_KERNEL); + if (!(buf)) { + printk(KERN_ERR "Unable to allocate memory for name buf\n"); + return ERR_PTR(-ENOMEM); + } + snprintf(buf, strlen(name) + 1, "%s", name); + /* + * Make sure user is creating iscsi/$IQN/$TPGT/acls/$INITIATOR/lun_$ID. + */ + if (strstr(buf, "lun_") != buf) { + printk(KERN_ERR "Unable to locate \"lun_\" from buf: %s" + " name: %s\n", buf, name); + ret = -EINVAL; + goto out; + } + /* + * Determine the Mapped LUN value. This is what the SCSI Initiator + * Port will actually see. + */ + if (strict_strtoul(buf + 4, 0, &mapped_lun) || mapped_lun > UINT_MAX) { + ret = -EINVAL; + goto out; + } + + lacl = core_dev_init_initiator_node_lun_acl(se_tpg, mapped_lun, + config_item_name(acl_ci), &ret); + if (!(lacl)) + goto out; + + config_group_init_type_name(&lacl->se_lun_group, name, + &TF_CIT_TMPL(tf)->tfc_tpg_mappedlun_cit); + + kfree(buf); + return &lacl->se_lun_group; +out: + kfree(buf); + return ERR_PTR(ret); +} + +static void target_fabric_drop_mappedlun( + struct config_group *group, + struct config_item *item) +{ + struct se_lun_acl *lacl = container_of(to_config_group(item), + struct se_lun_acl, se_lun_group); + struct se_portal_group *se_tpg = lacl->se_lun_nacl->se_tpg; + + config_item_put(item); + core_dev_free_initiator_node_lun_acl(se_tpg, lacl); +} + +static struct configfs_item_operations target_fabric_nacl_base_item_ops = { + .show_attribute = target_fabric_nacl_base_attr_show, + .store_attribute = target_fabric_nacl_base_attr_store, +}; + +static struct configfs_group_operations target_fabric_nacl_base_group_ops = { + .make_group = target_fabric_make_mappedlun, + .drop_item = target_fabric_drop_mappedlun, +}; + +TF_CIT_SETUP(tpg_nacl_base, &target_fabric_nacl_base_item_ops, + &target_fabric_nacl_base_group_ops, NULL); + +/* End of tfc_tpg_nacl_base_cit */ + +/* Start of tfc_tpg_nacl_cit */ + +static struct config_group *target_fabric_make_nodeacl( + struct config_group *group, + const char *name) +{ + struct se_portal_group *se_tpg = container_of(group, + struct se_portal_group, tpg_acl_group); + struct target_fabric_configfs *tf = se_tpg->se_tpg_wwn->wwn_tf; + struct se_node_acl *se_nacl; + struct config_group *nacl_cg; + + if (!(tf->tf_ops.fabric_make_nodeacl)) { + printk(KERN_ERR "tf->tf_ops.fabric_make_nodeacl is NULL\n"); + return ERR_PTR(-ENOSYS); + } + + se_nacl = tf->tf_ops.fabric_make_nodeacl(se_tpg, group, name); + if (IS_ERR(se_nacl)) + return ERR_PTR(PTR_ERR(se_nacl)); + + nacl_cg = &se_nacl->acl_group; + nacl_cg->default_groups = se_nacl->acl_default_groups; + nacl_cg->default_groups[0] = &se_nacl->acl_attrib_group; + nacl_cg->default_groups[1] = &se_nacl->acl_auth_group; + nacl_cg->default_groups[2] = &se_nacl->acl_param_group; + nacl_cg->default_groups[3] = NULL; + + config_group_init_type_name(&se_nacl->acl_group, name, + &TF_CIT_TMPL(tf)->tfc_tpg_nacl_base_cit); + config_group_init_type_name(&se_nacl->acl_attrib_group, "attrib", + &TF_CIT_TMPL(tf)->tfc_tpg_nacl_attrib_cit); + config_group_init_type_name(&se_nacl->acl_auth_group, "auth", + &TF_CIT_TMPL(tf)->tfc_tpg_nacl_auth_cit); + config_group_init_type_name(&se_nacl->acl_param_group, "param", + &TF_CIT_TMPL(tf)->tfc_tpg_nacl_param_cit); + + return &se_nacl->acl_group; +} + +static void target_fabric_drop_nodeacl( + struct config_group *group, + struct config_item *item) +{ + struct se_portal_group *se_tpg = container_of(group, + struct se_portal_group, tpg_acl_group); + struct target_fabric_configfs *tf = se_tpg->se_tpg_wwn->wwn_tf; + struct se_node_acl *se_nacl = container_of(to_config_group(item), + struct se_node_acl, acl_group); + struct config_item *df_item; + struct config_group *nacl_cg; + int i; + + nacl_cg = &se_nacl->acl_group; + for (i = 0; nacl_cg->default_groups[i]; i++) { + df_item = &nacl_cg->default_groups[i]->cg_item; + nacl_cg->default_groups[i] = NULL; + config_item_put(df_item); + } + + config_item_put(item); + tf->tf_ops.fabric_drop_nodeacl(se_nacl); +} + +static struct configfs_group_operations target_fabric_nacl_group_ops = { + .make_group = target_fabric_make_nodeacl, + .drop_item = target_fabric_drop_nodeacl, +}; + +TF_CIT_SETUP(tpg_nacl, NULL, &target_fabric_nacl_group_ops, NULL); + +/* End of tfc_tpg_nacl_cit */ + +/* Start of tfc_tpg_np_base_cit */ + +CONFIGFS_EATTR_OPS(target_fabric_np_base, se_tpg_np, tpg_np_group); + +static struct configfs_item_operations target_fabric_np_base_item_ops = { + .show_attribute = target_fabric_np_base_attr_show, + .store_attribute = target_fabric_np_base_attr_store, +}; + +TF_CIT_SETUP(tpg_np_base, &target_fabric_np_base_item_ops, NULL, NULL); + +/* End of tfc_tpg_np_base_cit */ + +/* Start of tfc_tpg_np_cit */ + +static struct config_group *target_fabric_make_np( + struct config_group *group, + const char *name) +{ + struct se_portal_group *se_tpg = container_of(group, + struct se_portal_group, tpg_np_group); + struct target_fabric_configfs *tf = se_tpg->se_tpg_wwn->wwn_tf; + struct se_tpg_np *se_tpg_np; + + if (!(tf->tf_ops.fabric_make_np)) { + printk(KERN_ERR "tf->tf_ops.fabric_make_np is NULL\n"); + return ERR_PTR(-ENOSYS); + } + + se_tpg_np = tf->tf_ops.fabric_make_np(se_tpg, group, name); + if (!(se_tpg_np) || IS_ERR(se_tpg_np)) + return ERR_PTR(-EINVAL); + + config_group_init_type_name(&se_tpg_np->tpg_np_group, name, + &TF_CIT_TMPL(tf)->tfc_tpg_np_base_cit); + + return &se_tpg_np->tpg_np_group; +} + +static void target_fabric_drop_np( + struct config_group *group, + struct config_item *item) +{ + struct se_portal_group *se_tpg = container_of(group, + struct se_portal_group, tpg_np_group); + struct target_fabric_configfs *tf = se_tpg->se_tpg_wwn->wwn_tf; + struct se_tpg_np *se_tpg_np = container_of(to_config_group(item), + struct se_tpg_np, tpg_np_group); + + config_item_put(item); + tf->tf_ops.fabric_drop_np(se_tpg_np); +} + +static struct configfs_group_operations target_fabric_np_group_ops = { + .make_group = &target_fabric_make_np, + .drop_item = &target_fabric_drop_np, +}; + +TF_CIT_SETUP(tpg_np, NULL, &target_fabric_np_group_ops, NULL); + +/* End of tfc_tpg_np_cit */ + +/* Start of tfc_tpg_port_cit */ + +CONFIGFS_EATTR_STRUCT(target_fabric_port, se_lun); +#define TCM_PORT_ATTR(_name, _mode) \ +static struct target_fabric_port_attribute target_fabric_port_##_name = \ + __CONFIGFS_EATTR(_name, _mode, \ + target_fabric_port_show_attr_##_name, \ + target_fabric_port_store_attr_##_name); + +#define TCM_PORT_ATTOR_RO(_name) \ + __CONFIGFS_EATTR_RO(_name, \ + target_fabric_port_show_attr_##_name); + +/* + * alua_tg_pt_gp + */ +static ssize_t target_fabric_port_show_attr_alua_tg_pt_gp( + struct se_lun *lun, + char *page) +{ + if (!(lun)) + return -ENODEV; + + if (!(lun->lun_sep)) + return -ENODEV; + + return core_alua_show_tg_pt_gp_info(lun->lun_sep, page); +} + +static ssize_t target_fabric_port_store_attr_alua_tg_pt_gp( + struct se_lun *lun, + const char *page, + size_t count) +{ + if (!(lun)) + return -ENODEV; + + if (!(lun->lun_sep)) + return -ENODEV; + + return core_alua_store_tg_pt_gp_info(lun->lun_sep, page, count); +} + +TCM_PORT_ATTR(alua_tg_pt_gp, S_IRUGO | S_IWUSR); + +/* + * alua_tg_pt_offline + */ +static ssize_t target_fabric_port_show_attr_alua_tg_pt_offline( + struct se_lun *lun, + char *page) +{ + if (!(lun)) + return -ENODEV; + + if (!(lun->lun_sep)) + return -ENODEV; + + return core_alua_show_offline_bit(lun, page); +} + +static ssize_t target_fabric_port_store_attr_alua_tg_pt_offline( + struct se_lun *lun, + const char *page, + size_t count) +{ + if (!(lun)) + return -ENODEV; + + if (!(lun->lun_sep)) + return -ENODEV; + + return core_alua_store_offline_bit(lun, page, count); +} + +TCM_PORT_ATTR(alua_tg_pt_offline, S_IRUGO | S_IWUSR); + +/* + * alua_tg_pt_status + */ +static ssize_t target_fabric_port_show_attr_alua_tg_pt_status( + struct se_lun *lun, + char *page) +{ + if (!(lun)) + return -ENODEV; + + if (!(lun->lun_sep)) + return -ENODEV; + + return core_alua_show_secondary_status(lun, page); +} + +static ssize_t target_fabric_port_store_attr_alua_tg_pt_status( + struct se_lun *lun, + const char *page, + size_t count) +{ + if (!(lun)) + return -ENODEV; + + if (!(lun->lun_sep)) + return -ENODEV; + + return core_alua_store_secondary_status(lun, page, count); +} + +TCM_PORT_ATTR(alua_tg_pt_status, S_IRUGO | S_IWUSR); + +/* + * alua_tg_pt_write_md + */ +static ssize_t target_fabric_port_show_attr_alua_tg_pt_write_md( + struct se_lun *lun, + char *page) +{ + if (!(lun)) + return -ENODEV; + + if (!(lun->lun_sep)) + return -ENODEV; + + return core_alua_show_secondary_write_metadata(lun, page); +} + +static ssize_t target_fabric_port_store_attr_alua_tg_pt_write_md( + struct se_lun *lun, + const char *page, + size_t count) +{ + if (!(lun)) + return -ENODEV; + + if (!(lun->lun_sep)) + return -ENODEV; + + return core_alua_store_secondary_write_metadata(lun, page, count); +} + +TCM_PORT_ATTR(alua_tg_pt_write_md, S_IRUGO | S_IWUSR); + + +static struct configfs_attribute *target_fabric_port_attrs[] = { + &target_fabric_port_alua_tg_pt_gp.attr, + &target_fabric_port_alua_tg_pt_offline.attr, + &target_fabric_port_alua_tg_pt_status.attr, + &target_fabric_port_alua_tg_pt_write_md.attr, + NULL, +}; + +CONFIGFS_EATTR_OPS(target_fabric_port, se_lun, lun_group); + +static int target_fabric_port_link( + struct config_item *lun_ci, + struct config_item *se_dev_ci) +{ + struct config_item *tpg_ci; + struct se_device *dev; + struct se_lun *lun = container_of(to_config_group(lun_ci), + struct se_lun, lun_group); + struct se_lun *lun_p; + struct se_portal_group *se_tpg; + struct se_subsystem_dev *se_dev = container_of( + to_config_group(se_dev_ci), struct se_subsystem_dev, + se_dev_group); + struct target_fabric_configfs *tf; + int ret; + + tpg_ci = &lun_ci->ci_parent->ci_group->cg_item; + se_tpg = container_of(to_config_group(tpg_ci), + struct se_portal_group, tpg_group); + tf = se_tpg->se_tpg_wwn->wwn_tf; + + if (lun->lun_se_dev != NULL) { + printk(KERN_ERR "Port Symlink already exists\n"); + return -EEXIST; + } + + dev = se_dev->se_dev_ptr; + if (!(dev)) { + printk(KERN_ERR "Unable to locate struct se_device pointer from" + " %s\n", config_item_name(se_dev_ci)); + ret = -ENODEV; + goto out; + } + + lun_p = core_dev_add_lun(se_tpg, dev->se_hba, dev, + lun->unpacked_lun); + if ((IS_ERR(lun_p)) || !(lun_p)) { + printk(KERN_ERR "core_dev_add_lun() failed\n"); + ret = -EINVAL; + goto out; + } + + if (tf->tf_ops.fabric_post_link) { + /* + * Call the optional fabric_post_link() to allow a + * fabric module to setup any additional state once + * core_dev_add_lun() has been called.. + */ + tf->tf_ops.fabric_post_link(se_tpg, lun); + } + + return 0; +out: + return ret; +} + +static int target_fabric_port_unlink( + struct config_item *lun_ci, + struct config_item *se_dev_ci) +{ + struct se_lun *lun = container_of(to_config_group(lun_ci), + struct se_lun, lun_group); + struct se_portal_group *se_tpg = lun->lun_sep->sep_tpg; + struct target_fabric_configfs *tf = se_tpg->se_tpg_wwn->wwn_tf; + + if (tf->tf_ops.fabric_pre_unlink) { + /* + * Call the optional fabric_pre_unlink() to allow a + * fabric module to release any additional stat before + * core_dev_del_lun() is called. + */ + tf->tf_ops.fabric_pre_unlink(se_tpg, lun); + } + + core_dev_del_lun(se_tpg, lun->unpacked_lun); + return 0; +} + +static struct configfs_item_operations target_fabric_port_item_ops = { + .show_attribute = target_fabric_port_attr_show, + .store_attribute = target_fabric_port_attr_store, + .allow_link = target_fabric_port_link, + .drop_link = target_fabric_port_unlink, +}; + +TF_CIT_SETUP(tpg_port, &target_fabric_port_item_ops, NULL, target_fabric_port_attrs); + +/* End of tfc_tpg_port_cit */ + +/* Start of tfc_tpg_lun_cit */ + +static struct config_group *target_fabric_make_lun( + struct config_group *group, + const char *name) +{ + struct se_lun *lun; + struct se_portal_group *se_tpg = container_of(group, + struct se_portal_group, tpg_lun_group); + struct target_fabric_configfs *tf = se_tpg->se_tpg_wwn->wwn_tf; + unsigned long unpacked_lun; + + if (strstr(name, "lun_") != name) { + printk(KERN_ERR "Unable to locate \'_\" in" + " \"lun_$LUN_NUMBER\"\n"); + return ERR_PTR(-EINVAL); + } + if (strict_strtoul(name + 4, 0, &unpacked_lun) || unpacked_lun > UINT_MAX) + return ERR_PTR(-EINVAL); + + lun = core_get_lun_from_tpg(se_tpg, unpacked_lun); + if (!(lun)) + return ERR_PTR(-EINVAL); + + config_group_init_type_name(&lun->lun_group, name, + &TF_CIT_TMPL(tf)->tfc_tpg_port_cit); + + return &lun->lun_group; +} + +static void target_fabric_drop_lun( + struct config_group *group, + struct config_item *item) +{ + config_item_put(item); +} + +static struct configfs_group_operations target_fabric_lun_group_ops = { + .make_group = &target_fabric_make_lun, + .drop_item = &target_fabric_drop_lun, +}; + +TF_CIT_SETUP(tpg_lun, NULL, &target_fabric_lun_group_ops, NULL); + +/* End of tfc_tpg_lun_cit */ + +/* Start of tfc_tpg_attrib_cit */ + +CONFIGFS_EATTR_OPS(target_fabric_tpg_attrib, se_portal_group, tpg_attrib_group); + +static struct configfs_item_operations target_fabric_tpg_attrib_item_ops = { + .show_attribute = target_fabric_tpg_attrib_attr_show, + .store_attribute = target_fabric_tpg_attrib_attr_store, +}; + +TF_CIT_SETUP(tpg_attrib, &target_fabric_tpg_attrib_item_ops, NULL, NULL); + +/* End of tfc_tpg_attrib_cit */ + +/* Start of tfc_tpg_param_cit */ + +CONFIGFS_EATTR_OPS(target_fabric_tpg_param, se_portal_group, tpg_param_group); + +static struct configfs_item_operations target_fabric_tpg_param_item_ops = { + .show_attribute = target_fabric_tpg_param_attr_show, + .store_attribute = target_fabric_tpg_param_attr_store, +}; + +TF_CIT_SETUP(tpg_param, &target_fabric_tpg_param_item_ops, NULL, NULL); + +/* End of tfc_tpg_param_cit */ + +/* Start of tfc_tpg_base_cit */ +/* + * For use with TF_TPG_ATTR() and TF_TPG_ATTR_RO() + */ +CONFIGFS_EATTR_OPS(target_fabric_tpg, se_portal_group, tpg_group); + +static struct configfs_item_operations target_fabric_tpg_base_item_ops = { + .show_attribute = target_fabric_tpg_attr_show, + .store_attribute = target_fabric_tpg_attr_store, +}; + +TF_CIT_SETUP(tpg_base, &target_fabric_tpg_base_item_ops, NULL, NULL); + +/* End of tfc_tpg_base_cit */ + +/* Start of tfc_tpg_cit */ + +static struct config_group *target_fabric_make_tpg( + struct config_group *group, + const char *name) +{ + struct se_wwn *wwn = container_of(group, struct se_wwn, wwn_group); + struct target_fabric_configfs *tf = wwn->wwn_tf; + struct se_portal_group *se_tpg; + + if (!(tf->tf_ops.fabric_make_tpg)) { + printk(KERN_ERR "tf->tf_ops.fabric_make_tpg is NULL\n"); + return ERR_PTR(-ENOSYS); + } + + se_tpg = tf->tf_ops.fabric_make_tpg(wwn, group, name); + if (!(se_tpg) || IS_ERR(se_tpg)) + return ERR_PTR(-EINVAL); + /* + * Setup default groups from pre-allocated se_tpg->tpg_default_groups + */ + se_tpg->tpg_group.default_groups = se_tpg->tpg_default_groups; + se_tpg->tpg_group.default_groups[0] = &se_tpg->tpg_lun_group; + se_tpg->tpg_group.default_groups[1] = &se_tpg->tpg_np_group; + se_tpg->tpg_group.default_groups[2] = &se_tpg->tpg_acl_group; + se_tpg->tpg_group.default_groups[3] = &se_tpg->tpg_attrib_group; + se_tpg->tpg_group.default_groups[4] = &se_tpg->tpg_param_group; + se_tpg->tpg_group.default_groups[5] = NULL; + + config_group_init_type_name(&se_tpg->tpg_group, name, + &TF_CIT_TMPL(tf)->tfc_tpg_base_cit); + config_group_init_type_name(&se_tpg->tpg_lun_group, "lun", + &TF_CIT_TMPL(tf)->tfc_tpg_lun_cit); + config_group_init_type_name(&se_tpg->tpg_np_group, "np", + &TF_CIT_TMPL(tf)->tfc_tpg_np_cit); + config_group_init_type_name(&se_tpg->tpg_acl_group, "acls", + &TF_CIT_TMPL(tf)->tfc_tpg_nacl_cit); + config_group_init_type_name(&se_tpg->tpg_attrib_group, "attrib", + &TF_CIT_TMPL(tf)->tfc_tpg_attrib_cit); + config_group_init_type_name(&se_tpg->tpg_param_group, "param", + &TF_CIT_TMPL(tf)->tfc_tpg_param_cit); + + return &se_tpg->tpg_group; +} + +static void target_fabric_drop_tpg( + struct config_group *group, + struct config_item *item) +{ + struct se_wwn *wwn = container_of(group, struct se_wwn, wwn_group); + struct target_fabric_configfs *tf = wwn->wwn_tf; + struct se_portal_group *se_tpg = container_of(to_config_group(item), + struct se_portal_group, tpg_group); + struct config_group *tpg_cg = &se_tpg->tpg_group; + struct config_item *df_item; + int i; + /* + * Release default groups, but do not release tpg_cg->default_groups + * memory as it is statically allocated at se_tpg->tpg_default_groups. + */ + for (i = 0; tpg_cg->default_groups[i]; i++) { + df_item = &tpg_cg->default_groups[i]->cg_item; + tpg_cg->default_groups[i] = NULL; + config_item_put(df_item); + } + + config_item_put(item); + tf->tf_ops.fabric_drop_tpg(se_tpg); +} + +static struct configfs_group_operations target_fabric_tpg_group_ops = { + .make_group = target_fabric_make_tpg, + .drop_item = target_fabric_drop_tpg, +}; + +TF_CIT_SETUP(tpg, NULL, &target_fabric_tpg_group_ops, NULL); + +/* End of tfc_tpg_cit */ + +/* Start of tfc_wwn_cit */ + +static struct config_group *target_fabric_make_wwn( + struct config_group *group, + const char *name) +{ + struct target_fabric_configfs *tf = container_of(group, + struct target_fabric_configfs, tf_group); + struct se_wwn *wwn; + + if (!(tf->tf_ops.fabric_make_wwn)) { + printk(KERN_ERR "tf->tf_ops.fabric_make_wwn is NULL\n"); + return ERR_PTR(-ENOSYS); + } + + wwn = tf->tf_ops.fabric_make_wwn(tf, group, name); + if (!(wwn) || IS_ERR(wwn)) + return ERR_PTR(-EINVAL); + + wwn->wwn_tf = tf; + config_group_init_type_name(&wwn->wwn_group, name, + &TF_CIT_TMPL(tf)->tfc_tpg_cit); + + return &wwn->wwn_group; +} + +static void target_fabric_drop_wwn( + struct config_group *group, + struct config_item *item) +{ + struct target_fabric_configfs *tf = container_of(group, + struct target_fabric_configfs, tf_group); + struct se_wwn *wwn = container_of(to_config_group(item), + struct se_wwn, wwn_group); + + config_item_put(item); + tf->tf_ops.fabric_drop_wwn(wwn); +} + +static struct configfs_group_operations target_fabric_wwn_group_ops = { + .make_group = target_fabric_make_wwn, + .drop_item = target_fabric_drop_wwn, +}; +/* + * For use with TF_WWN_ATTR() and TF_WWN_ATTR_RO() + */ +CONFIGFS_EATTR_OPS(target_fabric_wwn, target_fabric_configfs, tf_group); + +static struct configfs_item_operations target_fabric_wwn_item_ops = { + .show_attribute = target_fabric_wwn_attr_show, + .store_attribute = target_fabric_wwn_attr_store, +}; + +TF_CIT_SETUP(wwn, &target_fabric_wwn_item_ops, &target_fabric_wwn_group_ops, NULL); + +/* End of tfc_wwn_cit */ + +/* Start of tfc_discovery_cit */ + +CONFIGFS_EATTR_OPS(target_fabric_discovery, target_fabric_configfs, + tf_disc_group); + +static struct configfs_item_operations target_fabric_discovery_item_ops = { + .show_attribute = target_fabric_discovery_attr_show, + .store_attribute = target_fabric_discovery_attr_store, +}; + +TF_CIT_SETUP(discovery, &target_fabric_discovery_item_ops, NULL, NULL); + +/* End of tfc_discovery_cit */ + +int target_fabric_setup_cits(struct target_fabric_configfs *tf) +{ + target_fabric_setup_discovery_cit(tf); + target_fabric_setup_wwn_cit(tf); + target_fabric_setup_tpg_cit(tf); + target_fabric_setup_tpg_base_cit(tf); + target_fabric_setup_tpg_port_cit(tf); + target_fabric_setup_tpg_lun_cit(tf); + target_fabric_setup_tpg_np_cit(tf); + target_fabric_setup_tpg_np_base_cit(tf); + target_fabric_setup_tpg_attrib_cit(tf); + target_fabric_setup_tpg_param_cit(tf); + target_fabric_setup_tpg_nacl_cit(tf); + target_fabric_setup_tpg_nacl_base_cit(tf); + target_fabric_setup_tpg_nacl_attrib_cit(tf); + target_fabric_setup_tpg_nacl_auth_cit(tf); + target_fabric_setup_tpg_nacl_param_cit(tf); + target_fabric_setup_tpg_mappedlun_cit(tf); + + return 0; +} diff --git a/drivers/target/target_core_fabric_lib.c b/drivers/target/target_core_fabric_lib.c new file mode 100644 index 000000000000..26285644e4de --- /dev/null +++ b/drivers/target/target_core_fabric_lib.c @@ -0,0 +1,451 @@ +/******************************************************************************* + * Filename: target_core_fabric_lib.c + * + * This file contains generic high level protocol identifier and PR + * handlers for TCM fabric modules + * + * Copyright (c) 2010 Rising Tide Systems, Inc. + * Copyright (c) 2010 Linux-iSCSI.org + * + * Nicholas A. Bellinger + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. + * + ******************************************************************************/ + +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include + +#include "target_core_hba.h" +#include "target_core_pr.h" + +/* + * Handlers for Serial Attached SCSI (SAS) + */ +u8 sas_get_fabric_proto_ident(struct se_portal_group *se_tpg) +{ + /* + * Return a SAS Serial SCSI Protocol identifier for loopback operations + * This is defined in section 7.5.1 Table 362 in spc4r17 + */ + return 0x6; +} +EXPORT_SYMBOL(sas_get_fabric_proto_ident); + +u32 sas_get_pr_transport_id( + struct se_portal_group *se_tpg, + struct se_node_acl *se_nacl, + struct t10_pr_registration *pr_reg, + int *format_code, + unsigned char *buf) +{ + unsigned char binary, *ptr; + int i; + u32 off = 4; + /* + * Set PROTOCOL IDENTIFIER to 6h for SAS + */ + buf[0] = 0x06; + /* + * From spc4r17, 7.5.4.7 TransportID for initiator ports using SCSI + * over SAS Serial SCSI Protocol + */ + ptr = &se_nacl->initiatorname[4]; /* Skip over 'naa. prefix */ + + for (i = 0; i < 16; i += 2) { + binary = transport_asciihex_to_binaryhex(&ptr[i]); + buf[off++] = binary; + } + /* + * The SAS Transport ID is a hardcoded 24-byte length + */ + return 24; +} +EXPORT_SYMBOL(sas_get_pr_transport_id); + +u32 sas_get_pr_transport_id_len( + struct se_portal_group *se_tpg, + struct se_node_acl *se_nacl, + struct t10_pr_registration *pr_reg, + int *format_code) +{ + *format_code = 0; + /* + * From spc4r17, 7.5.4.7 TransportID for initiator ports using SCSI + * over SAS Serial SCSI Protocol + * + * The SAS Transport ID is a hardcoded 24-byte length + */ + return 24; +} +EXPORT_SYMBOL(sas_get_pr_transport_id_len); + +/* + * Used for handling SCSI fabric dependent TransportIDs in SPC-3 and above + * Persistent Reservation SPEC_I_PT=1 and PROUT REGISTER_AND_MOVE operations. + */ +char *sas_parse_pr_out_transport_id( + struct se_portal_group *se_tpg, + const char *buf, + u32 *out_tid_len, + char **port_nexus_ptr) +{ + /* + * Assume the FORMAT CODE 00b from spc4r17, 7.5.4.7 TransportID + * for initiator ports using SCSI over SAS Serial SCSI Protocol + * + * The TransportID for a SAS Initiator Port is of fixed size of + * 24 bytes, and SAS does not contain a I_T nexus identifier, + * so we return the **port_nexus_ptr set to NULL. + */ + *port_nexus_ptr = NULL; + *out_tid_len = 24; + + return (char *)&buf[4]; +} +EXPORT_SYMBOL(sas_parse_pr_out_transport_id); + +/* + * Handlers for Fibre Channel Protocol (FCP) + */ +u8 fc_get_fabric_proto_ident(struct se_portal_group *se_tpg) +{ + return 0x0; /* 0 = fcp-2 per SPC4 section 7.5.1 */ +} +EXPORT_SYMBOL(fc_get_fabric_proto_ident); + +u32 fc_get_pr_transport_id_len( + struct se_portal_group *se_tpg, + struct se_node_acl *se_nacl, + struct t10_pr_registration *pr_reg, + int *format_code) +{ + *format_code = 0; + /* + * The FC Transport ID is a hardcoded 24-byte length + */ + return 24; +} +EXPORT_SYMBOL(fc_get_pr_transport_id_len); + +u32 fc_get_pr_transport_id( + struct se_portal_group *se_tpg, + struct se_node_acl *se_nacl, + struct t10_pr_registration *pr_reg, + int *format_code, + unsigned char *buf) +{ + unsigned char binary, *ptr; + int i; + u32 off = 8; + /* + * PROTOCOL IDENTIFIER is 0h for FCP-2 + * + * From spc4r17, 7.5.4.2 TransportID for initiator ports using + * SCSI over Fibre Channel + * + * We convert the ASCII formatted N Port name into a binary + * encoded TransportID. + */ + ptr = &se_nacl->initiatorname[0]; + + for (i = 0; i < 24; ) { + if (!(strncmp(&ptr[i], ":", 1))) { + i++; + continue; + } + binary = transport_asciihex_to_binaryhex(&ptr[i]); + buf[off++] = binary; + i += 2; + } + /* + * The FC Transport ID is a hardcoded 24-byte length + */ + return 24; +} +EXPORT_SYMBOL(fc_get_pr_transport_id); + +char *fc_parse_pr_out_transport_id( + struct se_portal_group *se_tpg, + const char *buf, + u32 *out_tid_len, + char **port_nexus_ptr) +{ + /* + * The TransportID for a FC N Port is of fixed size of + * 24 bytes, and FC does not contain a I_T nexus identifier, + * so we return the **port_nexus_ptr set to NULL. + */ + *port_nexus_ptr = NULL; + *out_tid_len = 24; + + return (char *)&buf[8]; +} +EXPORT_SYMBOL(fc_parse_pr_out_transport_id); + +/* + * Handlers for Internet Small Computer Systems Interface (iSCSI) + */ + +u8 iscsi_get_fabric_proto_ident(struct se_portal_group *se_tpg) +{ + /* + * This value is defined for "Internet SCSI (iSCSI)" + * in spc4r17 section 7.5.1 Table 362 + */ + return 0x5; +} +EXPORT_SYMBOL(iscsi_get_fabric_proto_ident); + +u32 iscsi_get_pr_transport_id( + struct se_portal_group *se_tpg, + struct se_node_acl *se_nacl, + struct t10_pr_registration *pr_reg, + int *format_code, + unsigned char *buf) +{ + u32 off = 4, padding = 0; + u16 len = 0; + + spin_lock_irq(&se_nacl->nacl_sess_lock); + /* + * Set PROTOCOL IDENTIFIER to 5h for iSCSI + */ + buf[0] = 0x05; + /* + * From spc4r17 Section 7.5.4.6: TransportID for initiator + * ports using SCSI over iSCSI. + * + * The null-terminated, null-padded (see 4.4.2) ISCSI NAME field + * shall contain the iSCSI name of an iSCSI initiator node (see + * RFC 3720). The first ISCSI NAME field byte containing an ASCII + * null character terminates the ISCSI NAME field without regard for + * the specified length of the iSCSI TransportID or the contents of + * the ADDITIONAL LENGTH field. + */ + len = sprintf(&buf[off], "%s", se_nacl->initiatorname); + /* + * Add Extra byte for NULL terminator + */ + len++; + /* + * If there is ISID present with the registration and *format code == 1 + * 1, use iSCSI Initiator port TransportID format. + * + * Otherwise use iSCSI Initiator device TransportID format that + * does not contain the ASCII encoded iSCSI Initiator iSID value + * provied by the iSCSi Initiator during the iSCSI login process. + */ + if ((*format_code == 1) && (pr_reg->isid_present_at_reg)) { + /* + * Set FORMAT CODE 01b for iSCSI Initiator port TransportID + * format. + */ + buf[0] |= 0x40; + /* + * From spc4r17 Section 7.5.4.6: TransportID for initiator + * ports using SCSI over iSCSI. Table 390 + * + * The SEPARATOR field shall contain the five ASCII + * characters ",i,0x". + * + * The null-terminated, null-padded ISCSI INITIATOR SESSION ID + * field shall contain the iSCSI initiator session identifier + * (see RFC 3720) in the form of ASCII characters that are the + * hexadecimal digits converted from the binary iSCSI initiator + * session identifier value. The first ISCSI INITIATOR SESSION + * ID field byte containing an ASCII null character + */ + buf[off+len] = 0x2c; off++; /* ASCII Character: "," */ + buf[off+len] = 0x69; off++; /* ASCII Character: "i" */ + buf[off+len] = 0x2c; off++; /* ASCII Character: "," */ + buf[off+len] = 0x30; off++; /* ASCII Character: "0" */ + buf[off+len] = 0x78; off++; /* ASCII Character: "x" */ + len += 5; + buf[off+len] = pr_reg->pr_reg_isid[0]; off++; + buf[off+len] = pr_reg->pr_reg_isid[1]; off++; + buf[off+len] = pr_reg->pr_reg_isid[2]; off++; + buf[off+len] = pr_reg->pr_reg_isid[3]; off++; + buf[off+len] = pr_reg->pr_reg_isid[4]; off++; + buf[off+len] = pr_reg->pr_reg_isid[5]; off++; + buf[off+len] = '\0'; off++; + len += 7; + } + spin_unlock_irq(&se_nacl->nacl_sess_lock); + /* + * The ADDITIONAL LENGTH field specifies the number of bytes that follow + * in the TransportID. The additional length shall be at least 20 and + * shall be a multiple of four. + */ + padding = ((-len) & 3); + if (padding != 0) + len += padding; + + buf[2] = ((len >> 8) & 0xff); + buf[3] = (len & 0xff); + /* + * Increment value for total payload + header length for + * full status descriptor + */ + len += 4; + + return len; +} +EXPORT_SYMBOL(iscsi_get_pr_transport_id); + +u32 iscsi_get_pr_transport_id_len( + struct se_portal_group *se_tpg, + struct se_node_acl *se_nacl, + struct t10_pr_registration *pr_reg, + int *format_code) +{ + u32 len = 0, padding = 0; + + spin_lock_irq(&se_nacl->nacl_sess_lock); + len = strlen(se_nacl->initiatorname); + /* + * Add extra byte for NULL terminator + */ + len++; + /* + * If there is ISID present with the registration, use format code: + * 01b: iSCSI Initiator port TransportID format + * + * If there is not an active iSCSI session, use format code: + * 00b: iSCSI Initiator device TransportID format + */ + if (pr_reg->isid_present_at_reg) { + len += 5; /* For ",i,0x" ASCII seperator */ + len += 7; /* For iSCSI Initiator Session ID + Null terminator */ + *format_code = 1; + } else + *format_code = 0; + spin_unlock_irq(&se_nacl->nacl_sess_lock); + /* + * The ADDITIONAL LENGTH field specifies the number of bytes that follow + * in the TransportID. The additional length shall be at least 20 and + * shall be a multiple of four. + */ + padding = ((-len) & 3); + if (padding != 0) + len += padding; + /* + * Increment value for total payload + header length for + * full status descriptor + */ + len += 4; + + return len; +} +EXPORT_SYMBOL(iscsi_get_pr_transport_id_len); + +char *iscsi_parse_pr_out_transport_id( + struct se_portal_group *se_tpg, + const char *buf, + u32 *out_tid_len, + char **port_nexus_ptr) +{ + char *p; + u32 tid_len, padding; + int i; + u16 add_len; + u8 format_code = (buf[0] & 0xc0); + /* + * Check for FORMAT CODE 00b or 01b from spc4r17, section 7.5.4.6: + * + * TransportID for initiator ports using SCSI over iSCSI, + * from Table 388 -- iSCSI TransportID formats. + * + * 00b Initiator port is identified using the world wide unique + * SCSI device name of the iSCSI initiator + * device containing the initiator port (see table 389). + * 01b Initiator port is identified using the world wide unique + * initiator port identifier (see table 390).10b to 11b + * Reserved + */ + if ((format_code != 0x00) && (format_code != 0x40)) { + printk(KERN_ERR "Illegal format code: 0x%02x for iSCSI" + " Initiator Transport ID\n", format_code); + return NULL; + } + /* + * If the caller wants the TransportID Length, we set that value for the + * entire iSCSI Tarnsport ID now. + */ + if (out_tid_len != NULL) { + add_len = ((buf[2] >> 8) & 0xff); + add_len |= (buf[3] & 0xff); + + tid_len = strlen((char *)&buf[4]); + tid_len += 4; /* Add four bytes for iSCSI Transport ID header */ + tid_len += 1; /* Add one byte for NULL terminator */ + padding = ((-tid_len) & 3); + if (padding != 0) + tid_len += padding; + + if ((add_len + 4) != tid_len) { + printk(KERN_INFO "LIO-Target Extracted add_len: %hu " + "does not match calculated tid_len: %u," + " using tid_len instead\n", add_len+4, tid_len); + *out_tid_len = tid_len; + } else + *out_tid_len = (add_len + 4); + } + /* + * Check for ',i,0x' seperator between iSCSI Name and iSCSI Initiator + * Session ID as defined in Table 390 - iSCSI initiator port TransportID + * format. + */ + if (format_code == 0x40) { + p = strstr((char *)&buf[4], ",i,0x"); + if (!(p)) { + printk(KERN_ERR "Unable to locate \",i,0x\" seperator" + " for Initiator port identifier: %s\n", + (char *)&buf[4]); + return NULL; + } + *p = '\0'; /* Terminate iSCSI Name */ + p += 5; /* Skip over ",i,0x" seperator */ + + *port_nexus_ptr = p; + /* + * Go ahead and do the lower case conversion of the received + * 12 ASCII characters representing the ISID in the TransportID + * for comparision against the running iSCSI session's ISID from + * iscsi_target.c:lio_sess_get_initiator_sid() + */ + for (i = 0; i < 12; i++) { + if (isdigit(*p)) { + p++; + continue; + } + *p = tolower(*p); + p++; + } + } + + return (char *)&buf[4]; +} +EXPORT_SYMBOL(iscsi_parse_pr_out_transport_id); diff --git a/drivers/target/target_core_file.c b/drivers/target/target_core_file.c new file mode 100644 index 000000000000..0aaca885668f --- /dev/null +++ b/drivers/target/target_core_file.c @@ -0,0 +1,688 @@ +/******************************************************************************* + * Filename: target_core_file.c + * + * This file contains the Storage Engine <-> FILEIO transport specific functions + * + * Copyright (c) 2005 PyX Technologies, Inc. + * Copyright (c) 2005-2006 SBE, Inc. All Rights Reserved. + * Copyright (c) 2007-2010 Rising Tide Systems + * Copyright (c) 2008-2010 Linux-iSCSI.org + * + * Nicholas A. Bellinger + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. + * + ******************************************************************************/ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include + +#include "target_core_file.h" + +#if 1 +#define DEBUG_FD_CACHE(x...) printk(x) +#else +#define DEBUG_FD_CACHE(x...) +#endif + +#if 1 +#define DEBUG_FD_FUA(x...) printk(x) +#else +#define DEBUG_FD_FUA(x...) +#endif + +static struct se_subsystem_api fileio_template; + +/* fd_attach_hba(): (Part of se_subsystem_api_t template) + * + * + */ +static int fd_attach_hba(struct se_hba *hba, u32 host_id) +{ + struct fd_host *fd_host; + + fd_host = kzalloc(sizeof(struct fd_host), GFP_KERNEL); + if (!(fd_host)) { + printk(KERN_ERR "Unable to allocate memory for struct fd_host\n"); + return -1; + } + + fd_host->fd_host_id = host_id; + + atomic_set(&hba->left_queue_depth, FD_HBA_QUEUE_DEPTH); + atomic_set(&hba->max_queue_depth, FD_HBA_QUEUE_DEPTH); + hba->hba_ptr = (void *) fd_host; + + printk(KERN_INFO "CORE_HBA[%d] - TCM FILEIO HBA Driver %s on Generic" + " Target Core Stack %s\n", hba->hba_id, FD_VERSION, + TARGET_CORE_MOD_VERSION); + printk(KERN_INFO "CORE_HBA[%d] - Attached FILEIO HBA: %u to Generic" + " Target Core with TCQ Depth: %d MaxSectors: %u\n", + hba->hba_id, fd_host->fd_host_id, + atomic_read(&hba->max_queue_depth), FD_MAX_SECTORS); + + return 0; +} + +static void fd_detach_hba(struct se_hba *hba) +{ + struct fd_host *fd_host = hba->hba_ptr; + + printk(KERN_INFO "CORE_HBA[%d] - Detached FILEIO HBA: %u from Generic" + " Target Core\n", hba->hba_id, fd_host->fd_host_id); + + kfree(fd_host); + hba->hba_ptr = NULL; +} + +static void *fd_allocate_virtdevice(struct se_hba *hba, const char *name) +{ + struct fd_dev *fd_dev; + struct fd_host *fd_host = (struct fd_host *) hba->hba_ptr; + + fd_dev = kzalloc(sizeof(struct fd_dev), GFP_KERNEL); + if (!(fd_dev)) { + printk(KERN_ERR "Unable to allocate memory for struct fd_dev\n"); + return NULL; + } + + fd_dev->fd_host = fd_host; + + printk(KERN_INFO "FILEIO: Allocated fd_dev for %p\n", name); + + return fd_dev; +} + +/* fd_create_virtdevice(): (Part of se_subsystem_api_t template) + * + * + */ +static struct se_device *fd_create_virtdevice( + struct se_hba *hba, + struct se_subsystem_dev *se_dev, + void *p) +{ + char *dev_p = NULL; + struct se_device *dev; + struct se_dev_limits dev_limits; + struct queue_limits *limits; + struct fd_dev *fd_dev = (struct fd_dev *) p; + struct fd_host *fd_host = (struct fd_host *) hba->hba_ptr; + mm_segment_t old_fs; + struct file *file; + struct inode *inode = NULL; + int dev_flags = 0, flags; + + memset(&dev_limits, 0, sizeof(struct se_dev_limits)); + + old_fs = get_fs(); + set_fs(get_ds()); + dev_p = getname(fd_dev->fd_dev_name); + set_fs(old_fs); + + if (IS_ERR(dev_p)) { + printk(KERN_ERR "getname(%s) failed: %lu\n", + fd_dev->fd_dev_name, IS_ERR(dev_p)); + goto fail; + } +#if 0 + if (di->no_create_file) + flags = O_RDWR | O_LARGEFILE; + else + flags = O_RDWR | O_CREAT | O_LARGEFILE; +#else + flags = O_RDWR | O_CREAT | O_LARGEFILE; +#endif +/* flags |= O_DIRECT; */ + /* + * If fd_buffered_io=1 has not been set explictly (the default), + * use O_SYNC to force FILEIO writes to disk. + */ + if (!(fd_dev->fbd_flags & FDBD_USE_BUFFERED_IO)) + flags |= O_SYNC; + + file = filp_open(dev_p, flags, 0600); + + if (IS_ERR(file) || !file || !file->f_dentry) { + printk(KERN_ERR "filp_open(%s) failed\n", dev_p); + goto fail; + } + fd_dev->fd_file = file; + /* + * If using a block backend with this struct file, we extract + * fd_dev->fd_[block,dev]_size from struct block_device. + * + * Otherwise, we use the passed fd_size= from configfs + */ + inode = file->f_mapping->host; + if (S_ISBLK(inode->i_mode)) { + struct request_queue *q; + /* + * Setup the local scope queue_limits from struct request_queue->limits + * to pass into transport_add_device_to_core_hba() as struct se_dev_limits. + */ + q = bdev_get_queue(inode->i_bdev); + limits = &dev_limits.limits; + limits->logical_block_size = bdev_logical_block_size(inode->i_bdev); + limits->max_hw_sectors = queue_max_hw_sectors(q); + limits->max_sectors = queue_max_sectors(q); + /* + * Determine the number of bytes from i_size_read() minus + * one (1) logical sector from underlying struct block_device + */ + fd_dev->fd_block_size = bdev_logical_block_size(inode->i_bdev); + fd_dev->fd_dev_size = (i_size_read(file->f_mapping->host) - + fd_dev->fd_block_size); + + printk(KERN_INFO "FILEIO: Using size: %llu bytes from struct" + " block_device blocks: %llu logical_block_size: %d\n", + fd_dev->fd_dev_size, + div_u64(fd_dev->fd_dev_size, fd_dev->fd_block_size), + fd_dev->fd_block_size); + } else { + if (!(fd_dev->fbd_flags & FBDF_HAS_SIZE)) { + printk(KERN_ERR "FILEIO: Missing fd_dev_size=" + " parameter, and no backing struct" + " block_device\n"); + goto fail; + } + + limits = &dev_limits.limits; + limits->logical_block_size = FD_BLOCKSIZE; + limits->max_hw_sectors = FD_MAX_SECTORS; + limits->max_sectors = FD_MAX_SECTORS; + fd_dev->fd_block_size = FD_BLOCKSIZE; + } + + dev_limits.hw_queue_depth = FD_MAX_DEVICE_QUEUE_DEPTH; + dev_limits.queue_depth = FD_DEVICE_QUEUE_DEPTH; + + dev = transport_add_device_to_core_hba(hba, &fileio_template, + se_dev, dev_flags, (void *)fd_dev, + &dev_limits, "FILEIO", FD_VERSION); + if (!(dev)) + goto fail; + + fd_dev->fd_dev_id = fd_host->fd_host_dev_id_count++; + fd_dev->fd_queue_depth = dev->queue_depth; + + printk(KERN_INFO "CORE_FILE[%u] - Added TCM FILEIO Device ID: %u at %s," + " %llu total bytes\n", fd_host->fd_host_id, fd_dev->fd_dev_id, + fd_dev->fd_dev_name, fd_dev->fd_dev_size); + + putname(dev_p); + return dev; +fail: + if (fd_dev->fd_file) { + filp_close(fd_dev->fd_file, NULL); + fd_dev->fd_file = NULL; + } + putname(dev_p); + return NULL; +} + +/* fd_free_device(): (Part of se_subsystem_api_t template) + * + * + */ +static void fd_free_device(void *p) +{ + struct fd_dev *fd_dev = (struct fd_dev *) p; + + if (fd_dev->fd_file) { + filp_close(fd_dev->fd_file, NULL); + fd_dev->fd_file = NULL; + } + + kfree(fd_dev); +} + +static inline struct fd_request *FILE_REQ(struct se_task *task) +{ + return container_of(task, struct fd_request, fd_task); +} + + +static struct se_task * +fd_alloc_task(struct se_cmd *cmd) +{ + struct fd_request *fd_req; + + fd_req = kzalloc(sizeof(struct fd_request), GFP_KERNEL); + if (!(fd_req)) { + printk(KERN_ERR "Unable to allocate struct fd_request\n"); + return NULL; + } + + fd_req->fd_dev = SE_DEV(cmd)->dev_ptr; + + return &fd_req->fd_task; +} + +static int fd_do_readv(struct se_task *task) +{ + struct fd_request *req = FILE_REQ(task); + struct file *fd = req->fd_dev->fd_file; + struct scatterlist *sg = task->task_sg; + struct iovec *iov; + mm_segment_t old_fs; + loff_t pos = (task->task_lba * DEV_ATTRIB(task->se_dev)->block_size); + int ret = 0, i; + + iov = kzalloc(sizeof(struct iovec) * task->task_sg_num, GFP_KERNEL); + if (!(iov)) { + printk(KERN_ERR "Unable to allocate fd_do_readv iov[]\n"); + return -1; + } + + for (i = 0; i < task->task_sg_num; i++) { + iov[i].iov_len = sg[i].length; + iov[i].iov_base = sg_virt(&sg[i]); + } + + old_fs = get_fs(); + set_fs(get_ds()); + ret = vfs_readv(fd, &iov[0], task->task_sg_num, &pos); + set_fs(old_fs); + + kfree(iov); + /* + * Return zeros and GOOD status even if the READ did not return + * the expected virt_size for struct file w/o a backing struct + * block_device. + */ + if (S_ISBLK(fd->f_dentry->d_inode->i_mode)) { + if (ret < 0 || ret != task->task_size) { + printk(KERN_ERR "vfs_readv() returned %d," + " expecting %d for S_ISBLK\n", ret, + (int)task->task_size); + return -1; + } + } else { + if (ret < 0) { + printk(KERN_ERR "vfs_readv() returned %d for non" + " S_ISBLK\n", ret); + return -1; + } + } + + return 1; +} + +static int fd_do_writev(struct se_task *task) +{ + struct fd_request *req = FILE_REQ(task); + struct file *fd = req->fd_dev->fd_file; + struct scatterlist *sg = task->task_sg; + struct iovec *iov; + mm_segment_t old_fs; + loff_t pos = (task->task_lba * DEV_ATTRIB(task->se_dev)->block_size); + int ret, i = 0; + + iov = kzalloc(sizeof(struct iovec) * task->task_sg_num, GFP_KERNEL); + if (!(iov)) { + printk(KERN_ERR "Unable to allocate fd_do_writev iov[]\n"); + return -1; + } + + for (i = 0; i < task->task_sg_num; i++) { + iov[i].iov_len = sg[i].length; + iov[i].iov_base = sg_virt(&sg[i]); + } + + old_fs = get_fs(); + set_fs(get_ds()); + ret = vfs_writev(fd, &iov[0], task->task_sg_num, &pos); + set_fs(old_fs); + + kfree(iov); + + if (ret < 0 || ret != task->task_size) { + printk(KERN_ERR "vfs_writev() returned %d\n", ret); + return -1; + } + + return 1; +} + +static void fd_emulate_sync_cache(struct se_task *task) +{ + struct se_cmd *cmd = TASK_CMD(task); + struct se_device *dev = cmd->se_dev; + struct fd_dev *fd_dev = dev->dev_ptr; + int immed = (cmd->t_task->t_task_cdb[1] & 0x2); + loff_t start, end; + int ret; + + /* + * If the Immediate bit is set, queue up the GOOD response + * for this SYNCHRONIZE_CACHE op + */ + if (immed) + transport_complete_sync_cache(cmd, 1); + + /* + * Determine if we will be flushing the entire device. + */ + if (cmd->t_task->t_task_lba == 0 && cmd->data_length == 0) { + start = 0; + end = LLONG_MAX; + } else { + start = cmd->t_task->t_task_lba * DEV_ATTRIB(dev)->block_size; + if (cmd->data_length) + end = start + cmd->data_length; + else + end = LLONG_MAX; + } + + ret = vfs_fsync_range(fd_dev->fd_file, start, end, 1); + if (ret != 0) + printk(KERN_ERR "FILEIO: vfs_fsync_range() failed: %d\n", ret); + + if (!immed) + transport_complete_sync_cache(cmd, ret == 0); +} + +/* + * Tell TCM Core that we are capable of WriteCache emulation for + * an underlying struct se_device. + */ +static int fd_emulated_write_cache(struct se_device *dev) +{ + return 1; +} + +static int fd_emulated_dpo(struct se_device *dev) +{ + return 0; +} +/* + * Tell TCM Core that we will be emulating Forced Unit Access (FUA) for WRITEs + * for TYPE_DISK. + */ +static int fd_emulated_fua_write(struct se_device *dev) +{ + return 1; +} + +static int fd_emulated_fua_read(struct se_device *dev) +{ + return 0; +} + +/* + * WRITE Force Unit Access (FUA) emulation on a per struct se_task + * LBA range basis.. + */ +static void fd_emulate_write_fua(struct se_cmd *cmd, struct se_task *task) +{ + struct se_device *dev = cmd->se_dev; + struct fd_dev *fd_dev = dev->dev_ptr; + loff_t start = task->task_lba * DEV_ATTRIB(dev)->block_size; + loff_t end = start + task->task_size; + int ret; + + DEBUG_FD_CACHE("FILEIO: FUA WRITE LBA: %llu, bytes: %u\n", + task->task_lba, task->task_size); + + ret = vfs_fsync_range(fd_dev->fd_file, start, end, 1); + if (ret != 0) + printk(KERN_ERR "FILEIO: vfs_fsync_range() failed: %d\n", ret); +} + +static int fd_do_task(struct se_task *task) +{ + struct se_cmd *cmd = task->task_se_cmd; + struct se_device *dev = cmd->se_dev; + int ret = 0; + + /* + * Call vectorized fileio functions to map struct scatterlist + * physical memory addresses to struct iovec virtual memory. + */ + if (task->task_data_direction == DMA_FROM_DEVICE) { + ret = fd_do_readv(task); + } else { + ret = fd_do_writev(task); + + if (ret > 0 && + DEV_ATTRIB(dev)->emulate_write_cache > 0 && + DEV_ATTRIB(dev)->emulate_fua_write > 0 && + T_TASK(cmd)->t_tasks_fua) { + /* + * We might need to be a bit smarter here + * and return some sense data to let the initiator + * know the FUA WRITE cache sync failed..? + */ + fd_emulate_write_fua(cmd, task); + } + + } + + if (ret < 0) + return ret; + if (ret) { + task->task_scsi_status = GOOD; + transport_complete_task(task, 1); + } + return PYX_TRANSPORT_SENT_TO_TRANSPORT; +} + +/* fd_free_task(): (Part of se_subsystem_api_t template) + * + * + */ +static void fd_free_task(struct se_task *task) +{ + struct fd_request *req = FILE_REQ(task); + + kfree(req); +} + +enum { + Opt_fd_dev_name, Opt_fd_dev_size, Opt_fd_buffered_io, Opt_err +}; + +static match_table_t tokens = { + {Opt_fd_dev_name, "fd_dev_name=%s"}, + {Opt_fd_dev_size, "fd_dev_size=%s"}, + {Opt_fd_buffered_io, "fd_buffered_id=%d"}, + {Opt_err, NULL} +}; + +static ssize_t fd_set_configfs_dev_params( + struct se_hba *hba, + struct se_subsystem_dev *se_dev, + const char *page, ssize_t count) +{ + struct fd_dev *fd_dev = se_dev->se_dev_su_ptr; + char *orig, *ptr, *arg_p, *opts; + substring_t args[MAX_OPT_ARGS]; + int ret = 0, arg, token; + + opts = kstrdup(page, GFP_KERNEL); + if (!opts) + return -ENOMEM; + + orig = opts; + + while ((ptr = strsep(&opts, ",")) != NULL) { + if (!*ptr) + continue; + + token = match_token(ptr, tokens, args); + switch (token) { + case Opt_fd_dev_name: + snprintf(fd_dev->fd_dev_name, FD_MAX_DEV_NAME, + "%s", match_strdup(&args[0])); + printk(KERN_INFO "FILEIO: Referencing Path: %s\n", + fd_dev->fd_dev_name); + fd_dev->fbd_flags |= FBDF_HAS_PATH; + break; + case Opt_fd_dev_size: + arg_p = match_strdup(&args[0]); + ret = strict_strtoull(arg_p, 0, &fd_dev->fd_dev_size); + if (ret < 0) { + printk(KERN_ERR "strict_strtoull() failed for" + " fd_dev_size=\n"); + goto out; + } + printk(KERN_INFO "FILEIO: Referencing Size: %llu" + " bytes\n", fd_dev->fd_dev_size); + fd_dev->fbd_flags |= FBDF_HAS_SIZE; + break; + case Opt_fd_buffered_io: + match_int(args, &arg); + if (arg != 1) { + printk(KERN_ERR "bogus fd_buffered_io=%d value\n", arg); + ret = -EINVAL; + goto out; + } + + printk(KERN_INFO "FILEIO: Using buffered I/O" + " operations for struct fd_dev\n"); + + fd_dev->fbd_flags |= FDBD_USE_BUFFERED_IO; + break; + default: + break; + } + } + +out: + kfree(orig); + return (!ret) ? count : ret; +} + +static ssize_t fd_check_configfs_dev_params(struct se_hba *hba, struct se_subsystem_dev *se_dev) +{ + struct fd_dev *fd_dev = (struct fd_dev *) se_dev->se_dev_su_ptr; + + if (!(fd_dev->fbd_flags & FBDF_HAS_PATH)) { + printk(KERN_ERR "Missing fd_dev_name=\n"); + return -1; + } + + return 0; +} + +static ssize_t fd_show_configfs_dev_params( + struct se_hba *hba, + struct se_subsystem_dev *se_dev, + char *b) +{ + struct fd_dev *fd_dev = se_dev->se_dev_su_ptr; + ssize_t bl = 0; + + bl = sprintf(b + bl, "TCM FILEIO ID: %u", fd_dev->fd_dev_id); + bl += sprintf(b + bl, " File: %s Size: %llu Mode: %s\n", + fd_dev->fd_dev_name, fd_dev->fd_dev_size, + (fd_dev->fbd_flags & FDBD_USE_BUFFERED_IO) ? + "Buffered" : "Synchronous"); + return bl; +} + +/* fd_get_cdb(): (Part of se_subsystem_api_t template) + * + * + */ +static unsigned char *fd_get_cdb(struct se_task *task) +{ + struct fd_request *req = FILE_REQ(task); + + return req->fd_scsi_cdb; +} + +/* fd_get_device_rev(): (Part of se_subsystem_api_t template) + * + * + */ +static u32 fd_get_device_rev(struct se_device *dev) +{ + return SCSI_SPC_2; /* Returns SPC-3 in Initiator Data */ +} + +/* fd_get_device_type(): (Part of se_subsystem_api_t template) + * + * + */ +static u32 fd_get_device_type(struct se_device *dev) +{ + return TYPE_DISK; +} + +static sector_t fd_get_blocks(struct se_device *dev) +{ + struct fd_dev *fd_dev = dev->dev_ptr; + unsigned long long blocks_long = div_u64(fd_dev->fd_dev_size, + DEV_ATTRIB(dev)->block_size); + + return blocks_long; +} + +static struct se_subsystem_api fileio_template = { + .name = "fileio", + .owner = THIS_MODULE, + .transport_type = TRANSPORT_PLUGIN_VHBA_PDEV, + .attach_hba = fd_attach_hba, + .detach_hba = fd_detach_hba, + .allocate_virtdevice = fd_allocate_virtdevice, + .create_virtdevice = fd_create_virtdevice, + .free_device = fd_free_device, + .dpo_emulated = fd_emulated_dpo, + .fua_write_emulated = fd_emulated_fua_write, + .fua_read_emulated = fd_emulated_fua_read, + .write_cache_emulated = fd_emulated_write_cache, + .alloc_task = fd_alloc_task, + .do_task = fd_do_task, + .do_sync_cache = fd_emulate_sync_cache, + .free_task = fd_free_task, + .check_configfs_dev_params = fd_check_configfs_dev_params, + .set_configfs_dev_params = fd_set_configfs_dev_params, + .show_configfs_dev_params = fd_show_configfs_dev_params, + .get_cdb = fd_get_cdb, + .get_device_rev = fd_get_device_rev, + .get_device_type = fd_get_device_type, + .get_blocks = fd_get_blocks, +}; + +static int __init fileio_module_init(void) +{ + return transport_subsystem_register(&fileio_template); +} + +static void fileio_module_exit(void) +{ + transport_subsystem_release(&fileio_template); +} + +MODULE_DESCRIPTION("TCM FILEIO subsystem plugin"); +MODULE_AUTHOR("nab@Linux-iSCSI.org"); +MODULE_LICENSE("GPL"); + +module_init(fileio_module_init); +module_exit(fileio_module_exit); diff --git a/drivers/target/target_core_file.h b/drivers/target/target_core_file.h new file mode 100644 index 000000000000..ef4de2b4bd46 --- /dev/null +++ b/drivers/target/target_core_file.h @@ -0,0 +1,50 @@ +#ifndef TARGET_CORE_FILE_H +#define TARGET_CORE_FILE_H + +#define FD_VERSION "4.0" + +#define FD_MAX_DEV_NAME 256 +/* Maximum queuedepth for the FILEIO HBA */ +#define FD_HBA_QUEUE_DEPTH 256 +#define FD_DEVICE_QUEUE_DEPTH 32 +#define FD_MAX_DEVICE_QUEUE_DEPTH 128 +#define FD_BLOCKSIZE 512 +#define FD_MAX_SECTORS 1024 + +#define RRF_EMULATE_CDB 0x01 +#define RRF_GOT_LBA 0x02 + +struct fd_request { + struct se_task fd_task; + /* SCSI CDB from iSCSI Command PDU */ + unsigned char fd_scsi_cdb[TCM_MAX_COMMAND_SIZE]; + /* FILEIO device */ + struct fd_dev *fd_dev; +} ____cacheline_aligned; + +#define FBDF_HAS_PATH 0x01 +#define FBDF_HAS_SIZE 0x02 +#define FDBD_USE_BUFFERED_IO 0x04 + +struct fd_dev { + u32 fbd_flags; + unsigned char fd_dev_name[FD_MAX_DEV_NAME]; + /* Unique Ramdisk Device ID in Ramdisk HBA */ + u32 fd_dev_id; + /* Number of SG tables in sg_table_array */ + u32 fd_table_count; + u32 fd_queue_depth; + u32 fd_block_size; + unsigned long long fd_dev_size; + struct file *fd_file; + /* FILEIO HBA device is connected to */ + struct fd_host *fd_host; +} ____cacheline_aligned; + +struct fd_host { + u32 fd_host_dev_id_count; + /* Unique FILEIO Host ID */ + u32 fd_host_id; +} ____cacheline_aligned; + +#endif /* TARGET_CORE_FILE_H */ diff --git a/drivers/target/target_core_hba.c b/drivers/target/target_core_hba.c new file mode 100644 index 000000000000..4bbe8208b241 --- /dev/null +++ b/drivers/target/target_core_hba.c @@ -0,0 +1,185 @@ +/******************************************************************************* + * Filename: target_core_hba.c + * + * This file copntains the iSCSI HBA Transport related functions. + * + * Copyright (c) 2003, 2004, 2005 PyX Technologies, Inc. + * Copyright (c) 2005, 2006, 2007 SBE, Inc. + * Copyright (c) 2007-2010 Rising Tide Systems + * Copyright (c) 2008-2010 Linux-iSCSI.org + * + * Nicholas A. Bellinger + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. + * + ******************************************************************************/ + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include + +#include "target_core_hba.h" + +static LIST_HEAD(subsystem_list); +static DEFINE_MUTEX(subsystem_mutex); + +int transport_subsystem_register(struct se_subsystem_api *sub_api) +{ + struct se_subsystem_api *s; + + INIT_LIST_HEAD(&sub_api->sub_api_list); + + mutex_lock(&subsystem_mutex); + list_for_each_entry(s, &subsystem_list, sub_api_list) { + if (!(strcmp(s->name, sub_api->name))) { + printk(KERN_ERR "%p is already registered with" + " duplicate name %s, unable to process" + " request\n", s, s->name); + mutex_unlock(&subsystem_mutex); + return -EEXIST; + } + } + list_add_tail(&sub_api->sub_api_list, &subsystem_list); + mutex_unlock(&subsystem_mutex); + + printk(KERN_INFO "TCM: Registered subsystem plugin: %s struct module:" + " %p\n", sub_api->name, sub_api->owner); + return 0; +} +EXPORT_SYMBOL(transport_subsystem_register); + +void transport_subsystem_release(struct se_subsystem_api *sub_api) +{ + mutex_lock(&subsystem_mutex); + list_del(&sub_api->sub_api_list); + mutex_unlock(&subsystem_mutex); +} +EXPORT_SYMBOL(transport_subsystem_release); + +static struct se_subsystem_api *core_get_backend(const char *sub_name) +{ + struct se_subsystem_api *s; + + mutex_lock(&subsystem_mutex); + list_for_each_entry(s, &subsystem_list, sub_api_list) { + if (!strcmp(s->name, sub_name)) + goto found; + } + mutex_unlock(&subsystem_mutex); + return NULL; +found: + if (s->owner && !try_module_get(s->owner)) + s = NULL; + mutex_unlock(&subsystem_mutex); + return s; +} + +struct se_hba * +core_alloc_hba(const char *plugin_name, u32 plugin_dep_id, u32 hba_flags) +{ + struct se_hba *hba; + int ret = 0; + + hba = kzalloc(sizeof(*hba), GFP_KERNEL); + if (!hba) { + printk(KERN_ERR "Unable to allocate struct se_hba\n"); + return ERR_PTR(-ENOMEM); + } + + INIT_LIST_HEAD(&hba->hba_dev_list); + spin_lock_init(&hba->device_lock); + spin_lock_init(&hba->hba_queue_lock); + mutex_init(&hba->hba_access_mutex); + + hba->hba_index = scsi_get_new_index(SCSI_INST_INDEX); + hba->hba_flags |= hba_flags; + + atomic_set(&hba->max_queue_depth, 0); + atomic_set(&hba->left_queue_depth, 0); + + hba->transport = core_get_backend(plugin_name); + if (!hba->transport) { + ret = -EINVAL; + goto out_free_hba; + } + + ret = hba->transport->attach_hba(hba, plugin_dep_id); + if (ret < 0) + goto out_module_put; + + spin_lock(&se_global->hba_lock); + hba->hba_id = se_global->g_hba_id_counter++; + list_add_tail(&hba->hba_list, &se_global->g_hba_list); + spin_unlock(&se_global->hba_lock); + + printk(KERN_INFO "CORE_HBA[%d] - Attached HBA to Generic Target" + " Core\n", hba->hba_id); + + return hba; + +out_module_put: + if (hba->transport->owner) + module_put(hba->transport->owner); + hba->transport = NULL; +out_free_hba: + kfree(hba); + return ERR_PTR(ret); +} + +int +core_delete_hba(struct se_hba *hba) +{ + struct se_device *dev, *dev_tmp; + + spin_lock(&hba->device_lock); + list_for_each_entry_safe(dev, dev_tmp, &hba->hba_dev_list, dev_list) { + + se_clear_dev_ports(dev); + spin_unlock(&hba->device_lock); + + se_release_device_for_hba(dev); + + spin_lock(&hba->device_lock); + } + spin_unlock(&hba->device_lock); + + hba->transport->detach_hba(hba); + + spin_lock(&se_global->hba_lock); + list_del(&hba->hba_list); + spin_unlock(&se_global->hba_lock); + + printk(KERN_INFO "CORE_HBA[%d] - Detached HBA from Generic Target" + " Core\n", hba->hba_id); + + if (hba->transport->owner) + module_put(hba->transport->owner); + + hba->transport = NULL; + kfree(hba); + return 0; +} diff --git a/drivers/target/target_core_hba.h b/drivers/target/target_core_hba.h new file mode 100644 index 000000000000..bb0fea5f730c --- /dev/null +++ b/drivers/target/target_core_hba.h @@ -0,0 +1,7 @@ +#ifndef TARGET_CORE_HBA_H +#define TARGET_CORE_HBA_H + +extern struct se_hba *core_alloc_hba(const char *, u32, u32); +extern int core_delete_hba(struct se_hba *); + +#endif /* TARGET_CORE_HBA_H */ diff --git a/drivers/target/target_core_iblock.c b/drivers/target/target_core_iblock.c new file mode 100644 index 000000000000..c6e0d757e76e --- /dev/null +++ b/drivers/target/target_core_iblock.c @@ -0,0 +1,808 @@ +/******************************************************************************* + * Filename: target_core_iblock.c + * + * This file contains the Storage Engine <-> Linux BlockIO transport + * specific functions. + * + * Copyright (c) 2003, 2004, 2005 PyX Technologies, Inc. + * Copyright (c) 2005, 2006, 2007 SBE, Inc. + * Copyright (c) 2007-2010 Rising Tide Systems + * Copyright (c) 2008-2010 Linux-iSCSI.org + * + * Nicholas A. Bellinger + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. + * + ******************************************************************************/ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include + +#include "target_core_iblock.h" + +#if 0 +#define DEBUG_IBLOCK(x...) printk(x) +#else +#define DEBUG_IBLOCK(x...) +#endif + +static struct se_subsystem_api iblock_template; + +static void iblock_bio_done(struct bio *, int); + +/* iblock_attach_hba(): (Part of se_subsystem_api_t template) + * + * + */ +static int iblock_attach_hba(struct se_hba *hba, u32 host_id) +{ + struct iblock_hba *ib_host; + + ib_host = kzalloc(sizeof(struct iblock_hba), GFP_KERNEL); + if (!(ib_host)) { + printk(KERN_ERR "Unable to allocate memory for" + " struct iblock_hba\n"); + return -ENOMEM; + } + + ib_host->iblock_host_id = host_id; + + atomic_set(&hba->left_queue_depth, IBLOCK_HBA_QUEUE_DEPTH); + atomic_set(&hba->max_queue_depth, IBLOCK_HBA_QUEUE_DEPTH); + hba->hba_ptr = (void *) ib_host; + + printk(KERN_INFO "CORE_HBA[%d] - TCM iBlock HBA Driver %s on" + " Generic Target Core Stack %s\n", hba->hba_id, + IBLOCK_VERSION, TARGET_CORE_MOD_VERSION); + + printk(KERN_INFO "CORE_HBA[%d] - Attached iBlock HBA: %u to Generic" + " Target Core TCQ Depth: %d\n", hba->hba_id, + ib_host->iblock_host_id, atomic_read(&hba->max_queue_depth)); + + return 0; +} + +static void iblock_detach_hba(struct se_hba *hba) +{ + struct iblock_hba *ib_host = hba->hba_ptr; + + printk(KERN_INFO "CORE_HBA[%d] - Detached iBlock HBA: %u from Generic" + " Target Core\n", hba->hba_id, ib_host->iblock_host_id); + + kfree(ib_host); + hba->hba_ptr = NULL; +} + +static void *iblock_allocate_virtdevice(struct se_hba *hba, const char *name) +{ + struct iblock_dev *ib_dev = NULL; + struct iblock_hba *ib_host = hba->hba_ptr; + + ib_dev = kzalloc(sizeof(struct iblock_dev), GFP_KERNEL); + if (!(ib_dev)) { + printk(KERN_ERR "Unable to allocate struct iblock_dev\n"); + return NULL; + } + ib_dev->ibd_host = ib_host; + + printk(KERN_INFO "IBLOCK: Allocated ib_dev for %s\n", name); + + return ib_dev; +} + +static struct se_device *iblock_create_virtdevice( + struct se_hba *hba, + struct se_subsystem_dev *se_dev, + void *p) +{ + struct iblock_dev *ib_dev = p; + struct se_device *dev; + struct se_dev_limits dev_limits; + struct block_device *bd = NULL; + struct request_queue *q; + struct queue_limits *limits; + u32 dev_flags = 0; + + if (!(ib_dev)) { + printk(KERN_ERR "Unable to locate struct iblock_dev parameter\n"); + return 0; + } + memset(&dev_limits, 0, sizeof(struct se_dev_limits)); + /* + * These settings need to be made tunable.. + */ + ib_dev->ibd_bio_set = bioset_create(32, 64); + if (!(ib_dev->ibd_bio_set)) { + printk(KERN_ERR "IBLOCK: Unable to create bioset()\n"); + return 0; + } + printk(KERN_INFO "IBLOCK: Created bio_set()\n"); + /* + * iblock_check_configfs_dev_params() ensures that ib_dev->ibd_udev_path + * must already have been set in order for echo 1 > $HBA/$DEV/enable to run. + */ + printk(KERN_INFO "IBLOCK: Claiming struct block_device: %s\n", + ib_dev->ibd_udev_path); + + bd = blkdev_get_by_path(ib_dev->ibd_udev_path, + FMODE_WRITE|FMODE_READ|FMODE_EXCL, ib_dev); + if (!(bd)) + goto failed; + /* + * Setup the local scope queue_limits from struct request_queue->limits + * to pass into transport_add_device_to_core_hba() as struct se_dev_limits. + */ + q = bdev_get_queue(bd); + limits = &dev_limits.limits; + limits->logical_block_size = bdev_logical_block_size(bd); + limits->max_hw_sectors = queue_max_hw_sectors(q); + limits->max_sectors = queue_max_sectors(q); + dev_limits.hw_queue_depth = IBLOCK_MAX_DEVICE_QUEUE_DEPTH; + dev_limits.queue_depth = IBLOCK_DEVICE_QUEUE_DEPTH; + + ib_dev->ibd_major = MAJOR(bd->bd_dev); + ib_dev->ibd_minor = MINOR(bd->bd_dev); + ib_dev->ibd_bd = bd; + + dev = transport_add_device_to_core_hba(hba, + &iblock_template, se_dev, dev_flags, (void *)ib_dev, + &dev_limits, "IBLOCK", IBLOCK_VERSION); + if (!(dev)) + goto failed; + + ib_dev->ibd_depth = dev->queue_depth; + + /* + * Check if the underlying struct block_device request_queue supports + * the QUEUE_FLAG_DISCARD bit for UNMAP/WRITE_SAME in SCSI + TRIM + * in ATA and we need to set TPE=1 + */ + if (blk_queue_discard(bdev_get_queue(bd))) { + struct request_queue *q = bdev_get_queue(bd); + + DEV_ATTRIB(dev)->max_unmap_lba_count = + q->limits.max_discard_sectors; + /* + * Currently hardcoded to 1 in Linux/SCSI code.. + */ + DEV_ATTRIB(dev)->max_unmap_block_desc_count = 1; + DEV_ATTRIB(dev)->unmap_granularity = + q->limits.discard_granularity; + DEV_ATTRIB(dev)->unmap_granularity_alignment = + q->limits.discard_alignment; + + printk(KERN_INFO "IBLOCK: BLOCK Discard support available," + " disabled by default\n"); + } + + return dev; + +failed: + if (ib_dev->ibd_bio_set) { + bioset_free(ib_dev->ibd_bio_set); + ib_dev->ibd_bio_set = NULL; + } + ib_dev->ibd_bd = NULL; + ib_dev->ibd_major = 0; + ib_dev->ibd_minor = 0; + return NULL; +} + +static void iblock_free_device(void *p) +{ + struct iblock_dev *ib_dev = p; + + blkdev_put(ib_dev->ibd_bd, FMODE_WRITE|FMODE_READ|FMODE_EXCL); + bioset_free(ib_dev->ibd_bio_set); + kfree(ib_dev); +} + +static inline struct iblock_req *IBLOCK_REQ(struct se_task *task) +{ + return container_of(task, struct iblock_req, ib_task); +} + +static struct se_task * +iblock_alloc_task(struct se_cmd *cmd) +{ + struct iblock_req *ib_req; + + ib_req = kzalloc(sizeof(struct iblock_req), GFP_KERNEL); + if (!(ib_req)) { + printk(KERN_ERR "Unable to allocate memory for struct iblock_req\n"); + return NULL; + } + + ib_req->ib_dev = SE_DEV(cmd)->dev_ptr; + atomic_set(&ib_req->ib_bio_cnt, 0); + return &ib_req->ib_task; +} + +static unsigned long long iblock_emulate_read_cap_with_block_size( + struct se_device *dev, + struct block_device *bd, + struct request_queue *q) +{ + unsigned long long blocks_long = (div_u64(i_size_read(bd->bd_inode), + bdev_logical_block_size(bd)) - 1); + u32 block_size = bdev_logical_block_size(bd); + + if (block_size == DEV_ATTRIB(dev)->block_size) + return blocks_long; + + switch (block_size) { + case 4096: + switch (DEV_ATTRIB(dev)->block_size) { + case 2048: + blocks_long <<= 1; + break; + case 1024: + blocks_long <<= 2; + break; + case 512: + blocks_long <<= 3; + default: + break; + } + break; + case 2048: + switch (DEV_ATTRIB(dev)->block_size) { + case 4096: + blocks_long >>= 1; + break; + case 1024: + blocks_long <<= 1; + break; + case 512: + blocks_long <<= 2; + break; + default: + break; + } + break; + case 1024: + switch (DEV_ATTRIB(dev)->block_size) { + case 4096: + blocks_long >>= 2; + break; + case 2048: + blocks_long >>= 1; + break; + case 512: + blocks_long <<= 1; + break; + default: + break; + } + break; + case 512: + switch (DEV_ATTRIB(dev)->block_size) { + case 4096: + blocks_long >>= 3; + break; + case 2048: + blocks_long >>= 2; + break; + case 1024: + blocks_long >>= 1; + break; + default: + break; + } + break; + default: + break; + } + + return blocks_long; +} + +/* + * Emulate SYCHRONIZE_CACHE_* + */ +static void iblock_emulate_sync_cache(struct se_task *task) +{ + struct se_cmd *cmd = TASK_CMD(task); + struct iblock_dev *ib_dev = cmd->se_dev->dev_ptr; + int immed = (T_TASK(cmd)->t_task_cdb[1] & 0x2); + sector_t error_sector; + int ret; + + /* + * If the Immediate bit is set, queue up the GOOD response + * for this SYNCHRONIZE_CACHE op + */ + if (immed) + transport_complete_sync_cache(cmd, 1); + + /* + * blkdev_issue_flush() does not support a specifying a range, so + * we have to flush the entire cache. + */ + ret = blkdev_issue_flush(ib_dev->ibd_bd, GFP_KERNEL, &error_sector); + if (ret != 0) { + printk(KERN_ERR "IBLOCK: block_issue_flush() failed: %d " + " error_sector: %llu\n", ret, + (unsigned long long)error_sector); + } + + if (!immed) + transport_complete_sync_cache(cmd, ret == 0); +} + +/* + * Tell TCM Core that we are capable of WriteCache emulation for + * an underlying struct se_device. + */ +static int iblock_emulated_write_cache(struct se_device *dev) +{ + return 1; +} + +static int iblock_emulated_dpo(struct se_device *dev) +{ + return 0; +} + +/* + * Tell TCM Core that we will be emulating Forced Unit Access (FUA) for WRITEs + * for TYPE_DISK. + */ +static int iblock_emulated_fua_write(struct se_device *dev) +{ + return 1; +} + +static int iblock_emulated_fua_read(struct se_device *dev) +{ + return 0; +} + +static int iblock_do_task(struct se_task *task) +{ + struct se_device *dev = task->task_se_cmd->se_dev; + struct iblock_req *req = IBLOCK_REQ(task); + struct iblock_dev *ibd = (struct iblock_dev *)req->ib_dev; + struct request_queue *q = bdev_get_queue(ibd->ibd_bd); + struct bio *bio = req->ib_bio, *nbio = NULL; + int rw; + + if (task->task_data_direction == DMA_TO_DEVICE) { + /* + * Force data to disk if we pretend to not have a volatile + * write cache, or the initiator set the Force Unit Access bit. + */ + if (DEV_ATTRIB(dev)->emulate_write_cache == 0 || + (DEV_ATTRIB(dev)->emulate_fua_write > 0 && + T_TASK(task->task_se_cmd)->t_tasks_fua)) + rw = WRITE_FUA; + else + rw = WRITE; + } else { + rw = READ; + } + + while (bio) { + nbio = bio->bi_next; + bio->bi_next = NULL; + DEBUG_IBLOCK("Calling submit_bio() task: %p bio: %p" + " bio->bi_sector: %llu\n", task, bio, bio->bi_sector); + + submit_bio(rw, bio); + bio = nbio; + } + + if (q->unplug_fn) + q->unplug_fn(q); + return PYX_TRANSPORT_SENT_TO_TRANSPORT; +} + +static int iblock_do_discard(struct se_device *dev, sector_t lba, u32 range) +{ + struct iblock_dev *ibd = dev->dev_ptr; + struct block_device *bd = ibd->ibd_bd; + int barrier = 0; + + return blkdev_issue_discard(bd, lba, range, GFP_KERNEL, barrier); +} + +static void iblock_free_task(struct se_task *task) +{ + struct iblock_req *req = IBLOCK_REQ(task); + struct bio *bio, *hbio = req->ib_bio; + /* + * We only release the bio(s) here if iblock_bio_done() has not called + * bio_put() -> iblock_bio_destructor(). + */ + while (hbio != NULL) { + bio = hbio; + hbio = hbio->bi_next; + bio->bi_next = NULL; + bio_put(bio); + } + + kfree(req); +} + +enum { + Opt_udev_path, Opt_force, Opt_err +}; + +static match_table_t tokens = { + {Opt_udev_path, "udev_path=%s"}, + {Opt_force, "force=%d"}, + {Opt_err, NULL} +}; + +static ssize_t iblock_set_configfs_dev_params(struct se_hba *hba, + struct se_subsystem_dev *se_dev, + const char *page, ssize_t count) +{ + struct iblock_dev *ib_dev = se_dev->se_dev_su_ptr; + char *orig, *ptr, *opts; + substring_t args[MAX_OPT_ARGS]; + int ret = 0, arg, token; + + opts = kstrdup(page, GFP_KERNEL); + if (!opts) + return -ENOMEM; + + orig = opts; + + while ((ptr = strsep(&opts, ",")) != NULL) { + if (!*ptr) + continue; + + token = match_token(ptr, tokens, args); + switch (token) { + case Opt_udev_path: + if (ib_dev->ibd_bd) { + printk(KERN_ERR "Unable to set udev_path= while" + " ib_dev->ibd_bd exists\n"); + ret = -EEXIST; + goto out; + } + + ret = snprintf(ib_dev->ibd_udev_path, SE_UDEV_PATH_LEN, + "%s", match_strdup(&args[0])); + printk(KERN_INFO "IBLOCK: Referencing UDEV path: %s\n", + ib_dev->ibd_udev_path); + ib_dev->ibd_flags |= IBDF_HAS_UDEV_PATH; + break; + case Opt_force: + match_int(args, &arg); + ib_dev->ibd_force = arg; + printk(KERN_INFO "IBLOCK: Set force=%d\n", + ib_dev->ibd_force); + break; + default: + break; + } + } + +out: + kfree(orig); + return (!ret) ? count : ret; +} + +static ssize_t iblock_check_configfs_dev_params( + struct se_hba *hba, + struct se_subsystem_dev *se_dev) +{ + struct iblock_dev *ibd = se_dev->se_dev_su_ptr; + + if (!(ibd->ibd_flags & IBDF_HAS_UDEV_PATH)) { + printk(KERN_ERR "Missing udev_path= parameters for IBLOCK\n"); + return -1; + } + + return 0; +} + +static ssize_t iblock_show_configfs_dev_params( + struct se_hba *hba, + struct se_subsystem_dev *se_dev, + char *b) +{ + struct iblock_dev *ibd = se_dev->se_dev_su_ptr; + struct block_device *bd = ibd->ibd_bd; + char buf[BDEVNAME_SIZE]; + ssize_t bl = 0; + + if (bd) + bl += sprintf(b + bl, "iBlock device: %s", + bdevname(bd, buf)); + if (ibd->ibd_flags & IBDF_HAS_UDEV_PATH) { + bl += sprintf(b + bl, " UDEV PATH: %s\n", + ibd->ibd_udev_path); + } else + bl += sprintf(b + bl, "\n"); + + bl += sprintf(b + bl, " "); + if (bd) { + bl += sprintf(b + bl, "Major: %d Minor: %d %s\n", + ibd->ibd_major, ibd->ibd_minor, (!bd->bd_contains) ? + "" : (bd->bd_holder == (struct iblock_dev *)ibd) ? + "CLAIMED: IBLOCK" : "CLAIMED: OS"); + } else { + bl += sprintf(b + bl, "Major: %d Minor: %d\n", + ibd->ibd_major, ibd->ibd_minor); + } + + return bl; +} + +static void iblock_bio_destructor(struct bio *bio) +{ + struct se_task *task = bio->bi_private; + struct iblock_dev *ib_dev = task->se_dev->dev_ptr; + + bio_free(bio, ib_dev->ibd_bio_set); +} + +static struct bio *iblock_get_bio( + struct se_task *task, + struct iblock_req *ib_req, + struct iblock_dev *ib_dev, + int *ret, + sector_t lba, + u32 sg_num) +{ + struct bio *bio; + + bio = bio_alloc_bioset(GFP_NOIO, sg_num, ib_dev->ibd_bio_set); + if (!(bio)) { + printk(KERN_ERR "Unable to allocate memory for bio\n"); + *ret = PYX_TRANSPORT_OUT_OF_MEMORY_RESOURCES; + return NULL; + } + + DEBUG_IBLOCK("Allocated bio: %p task_sg_num: %u using ibd_bio_set:" + " %p\n", bio, task->task_sg_num, ib_dev->ibd_bio_set); + DEBUG_IBLOCK("Allocated bio: %p task_size: %u\n", bio, task->task_size); + + bio->bi_bdev = ib_dev->ibd_bd; + bio->bi_private = (void *) task; + bio->bi_destructor = iblock_bio_destructor; + bio->bi_end_io = &iblock_bio_done; + bio->bi_sector = lba; + atomic_inc(&ib_req->ib_bio_cnt); + + DEBUG_IBLOCK("Set bio->bi_sector: %llu\n", bio->bi_sector); + DEBUG_IBLOCK("Set ib_req->ib_bio_cnt: %d\n", + atomic_read(&ib_req->ib_bio_cnt)); + return bio; +} + +static int iblock_map_task_SG(struct se_task *task) +{ + struct se_cmd *cmd = task->task_se_cmd; + struct se_device *dev = SE_DEV(cmd); + struct iblock_dev *ib_dev = task->se_dev->dev_ptr; + struct iblock_req *ib_req = IBLOCK_REQ(task); + struct bio *bio = NULL, *hbio = NULL, *tbio = NULL; + struct scatterlist *sg; + int ret = 0; + u32 i, sg_num = task->task_sg_num; + sector_t block_lba; + /* + * Do starting conversion up from non 512-byte blocksize with + * struct se_task SCSI blocksize into Linux/Block 512 units for BIO. + */ + if (DEV_ATTRIB(dev)->block_size == 4096) + block_lba = (task->task_lba << 3); + else if (DEV_ATTRIB(dev)->block_size == 2048) + block_lba = (task->task_lba << 2); + else if (DEV_ATTRIB(dev)->block_size == 1024) + block_lba = (task->task_lba << 1); + else if (DEV_ATTRIB(dev)->block_size == 512) + block_lba = task->task_lba; + else { + printk(KERN_ERR "Unsupported SCSI -> BLOCK LBA conversion:" + " %u\n", DEV_ATTRIB(dev)->block_size); + return PYX_TRANSPORT_LU_COMM_FAILURE; + } + + bio = iblock_get_bio(task, ib_req, ib_dev, &ret, block_lba, sg_num); + if (!(bio)) + return ret; + + ib_req->ib_bio = bio; + hbio = tbio = bio; + /* + * Use fs/bio.c:bio_add_pages() to setup the bio_vec maplist + * from TCM struct se_mem -> task->task_sg -> struct scatterlist memory. + */ + for_each_sg(task->task_sg, sg, task->task_sg_num, i) { + DEBUG_IBLOCK("task: %p bio: %p Calling bio_add_page(): page:" + " %p len: %u offset: %u\n", task, bio, sg_page(sg), + sg->length, sg->offset); +again: + ret = bio_add_page(bio, sg_page(sg), sg->length, sg->offset); + if (ret != sg->length) { + + DEBUG_IBLOCK("*** Set bio->bi_sector: %llu\n", + bio->bi_sector); + DEBUG_IBLOCK("** task->task_size: %u\n", + task->task_size); + DEBUG_IBLOCK("*** bio->bi_max_vecs: %u\n", + bio->bi_max_vecs); + DEBUG_IBLOCK("*** bio->bi_vcnt: %u\n", + bio->bi_vcnt); + + bio = iblock_get_bio(task, ib_req, ib_dev, &ret, + block_lba, sg_num); + if (!(bio)) + goto fail; + + tbio = tbio->bi_next = bio; + DEBUG_IBLOCK("-----------------> Added +1 bio: %p to" + " list, Going to again\n", bio); + goto again; + } + /* Always in 512 byte units for Linux/Block */ + block_lba += sg->length >> IBLOCK_LBA_SHIFT; + sg_num--; + DEBUG_IBLOCK("task: %p bio-add_page() passed!, decremented" + " sg_num to %u\n", task, sg_num); + DEBUG_IBLOCK("task: %p bio_add_page() passed!, increased lba" + " to %llu\n", task, block_lba); + DEBUG_IBLOCK("task: %p bio_add_page() passed!, bio->bi_vcnt:" + " %u\n", task, bio->bi_vcnt); + } + + return 0; +fail: + while (hbio) { + bio = hbio; + hbio = hbio->bi_next; + bio->bi_next = NULL; + bio_put(bio); + } + return ret; +} + +static unsigned char *iblock_get_cdb(struct se_task *task) +{ + return IBLOCK_REQ(task)->ib_scsi_cdb; +} + +static u32 iblock_get_device_rev(struct se_device *dev) +{ + return SCSI_SPC_2; /* Returns SPC-3 in Initiator Data */ +} + +static u32 iblock_get_device_type(struct se_device *dev) +{ + return TYPE_DISK; +} + +static sector_t iblock_get_blocks(struct se_device *dev) +{ + struct iblock_dev *ibd = dev->dev_ptr; + struct block_device *bd = ibd->ibd_bd; + struct request_queue *q = bdev_get_queue(bd); + + return iblock_emulate_read_cap_with_block_size(dev, bd, q); +} + +static void iblock_bio_done(struct bio *bio, int err) +{ + struct se_task *task = bio->bi_private; + struct iblock_req *ibr = IBLOCK_REQ(task); + /* + * Set -EIO if !BIO_UPTODATE and the passed is still err=0 + */ + if (!(test_bit(BIO_UPTODATE, &bio->bi_flags)) && !(err)) + err = -EIO; + + if (err != 0) { + printk(KERN_ERR "test_bit(BIO_UPTODATE) failed for bio: %p," + " err: %d\n", bio, err); + /* + * Bump the ib_bio_err_cnt and release bio. + */ + atomic_inc(&ibr->ib_bio_err_cnt); + smp_mb__after_atomic_inc(); + bio_put(bio); + /* + * Wait to complete the task until the last bio as completed. + */ + if (!(atomic_dec_and_test(&ibr->ib_bio_cnt))) + return; + + ibr->ib_bio = NULL; + transport_complete_task(task, 0); + return; + } + DEBUG_IBLOCK("done[%p] bio: %p task_lba: %llu bio_lba: %llu err=%d\n", + task, bio, task->task_lba, bio->bi_sector, err); + /* + * bio_put() will call iblock_bio_destructor() to release the bio back + * to ibr->ib_bio_set. + */ + bio_put(bio); + /* + * Wait to complete the task until the last bio as completed. + */ + if (!(atomic_dec_and_test(&ibr->ib_bio_cnt))) + return; + /* + * Return GOOD status for task if zero ib_bio_err_cnt exists. + */ + ibr->ib_bio = NULL; + transport_complete_task(task, (!atomic_read(&ibr->ib_bio_err_cnt))); +} + +static struct se_subsystem_api iblock_template = { + .name = "iblock", + .owner = THIS_MODULE, + .transport_type = TRANSPORT_PLUGIN_VHBA_PDEV, + .map_task_SG = iblock_map_task_SG, + .attach_hba = iblock_attach_hba, + .detach_hba = iblock_detach_hba, + .allocate_virtdevice = iblock_allocate_virtdevice, + .create_virtdevice = iblock_create_virtdevice, + .free_device = iblock_free_device, + .dpo_emulated = iblock_emulated_dpo, + .fua_write_emulated = iblock_emulated_fua_write, + .fua_read_emulated = iblock_emulated_fua_read, + .write_cache_emulated = iblock_emulated_write_cache, + .alloc_task = iblock_alloc_task, + .do_task = iblock_do_task, + .do_discard = iblock_do_discard, + .do_sync_cache = iblock_emulate_sync_cache, + .free_task = iblock_free_task, + .check_configfs_dev_params = iblock_check_configfs_dev_params, + .set_configfs_dev_params = iblock_set_configfs_dev_params, + .show_configfs_dev_params = iblock_show_configfs_dev_params, + .get_cdb = iblock_get_cdb, + .get_device_rev = iblock_get_device_rev, + .get_device_type = iblock_get_device_type, + .get_blocks = iblock_get_blocks, +}; + +static int __init iblock_module_init(void) +{ + return transport_subsystem_register(&iblock_template); +} + +static void iblock_module_exit(void) +{ + transport_subsystem_release(&iblock_template); +} + +MODULE_DESCRIPTION("TCM IBLOCK subsystem plugin"); +MODULE_AUTHOR("nab@Linux-iSCSI.org"); +MODULE_LICENSE("GPL"); + +module_init(iblock_module_init); +module_exit(iblock_module_exit); diff --git a/drivers/target/target_core_iblock.h b/drivers/target/target_core_iblock.h new file mode 100644 index 000000000000..64c1f4d69f76 --- /dev/null +++ b/drivers/target/target_core_iblock.h @@ -0,0 +1,40 @@ +#ifndef TARGET_CORE_IBLOCK_H +#define TARGET_CORE_IBLOCK_H + +#define IBLOCK_VERSION "4.0" + +#define IBLOCK_HBA_QUEUE_DEPTH 512 +#define IBLOCK_DEVICE_QUEUE_DEPTH 32 +#define IBLOCK_MAX_DEVICE_QUEUE_DEPTH 128 +#define IBLOCK_MAX_CDBS 16 +#define IBLOCK_LBA_SHIFT 9 + +struct iblock_req { + struct se_task ib_task; + unsigned char ib_scsi_cdb[TCM_MAX_COMMAND_SIZE]; + atomic_t ib_bio_cnt; + atomic_t ib_bio_err_cnt; + struct bio *ib_bio; + struct iblock_dev *ib_dev; +} ____cacheline_aligned; + +#define IBDF_HAS_UDEV_PATH 0x01 +#define IBDF_HAS_FORCE 0x02 + +struct iblock_dev { + unsigned char ibd_udev_path[SE_UDEV_PATH_LEN]; + int ibd_force; + int ibd_major; + int ibd_minor; + u32 ibd_depth; + u32 ibd_flags; + struct bio_set *ibd_bio_set; + struct block_device *ibd_bd; + struct iblock_hba *ibd_host; +} ____cacheline_aligned; + +struct iblock_hba { + int iblock_host_id; +} ____cacheline_aligned; + +#endif /* TARGET_CORE_IBLOCK_H */ diff --git a/drivers/target/target_core_mib.c b/drivers/target/target_core_mib.c new file mode 100644 index 000000000000..d5a48aa0d2d1 --- /dev/null +++ b/drivers/target/target_core_mib.c @@ -0,0 +1,1078 @@ +/******************************************************************************* + * Filename: target_core_mib.c + * + * Copyright (c) 2006-2007 SBE, Inc. All Rights Reserved. + * Copyright (c) 2007-2010 Rising Tide Systems + * Copyright (c) 2008-2010 Linux-iSCSI.org + * + * Nicholas A. Bellinger + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. + * + ******************************************************************************/ + + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include + +#include "target_core_hba.h" +#include "target_core_mib.h" + +/* SCSI mib table index */ +static struct scsi_index_table scsi_index_table; + +#ifndef INITIAL_JIFFIES +#define INITIAL_JIFFIES ((unsigned long)(unsigned int) (-300*HZ)) +#endif + +/* SCSI Instance Table */ +#define SCSI_INST_SW_INDEX 1 +#define SCSI_TRANSPORT_INDEX 1 + +#define NONE "None" +#define ISPRINT(a) ((a >= ' ') && (a <= '~')) + +static inline int list_is_first(const struct list_head *list, + const struct list_head *head) +{ + return list->prev == head; +} + +static void *locate_hba_start( + struct seq_file *seq, + loff_t *pos) +{ + spin_lock(&se_global->g_device_lock); + return seq_list_start(&se_global->g_se_dev_list, *pos); +} + +static void *locate_hba_next( + struct seq_file *seq, + void *v, + loff_t *pos) +{ + return seq_list_next(v, &se_global->g_se_dev_list, pos); +} + +static void locate_hba_stop(struct seq_file *seq, void *v) +{ + spin_unlock(&se_global->g_device_lock); +} + +/**************************************************************************** + * SCSI MIB Tables + ****************************************************************************/ + +/* + * SCSI Instance Table + */ +static void *scsi_inst_seq_start( + struct seq_file *seq, + loff_t *pos) +{ + spin_lock(&se_global->hba_lock); + return seq_list_start(&se_global->g_hba_list, *pos); +} + +static void *scsi_inst_seq_next( + struct seq_file *seq, + void *v, + loff_t *pos) +{ + return seq_list_next(v, &se_global->g_hba_list, pos); +} + +static void scsi_inst_seq_stop(struct seq_file *seq, void *v) +{ + spin_unlock(&se_global->hba_lock); +} + +static int scsi_inst_seq_show(struct seq_file *seq, void *v) +{ + struct se_hba *hba = list_entry(v, struct se_hba, hba_list); + + if (list_is_first(&hba->hba_list, &se_global->g_hba_list)) + seq_puts(seq, "inst sw_indx\n"); + + seq_printf(seq, "%u %u\n", hba->hba_index, SCSI_INST_SW_INDEX); + seq_printf(seq, "plugin: %s version: %s\n", + hba->transport->name, TARGET_CORE_VERSION); + + return 0; +} + +static const struct seq_operations scsi_inst_seq_ops = { + .start = scsi_inst_seq_start, + .next = scsi_inst_seq_next, + .stop = scsi_inst_seq_stop, + .show = scsi_inst_seq_show +}; + +static int scsi_inst_seq_open(struct inode *inode, struct file *file) +{ + return seq_open(file, &scsi_inst_seq_ops); +} + +static const struct file_operations scsi_inst_seq_fops = { + .owner = THIS_MODULE, + .open = scsi_inst_seq_open, + .read = seq_read, + .llseek = seq_lseek, + .release = seq_release, +}; + +/* + * SCSI Device Table + */ +static void *scsi_dev_seq_start(struct seq_file *seq, loff_t *pos) +{ + return locate_hba_start(seq, pos); +} + +static void *scsi_dev_seq_next(struct seq_file *seq, void *v, loff_t *pos) +{ + return locate_hba_next(seq, v, pos); +} + +static void scsi_dev_seq_stop(struct seq_file *seq, void *v) +{ + locate_hba_stop(seq, v); +} + +static int scsi_dev_seq_show(struct seq_file *seq, void *v) +{ + struct se_hba *hba; + struct se_subsystem_dev *se_dev = list_entry(v, struct se_subsystem_dev, + g_se_dev_list); + struct se_device *dev = se_dev->se_dev_ptr; + char str[28]; + int k; + + if (list_is_first(&se_dev->g_se_dev_list, &se_global->g_se_dev_list)) + seq_puts(seq, "inst indx role ports\n"); + + if (!(dev)) + return 0; + + hba = dev->se_hba; + if (!(hba)) { + /* Log error ? */ + return 0; + } + + seq_printf(seq, "%u %u %s %u\n", hba->hba_index, + dev->dev_index, "Target", dev->dev_port_count); + + memcpy(&str[0], (void *)DEV_T10_WWN(dev), 28); + + /* vendor */ + for (k = 0; k < 8; k++) + str[k] = ISPRINT(DEV_T10_WWN(dev)->vendor[k]) ? + DEV_T10_WWN(dev)->vendor[k] : 0x20; + str[k] = 0x20; + + /* model */ + for (k = 0; k < 16; k++) + str[k+9] = ISPRINT(DEV_T10_WWN(dev)->model[k]) ? + DEV_T10_WWN(dev)->model[k] : 0x20; + str[k + 9] = 0; + + seq_printf(seq, "dev_alias: %s\n", str); + + return 0; +} + +static const struct seq_operations scsi_dev_seq_ops = { + .start = scsi_dev_seq_start, + .next = scsi_dev_seq_next, + .stop = scsi_dev_seq_stop, + .show = scsi_dev_seq_show +}; + +static int scsi_dev_seq_open(struct inode *inode, struct file *file) +{ + return seq_open(file, &scsi_dev_seq_ops); +} + +static const struct file_operations scsi_dev_seq_fops = { + .owner = THIS_MODULE, + .open = scsi_dev_seq_open, + .read = seq_read, + .llseek = seq_lseek, + .release = seq_release, +}; + +/* + * SCSI Port Table + */ +static void *scsi_port_seq_start(struct seq_file *seq, loff_t *pos) +{ + return locate_hba_start(seq, pos); +} + +static void *scsi_port_seq_next(struct seq_file *seq, void *v, loff_t *pos) +{ + return locate_hba_next(seq, v, pos); +} + +static void scsi_port_seq_stop(struct seq_file *seq, void *v) +{ + locate_hba_stop(seq, v); +} + +static int scsi_port_seq_show(struct seq_file *seq, void *v) +{ + struct se_hba *hba; + struct se_subsystem_dev *se_dev = list_entry(v, struct se_subsystem_dev, + g_se_dev_list); + struct se_device *dev = se_dev->se_dev_ptr; + struct se_port *sep, *sep_tmp; + + if (list_is_first(&se_dev->g_se_dev_list, &se_global->g_se_dev_list)) + seq_puts(seq, "inst device indx role busy_count\n"); + + if (!(dev)) + return 0; + + hba = dev->se_hba; + if (!(hba)) { + /* Log error ? */ + return 0; + } + + /* FIXME: scsiPortBusyStatuses count */ + spin_lock(&dev->se_port_lock); + list_for_each_entry_safe(sep, sep_tmp, &dev->dev_sep_list, sep_list) { + seq_printf(seq, "%u %u %u %s%u %u\n", hba->hba_index, + dev->dev_index, sep->sep_index, "Device", + dev->dev_index, 0); + } + spin_unlock(&dev->se_port_lock); + + return 0; +} + +static const struct seq_operations scsi_port_seq_ops = { + .start = scsi_port_seq_start, + .next = scsi_port_seq_next, + .stop = scsi_port_seq_stop, + .show = scsi_port_seq_show +}; + +static int scsi_port_seq_open(struct inode *inode, struct file *file) +{ + return seq_open(file, &scsi_port_seq_ops); +} + +static const struct file_operations scsi_port_seq_fops = { + .owner = THIS_MODULE, + .open = scsi_port_seq_open, + .read = seq_read, + .llseek = seq_lseek, + .release = seq_release, +}; + +/* + * SCSI Transport Table + */ +static void *scsi_transport_seq_start(struct seq_file *seq, loff_t *pos) +{ + return locate_hba_start(seq, pos); +} + +static void *scsi_transport_seq_next(struct seq_file *seq, void *v, loff_t *pos) +{ + return locate_hba_next(seq, v, pos); +} + +static void scsi_transport_seq_stop(struct seq_file *seq, void *v) +{ + locate_hba_stop(seq, v); +} + +static int scsi_transport_seq_show(struct seq_file *seq, void *v) +{ + struct se_hba *hba; + struct se_subsystem_dev *se_dev = list_entry(v, struct se_subsystem_dev, + g_se_dev_list); + struct se_device *dev = se_dev->se_dev_ptr; + struct se_port *se, *se_tmp; + struct se_portal_group *tpg; + struct t10_wwn *wwn; + char buf[64]; + + if (list_is_first(&se_dev->g_se_dev_list, &se_global->g_se_dev_list)) + seq_puts(seq, "inst device indx dev_name\n"); + + if (!(dev)) + return 0; + + hba = dev->se_hba; + if (!(hba)) { + /* Log error ? */ + return 0; + } + + wwn = DEV_T10_WWN(dev); + + spin_lock(&dev->se_port_lock); + list_for_each_entry_safe(se, se_tmp, &dev->dev_sep_list, sep_list) { + tpg = se->sep_tpg; + sprintf(buf, "scsiTransport%s", + TPG_TFO(tpg)->get_fabric_name()); + + seq_printf(seq, "%u %s %u %s+%s\n", + hba->hba_index, /* scsiTransportIndex */ + buf, /* scsiTransportType */ + (TPG_TFO(tpg)->tpg_get_inst_index != NULL) ? + TPG_TFO(tpg)->tpg_get_inst_index(tpg) : + 0, + TPG_TFO(tpg)->tpg_get_wwn(tpg), + (strlen(wwn->unit_serial)) ? + /* scsiTransportDevName */ + wwn->unit_serial : wwn->vendor); + } + spin_unlock(&dev->se_port_lock); + + return 0; +} + +static const struct seq_operations scsi_transport_seq_ops = { + .start = scsi_transport_seq_start, + .next = scsi_transport_seq_next, + .stop = scsi_transport_seq_stop, + .show = scsi_transport_seq_show +}; + +static int scsi_transport_seq_open(struct inode *inode, struct file *file) +{ + return seq_open(file, &scsi_transport_seq_ops); +} + +static const struct file_operations scsi_transport_seq_fops = { + .owner = THIS_MODULE, + .open = scsi_transport_seq_open, + .read = seq_read, + .llseek = seq_lseek, + .release = seq_release, +}; + +/* + * SCSI Target Device Table + */ +static void *scsi_tgt_dev_seq_start(struct seq_file *seq, loff_t *pos) +{ + return locate_hba_start(seq, pos); +} + +static void *scsi_tgt_dev_seq_next(struct seq_file *seq, void *v, loff_t *pos) +{ + return locate_hba_next(seq, v, pos); +} + +static void scsi_tgt_dev_seq_stop(struct seq_file *seq, void *v) +{ + locate_hba_stop(seq, v); +} + + +#define LU_COUNT 1 /* for now */ +static int scsi_tgt_dev_seq_show(struct seq_file *seq, void *v) +{ + struct se_hba *hba; + struct se_subsystem_dev *se_dev = list_entry(v, struct se_subsystem_dev, + g_se_dev_list); + struct se_device *dev = se_dev->se_dev_ptr; + int non_accessible_lus = 0; + char status[16]; + + if (list_is_first(&se_dev->g_se_dev_list, &se_global->g_se_dev_list)) + seq_puts(seq, "inst indx num_LUs status non_access_LUs" + " resets\n"); + + if (!(dev)) + return 0; + + hba = dev->se_hba; + if (!(hba)) { + /* Log error ? */ + return 0; + } + + switch (dev->dev_status) { + case TRANSPORT_DEVICE_ACTIVATED: + strcpy(status, "activated"); + break; + case TRANSPORT_DEVICE_DEACTIVATED: + strcpy(status, "deactivated"); + non_accessible_lus = 1; + break; + case TRANSPORT_DEVICE_SHUTDOWN: + strcpy(status, "shutdown"); + non_accessible_lus = 1; + break; + case TRANSPORT_DEVICE_OFFLINE_ACTIVATED: + case TRANSPORT_DEVICE_OFFLINE_DEACTIVATED: + strcpy(status, "offline"); + non_accessible_lus = 1; + break; + default: + sprintf(status, "unknown(%d)", dev->dev_status); + non_accessible_lus = 1; + } + + seq_printf(seq, "%u %u %u %s %u %u\n", + hba->hba_index, dev->dev_index, LU_COUNT, + status, non_accessible_lus, dev->num_resets); + + return 0; +} + +static const struct seq_operations scsi_tgt_dev_seq_ops = { + .start = scsi_tgt_dev_seq_start, + .next = scsi_tgt_dev_seq_next, + .stop = scsi_tgt_dev_seq_stop, + .show = scsi_tgt_dev_seq_show +}; + +static int scsi_tgt_dev_seq_open(struct inode *inode, struct file *file) +{ + return seq_open(file, &scsi_tgt_dev_seq_ops); +} + +static const struct file_operations scsi_tgt_dev_seq_fops = { + .owner = THIS_MODULE, + .open = scsi_tgt_dev_seq_open, + .read = seq_read, + .llseek = seq_lseek, + .release = seq_release, +}; + +/* + * SCSI Target Port Table + */ +static void *scsi_tgt_port_seq_start(struct seq_file *seq, loff_t *pos) +{ + return locate_hba_start(seq, pos); +} + +static void *scsi_tgt_port_seq_next(struct seq_file *seq, void *v, loff_t *pos) +{ + return locate_hba_next(seq, v, pos); +} + +static void scsi_tgt_port_seq_stop(struct seq_file *seq, void *v) +{ + locate_hba_stop(seq, v); +} + +static int scsi_tgt_port_seq_show(struct seq_file *seq, void *v) +{ + struct se_hba *hba; + struct se_subsystem_dev *se_dev = list_entry(v, struct se_subsystem_dev, + g_se_dev_list); + struct se_device *dev = se_dev->se_dev_ptr; + struct se_port *sep, *sep_tmp; + struct se_portal_group *tpg; + u32 rx_mbytes, tx_mbytes; + unsigned long long num_cmds; + char buf[64]; + + if (list_is_first(&se_dev->g_se_dev_list, &se_global->g_se_dev_list)) + seq_puts(seq, "inst device indx name port_index in_cmds" + " write_mbytes read_mbytes hs_in_cmds\n"); + + if (!(dev)) + return 0; + + hba = dev->se_hba; + if (!(hba)) { + /* Log error ? */ + return 0; + } + + spin_lock(&dev->se_port_lock); + list_for_each_entry_safe(sep, sep_tmp, &dev->dev_sep_list, sep_list) { + tpg = sep->sep_tpg; + sprintf(buf, "%sPort#", + TPG_TFO(tpg)->get_fabric_name()); + + seq_printf(seq, "%u %u %u %s%d %s%s%d ", + hba->hba_index, + dev->dev_index, + sep->sep_index, + buf, sep->sep_index, + TPG_TFO(tpg)->tpg_get_wwn(tpg), "+t+", + TPG_TFO(tpg)->tpg_get_tag(tpg)); + + spin_lock(&sep->sep_lun->lun_sep_lock); + num_cmds = sep->sep_stats.cmd_pdus; + rx_mbytes = (sep->sep_stats.rx_data_octets >> 20); + tx_mbytes = (sep->sep_stats.tx_data_octets >> 20); + spin_unlock(&sep->sep_lun->lun_sep_lock); + + seq_printf(seq, "%llu %u %u %u\n", num_cmds, + rx_mbytes, tx_mbytes, 0); + } + spin_unlock(&dev->se_port_lock); + + return 0; +} + +static const struct seq_operations scsi_tgt_port_seq_ops = { + .start = scsi_tgt_port_seq_start, + .next = scsi_tgt_port_seq_next, + .stop = scsi_tgt_port_seq_stop, + .show = scsi_tgt_port_seq_show +}; + +static int scsi_tgt_port_seq_open(struct inode *inode, struct file *file) +{ + return seq_open(file, &scsi_tgt_port_seq_ops); +} + +static const struct file_operations scsi_tgt_port_seq_fops = { + .owner = THIS_MODULE, + .open = scsi_tgt_port_seq_open, + .read = seq_read, + .llseek = seq_lseek, + .release = seq_release, +}; + +/* + * SCSI Authorized Initiator Table: + * It contains the SCSI Initiators authorized to be attached to one of the + * local Target ports. + * Iterates through all active TPGs and extracts the info from the ACLs + */ +static void *scsi_auth_intr_seq_start(struct seq_file *seq, loff_t *pos) +{ + spin_lock_bh(&se_global->se_tpg_lock); + return seq_list_start(&se_global->g_se_tpg_list, *pos); +} + +static void *scsi_auth_intr_seq_next(struct seq_file *seq, void *v, + loff_t *pos) +{ + return seq_list_next(v, &se_global->g_se_tpg_list, pos); +} + +static void scsi_auth_intr_seq_stop(struct seq_file *seq, void *v) +{ + spin_unlock_bh(&se_global->se_tpg_lock); +} + +static int scsi_auth_intr_seq_show(struct seq_file *seq, void *v) +{ + struct se_portal_group *se_tpg = list_entry(v, struct se_portal_group, + se_tpg_list); + struct se_dev_entry *deve; + struct se_lun *lun; + struct se_node_acl *se_nacl; + int j; + + if (list_is_first(&se_tpg->se_tpg_list, + &se_global->g_se_tpg_list)) + seq_puts(seq, "inst dev port indx dev_or_port intr_name " + "map_indx att_count num_cmds read_mbytes " + "write_mbytes hs_num_cmds creation_time row_status\n"); + + if (!(se_tpg)) + return 0; + + spin_lock(&se_tpg->acl_node_lock); + list_for_each_entry(se_nacl, &se_tpg->acl_node_list, acl_list) { + + atomic_inc(&se_nacl->mib_ref_count); + smp_mb__after_atomic_inc(); + spin_unlock(&se_tpg->acl_node_lock); + + spin_lock_irq(&se_nacl->device_list_lock); + for (j = 0; j < TRANSPORT_MAX_LUNS_PER_TPG; j++) { + deve = &se_nacl->device_list[j]; + if (!(deve->lun_flags & + TRANSPORT_LUNFLAGS_INITIATOR_ACCESS) || + (!deve->se_lun)) + continue; + lun = deve->se_lun; + if (!lun->lun_se_dev) + continue; + + seq_printf(seq, "%u %u %u %u %u %s %u %u %u %u %u %u" + " %u %s\n", + /* scsiInstIndex */ + (TPG_TFO(se_tpg)->tpg_get_inst_index != NULL) ? + TPG_TFO(se_tpg)->tpg_get_inst_index(se_tpg) : + 0, + /* scsiDeviceIndex */ + lun->lun_se_dev->dev_index, + /* scsiAuthIntrTgtPortIndex */ + TPG_TFO(se_tpg)->tpg_get_tag(se_tpg), + /* scsiAuthIntrIndex */ + se_nacl->acl_index, + /* scsiAuthIntrDevOrPort */ + 1, + /* scsiAuthIntrName */ + se_nacl->initiatorname[0] ? + se_nacl->initiatorname : NONE, + /* FIXME: scsiAuthIntrLunMapIndex */ + 0, + /* scsiAuthIntrAttachedTimes */ + deve->attach_count, + /* scsiAuthIntrOutCommands */ + deve->total_cmds, + /* scsiAuthIntrReadMegaBytes */ + (u32)(deve->read_bytes >> 20), + /* scsiAuthIntrWrittenMegaBytes */ + (u32)(deve->write_bytes >> 20), + /* FIXME: scsiAuthIntrHSOutCommands */ + 0, + /* scsiAuthIntrLastCreation */ + (u32)(((u32)deve->creation_time - + INITIAL_JIFFIES) * 100 / HZ), + /* FIXME: scsiAuthIntrRowStatus */ + "Ready"); + } + spin_unlock_irq(&se_nacl->device_list_lock); + + spin_lock(&se_tpg->acl_node_lock); + atomic_dec(&se_nacl->mib_ref_count); + smp_mb__after_atomic_dec(); + } + spin_unlock(&se_tpg->acl_node_lock); + + return 0; +} + +static const struct seq_operations scsi_auth_intr_seq_ops = { + .start = scsi_auth_intr_seq_start, + .next = scsi_auth_intr_seq_next, + .stop = scsi_auth_intr_seq_stop, + .show = scsi_auth_intr_seq_show +}; + +static int scsi_auth_intr_seq_open(struct inode *inode, struct file *file) +{ + return seq_open(file, &scsi_auth_intr_seq_ops); +} + +static const struct file_operations scsi_auth_intr_seq_fops = { + .owner = THIS_MODULE, + .open = scsi_auth_intr_seq_open, + .read = seq_read, + .llseek = seq_lseek, + .release = seq_release, +}; + +/* + * SCSI Attached Initiator Port Table: + * It lists the SCSI Initiators attached to one of the local Target ports. + * Iterates through all active TPGs and use active sessions from each TPG + * to list the info fo this table. + */ +static void *scsi_att_intr_port_seq_start(struct seq_file *seq, loff_t *pos) +{ + spin_lock_bh(&se_global->se_tpg_lock); + return seq_list_start(&se_global->g_se_tpg_list, *pos); +} + +static void *scsi_att_intr_port_seq_next(struct seq_file *seq, void *v, + loff_t *pos) +{ + return seq_list_next(v, &se_global->g_se_tpg_list, pos); +} + +static void scsi_att_intr_port_seq_stop(struct seq_file *seq, void *v) +{ + spin_unlock_bh(&se_global->se_tpg_lock); +} + +static int scsi_att_intr_port_seq_show(struct seq_file *seq, void *v) +{ + struct se_portal_group *se_tpg = list_entry(v, struct se_portal_group, + se_tpg_list); + struct se_dev_entry *deve; + struct se_lun *lun; + struct se_node_acl *se_nacl; + struct se_session *se_sess; + unsigned char buf[64]; + int j; + + if (list_is_first(&se_tpg->se_tpg_list, + &se_global->g_se_tpg_list)) + seq_puts(seq, "inst dev port indx port_auth_indx port_name" + " port_ident\n"); + + if (!(se_tpg)) + return 0; + + spin_lock(&se_tpg->session_lock); + list_for_each_entry(se_sess, &se_tpg->tpg_sess_list, sess_list) { + if ((TPG_TFO(se_tpg)->sess_logged_in(se_sess)) || + (!se_sess->se_node_acl) || + (!se_sess->se_node_acl->device_list)) + continue; + + atomic_inc(&se_sess->mib_ref_count); + smp_mb__after_atomic_inc(); + se_nacl = se_sess->se_node_acl; + atomic_inc(&se_nacl->mib_ref_count); + smp_mb__after_atomic_inc(); + spin_unlock(&se_tpg->session_lock); + + spin_lock_irq(&se_nacl->device_list_lock); + for (j = 0; j < TRANSPORT_MAX_LUNS_PER_TPG; j++) { + deve = &se_nacl->device_list[j]; + if (!(deve->lun_flags & + TRANSPORT_LUNFLAGS_INITIATOR_ACCESS) || + (!deve->se_lun)) + continue; + + lun = deve->se_lun; + if (!lun->lun_se_dev) + continue; + + memset(buf, 0, 64); + if (TPG_TFO(se_tpg)->sess_get_initiator_sid != NULL) + TPG_TFO(se_tpg)->sess_get_initiator_sid( + se_sess, (unsigned char *)&buf[0], 64); + + seq_printf(seq, "%u %u %u %u %u %s+i+%s\n", + /* scsiInstIndex */ + (TPG_TFO(se_tpg)->tpg_get_inst_index != NULL) ? + TPG_TFO(se_tpg)->tpg_get_inst_index(se_tpg) : + 0, + /* scsiDeviceIndex */ + lun->lun_se_dev->dev_index, + /* scsiPortIndex */ + TPG_TFO(se_tpg)->tpg_get_tag(se_tpg), + /* scsiAttIntrPortIndex */ + (TPG_TFO(se_tpg)->sess_get_index != NULL) ? + TPG_TFO(se_tpg)->sess_get_index(se_sess) : + 0, + /* scsiAttIntrPortAuthIntrIdx */ + se_nacl->acl_index, + /* scsiAttIntrPortName */ + se_nacl->initiatorname[0] ? + se_nacl->initiatorname : NONE, + /* scsiAttIntrPortIdentifier */ + buf); + } + spin_unlock_irq(&se_nacl->device_list_lock); + + spin_lock(&se_tpg->session_lock); + atomic_dec(&se_nacl->mib_ref_count); + smp_mb__after_atomic_dec(); + atomic_dec(&se_sess->mib_ref_count); + smp_mb__after_atomic_dec(); + } + spin_unlock(&se_tpg->session_lock); + + return 0; +} + +static const struct seq_operations scsi_att_intr_port_seq_ops = { + .start = scsi_att_intr_port_seq_start, + .next = scsi_att_intr_port_seq_next, + .stop = scsi_att_intr_port_seq_stop, + .show = scsi_att_intr_port_seq_show +}; + +static int scsi_att_intr_port_seq_open(struct inode *inode, struct file *file) +{ + return seq_open(file, &scsi_att_intr_port_seq_ops); +} + +static const struct file_operations scsi_att_intr_port_seq_fops = { + .owner = THIS_MODULE, + .open = scsi_att_intr_port_seq_open, + .read = seq_read, + .llseek = seq_lseek, + .release = seq_release, +}; + +/* + * SCSI Logical Unit Table + */ +static void *scsi_lu_seq_start(struct seq_file *seq, loff_t *pos) +{ + return locate_hba_start(seq, pos); +} + +static void *scsi_lu_seq_next(struct seq_file *seq, void *v, loff_t *pos) +{ + return locate_hba_next(seq, v, pos); +} + +static void scsi_lu_seq_stop(struct seq_file *seq, void *v) +{ + locate_hba_stop(seq, v); +} + +#define SCSI_LU_INDEX 1 +static int scsi_lu_seq_show(struct seq_file *seq, void *v) +{ + struct se_hba *hba; + struct se_subsystem_dev *se_dev = list_entry(v, struct se_subsystem_dev, + g_se_dev_list); + struct se_device *dev = se_dev->se_dev_ptr; + int j; + char str[28]; + + if (list_is_first(&se_dev->g_se_dev_list, &se_global->g_se_dev_list)) + seq_puts(seq, "inst dev indx LUN lu_name vend prod rev" + " dev_type status state-bit num_cmds read_mbytes" + " write_mbytes resets full_stat hs_num_cmds creation_time\n"); + + if (!(dev)) + return 0; + + hba = dev->se_hba; + if (!(hba)) { + /* Log error ? */ + return 0; + } + + /* Fix LU state, if we can read it from the device */ + seq_printf(seq, "%u %u %u %llu %s", hba->hba_index, + dev->dev_index, SCSI_LU_INDEX, + (unsigned long long)0, /* FIXME: scsiLuDefaultLun */ + (strlen(DEV_T10_WWN(dev)->unit_serial)) ? + /* scsiLuWwnName */ + (char *)&DEV_T10_WWN(dev)->unit_serial[0] : + "None"); + + memcpy(&str[0], (void *)DEV_T10_WWN(dev), 28); + /* scsiLuVendorId */ + for (j = 0; j < 8; j++) + str[j] = ISPRINT(DEV_T10_WWN(dev)->vendor[j]) ? + DEV_T10_WWN(dev)->vendor[j] : 0x20; + str[8] = 0; + seq_printf(seq, " %s", str); + + /* scsiLuProductId */ + for (j = 0; j < 16; j++) + str[j] = ISPRINT(DEV_T10_WWN(dev)->model[j]) ? + DEV_T10_WWN(dev)->model[j] : 0x20; + str[16] = 0; + seq_printf(seq, " %s", str); + + /* scsiLuRevisionId */ + for (j = 0; j < 4; j++) + str[j] = ISPRINT(DEV_T10_WWN(dev)->revision[j]) ? + DEV_T10_WWN(dev)->revision[j] : 0x20; + str[4] = 0; + seq_printf(seq, " %s", str); + + seq_printf(seq, " %u %s %s %llu %u %u %u %u %u %u\n", + /* scsiLuPeripheralType */ + TRANSPORT(dev)->get_device_type(dev), + (dev->dev_status == TRANSPORT_DEVICE_ACTIVATED) ? + "available" : "notavailable", /* scsiLuStatus */ + "exposed", /* scsiLuState */ + (unsigned long long)dev->num_cmds, + /* scsiLuReadMegaBytes */ + (u32)(dev->read_bytes >> 20), + /* scsiLuWrittenMegaBytes */ + (u32)(dev->write_bytes >> 20), + dev->num_resets, /* scsiLuInResets */ + 0, /* scsiLuOutTaskSetFullStatus */ + 0, /* scsiLuHSInCommands */ + (u32)(((u32)dev->creation_time - INITIAL_JIFFIES) * + 100 / HZ)); + + return 0; +} + +static const struct seq_operations scsi_lu_seq_ops = { + .start = scsi_lu_seq_start, + .next = scsi_lu_seq_next, + .stop = scsi_lu_seq_stop, + .show = scsi_lu_seq_show +}; + +static int scsi_lu_seq_open(struct inode *inode, struct file *file) +{ + return seq_open(file, &scsi_lu_seq_ops); +} + +static const struct file_operations scsi_lu_seq_fops = { + .owner = THIS_MODULE, + .open = scsi_lu_seq_open, + .read = seq_read, + .llseek = seq_lseek, + .release = seq_release, +}; + +/****************************************************************************/ + +/* + * Remove proc fs entries + */ +void remove_scsi_target_mib(void) +{ + remove_proc_entry("scsi_target/mib/scsi_inst", NULL); + remove_proc_entry("scsi_target/mib/scsi_dev", NULL); + remove_proc_entry("scsi_target/mib/scsi_port", NULL); + remove_proc_entry("scsi_target/mib/scsi_transport", NULL); + remove_proc_entry("scsi_target/mib/scsi_tgt_dev", NULL); + remove_proc_entry("scsi_target/mib/scsi_tgt_port", NULL); + remove_proc_entry("scsi_target/mib/scsi_auth_intr", NULL); + remove_proc_entry("scsi_target/mib/scsi_att_intr_port", NULL); + remove_proc_entry("scsi_target/mib/scsi_lu", NULL); + remove_proc_entry("scsi_target/mib", NULL); +} + +/* + * Create proc fs entries for the mib tables + */ +int init_scsi_target_mib(void) +{ + struct proc_dir_entry *dir_entry; + struct proc_dir_entry *scsi_inst_entry; + struct proc_dir_entry *scsi_dev_entry; + struct proc_dir_entry *scsi_port_entry; + struct proc_dir_entry *scsi_transport_entry; + struct proc_dir_entry *scsi_tgt_dev_entry; + struct proc_dir_entry *scsi_tgt_port_entry; + struct proc_dir_entry *scsi_auth_intr_entry; + struct proc_dir_entry *scsi_att_intr_port_entry; + struct proc_dir_entry *scsi_lu_entry; + + dir_entry = proc_mkdir("scsi_target/mib", NULL); + if (!(dir_entry)) { + printk(KERN_ERR "proc_mkdir() failed.\n"); + return -1; + } + + scsi_inst_entry = + create_proc_entry("scsi_target/mib/scsi_inst", 0, NULL); + if (scsi_inst_entry) + scsi_inst_entry->proc_fops = &scsi_inst_seq_fops; + else + goto error; + + scsi_dev_entry = + create_proc_entry("scsi_target/mib/scsi_dev", 0, NULL); + if (scsi_dev_entry) + scsi_dev_entry->proc_fops = &scsi_dev_seq_fops; + else + goto error; + + scsi_port_entry = + create_proc_entry("scsi_target/mib/scsi_port", 0, NULL); + if (scsi_port_entry) + scsi_port_entry->proc_fops = &scsi_port_seq_fops; + else + goto error; + + scsi_transport_entry = + create_proc_entry("scsi_target/mib/scsi_transport", 0, NULL); + if (scsi_transport_entry) + scsi_transport_entry->proc_fops = &scsi_transport_seq_fops; + else + goto error; + + scsi_tgt_dev_entry = + create_proc_entry("scsi_target/mib/scsi_tgt_dev", 0, NULL); + if (scsi_tgt_dev_entry) + scsi_tgt_dev_entry->proc_fops = &scsi_tgt_dev_seq_fops; + else + goto error; + + scsi_tgt_port_entry = + create_proc_entry("scsi_target/mib/scsi_tgt_port", 0, NULL); + if (scsi_tgt_port_entry) + scsi_tgt_port_entry->proc_fops = &scsi_tgt_port_seq_fops; + else + goto error; + + scsi_auth_intr_entry = + create_proc_entry("scsi_target/mib/scsi_auth_intr", 0, NULL); + if (scsi_auth_intr_entry) + scsi_auth_intr_entry->proc_fops = &scsi_auth_intr_seq_fops; + else + goto error; + + scsi_att_intr_port_entry = + create_proc_entry("scsi_target/mib/scsi_att_intr_port", 0, NULL); + if (scsi_att_intr_port_entry) + scsi_att_intr_port_entry->proc_fops = + &scsi_att_intr_port_seq_fops; + else + goto error; + + scsi_lu_entry = create_proc_entry("scsi_target/mib/scsi_lu", 0, NULL); + if (scsi_lu_entry) + scsi_lu_entry->proc_fops = &scsi_lu_seq_fops; + else + goto error; + + return 0; + +error: + printk(KERN_ERR "create_proc_entry() failed.\n"); + remove_scsi_target_mib(); + return -1; +} + +/* + * Initialize the index table for allocating unique row indexes to various mib + * tables + */ +void init_scsi_index_table(void) +{ + memset(&scsi_index_table, 0, sizeof(struct scsi_index_table)); + spin_lock_init(&scsi_index_table.lock); +} + +/* + * Allocate a new row index for the entry type specified + */ +u32 scsi_get_new_index(scsi_index_t type) +{ + u32 new_index; + + if ((type < 0) || (type >= SCSI_INDEX_TYPE_MAX)) { + printk(KERN_ERR "Invalid index type %d\n", type); + return -1; + } + + spin_lock(&scsi_index_table.lock); + new_index = ++scsi_index_table.scsi_mib_index[type]; + if (new_index == 0) + new_index = ++scsi_index_table.scsi_mib_index[type]; + spin_unlock(&scsi_index_table.lock); + + return new_index; +} +EXPORT_SYMBOL(scsi_get_new_index); diff --git a/drivers/target/target_core_mib.h b/drivers/target/target_core_mib.h new file mode 100644 index 000000000000..277204633850 --- /dev/null +++ b/drivers/target/target_core_mib.h @@ -0,0 +1,28 @@ +#ifndef TARGET_CORE_MIB_H +#define TARGET_CORE_MIB_H + +typedef enum { + SCSI_INST_INDEX, + SCSI_DEVICE_INDEX, + SCSI_AUTH_INTR_INDEX, + SCSI_INDEX_TYPE_MAX +} scsi_index_t; + +struct scsi_index_table { + spinlock_t lock; + u32 scsi_mib_index[SCSI_INDEX_TYPE_MAX]; +} ____cacheline_aligned; + +/* SCSI Port stats */ +struct scsi_port_stats { + u64 cmd_pdus; + u64 tx_data_octets; + u64 rx_data_octets; +} ____cacheline_aligned; + +extern int init_scsi_target_mib(void); +extern void remove_scsi_target_mib(void); +extern void init_scsi_index_table(void); +extern u32 scsi_get_new_index(scsi_index_t); + +#endif /*** TARGET_CORE_MIB_H ***/ diff --git a/drivers/target/target_core_pr.c b/drivers/target/target_core_pr.c new file mode 100644 index 000000000000..2521f75362c3 --- /dev/null +++ b/drivers/target/target_core_pr.c @@ -0,0 +1,4252 @@ +/******************************************************************************* + * Filename: target_core_pr.c + * + * This file contains SPC-3 compliant persistent reservations and + * legacy SPC-2 reservations with compatible reservation handling (CRH=1) + * + * Copyright (c) 2009, 2010 Rising Tide Systems + * Copyright (c) 2009, 2010 Linux-iSCSI.org + * + * Nicholas A. Bellinger + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. + * + ******************************************************************************/ + +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include + +#include "target_core_hba.h" +#include "target_core_pr.h" +#include "target_core_ua.h" + +/* + * Used for Specify Initiator Ports Capable Bit (SPEC_I_PT) + */ +struct pr_transport_id_holder { + int dest_local_nexus; + struct t10_pr_registration *dest_pr_reg; + struct se_portal_group *dest_tpg; + struct se_node_acl *dest_node_acl; + struct se_dev_entry *dest_se_deve; + struct list_head dest_list; +}; + +int core_pr_dump_initiator_port( + struct t10_pr_registration *pr_reg, + char *buf, + u32 size) +{ + if (!(pr_reg->isid_present_at_reg)) + return 0; + + snprintf(buf, size, ",i,0x%s", &pr_reg->pr_reg_isid[0]); + return 1; +} + +static void __core_scsi3_complete_pro_release(struct se_device *, struct se_node_acl *, + struct t10_pr_registration *, int); + +static int core_scsi2_reservation_seq_non_holder( + struct se_cmd *cmd, + unsigned char *cdb, + u32 pr_reg_type) +{ + switch (cdb[0]) { + case INQUIRY: + case RELEASE: + case RELEASE_10: + return 0; + default: + return 1; + } + + return 1; +} + +static int core_scsi2_reservation_check(struct se_cmd *cmd, u32 *pr_reg_type) +{ + struct se_device *dev = cmd->se_dev; + struct se_session *sess = cmd->se_sess; + int ret; + + if (!(sess)) + return 0; + + spin_lock(&dev->dev_reservation_lock); + if (!dev->dev_reserved_node_acl || !sess) { + spin_unlock(&dev->dev_reservation_lock); + return 0; + } + if (dev->dev_reserved_node_acl != sess->se_node_acl) { + spin_unlock(&dev->dev_reservation_lock); + return -1; + } + if (!(dev->dev_flags & DF_SPC2_RESERVATIONS_WITH_ISID)) { + spin_unlock(&dev->dev_reservation_lock); + return 0; + } + ret = (dev->dev_res_bin_isid == sess->sess_bin_isid) ? 0 : -1; + spin_unlock(&dev->dev_reservation_lock); + + return ret; +} + +static int core_scsi2_reservation_release(struct se_cmd *cmd) +{ + struct se_device *dev = cmd->se_dev; + struct se_session *sess = cmd->se_sess; + struct se_portal_group *tpg = sess->se_tpg; + + if (!(sess) || !(tpg)) + return 0; + + spin_lock(&dev->dev_reservation_lock); + if (!dev->dev_reserved_node_acl || !sess) { + spin_unlock(&dev->dev_reservation_lock); + return 0; + } + + if (dev->dev_reserved_node_acl != sess->se_node_acl) { + spin_unlock(&dev->dev_reservation_lock); + return 0; + } + dev->dev_reserved_node_acl = NULL; + dev->dev_flags &= ~DF_SPC2_RESERVATIONS; + if (dev->dev_flags & DF_SPC2_RESERVATIONS_WITH_ISID) { + dev->dev_res_bin_isid = 0; + dev->dev_flags &= ~DF_SPC2_RESERVATIONS_WITH_ISID; + } + printk(KERN_INFO "SCSI-2 Released reservation for %s LUN: %u ->" + " MAPPED LUN: %u for %s\n", TPG_TFO(tpg)->get_fabric_name(), + SE_LUN(cmd)->unpacked_lun, cmd->se_deve->mapped_lun, + sess->se_node_acl->initiatorname); + spin_unlock(&dev->dev_reservation_lock); + + return 0; +} + +static int core_scsi2_reservation_reserve(struct se_cmd *cmd) +{ + struct se_device *dev = cmd->se_dev; + struct se_session *sess = cmd->se_sess; + struct se_portal_group *tpg = sess->se_tpg; + + if ((T_TASK(cmd)->t_task_cdb[1] & 0x01) && + (T_TASK(cmd)->t_task_cdb[1] & 0x02)) { + printk(KERN_ERR "LongIO and Obselete Bits set, returning" + " ILLEGAL_REQUEST\n"); + return PYX_TRANSPORT_ILLEGAL_REQUEST; + } + /* + * This is currently the case for target_core_mod passthrough struct se_cmd + * ops + */ + if (!(sess) || !(tpg)) + return 0; + + spin_lock(&dev->dev_reservation_lock); + if (dev->dev_reserved_node_acl && + (dev->dev_reserved_node_acl != sess->se_node_acl)) { + printk(KERN_ERR "SCSI-2 RESERVATION CONFLIFT for %s fabric\n", + TPG_TFO(tpg)->get_fabric_name()); + printk(KERN_ERR "Original reserver LUN: %u %s\n", + SE_LUN(cmd)->unpacked_lun, + dev->dev_reserved_node_acl->initiatorname); + printk(KERN_ERR "Current attempt - LUN: %u -> MAPPED LUN: %u" + " from %s \n", SE_LUN(cmd)->unpacked_lun, + cmd->se_deve->mapped_lun, + sess->se_node_acl->initiatorname); + spin_unlock(&dev->dev_reservation_lock); + return PYX_TRANSPORT_RESERVATION_CONFLICT; + } + + dev->dev_reserved_node_acl = sess->se_node_acl; + dev->dev_flags |= DF_SPC2_RESERVATIONS; + if (sess->sess_bin_isid != 0) { + dev->dev_res_bin_isid = sess->sess_bin_isid; + dev->dev_flags |= DF_SPC2_RESERVATIONS_WITH_ISID; + } + printk(KERN_INFO "SCSI-2 Reserved %s LUN: %u -> MAPPED LUN: %u" + " for %s\n", TPG_TFO(tpg)->get_fabric_name(), + SE_LUN(cmd)->unpacked_lun, cmd->se_deve->mapped_lun, + sess->se_node_acl->initiatorname); + spin_unlock(&dev->dev_reservation_lock); + + return 0; +} + +static struct t10_pr_registration *core_scsi3_locate_pr_reg(struct se_device *, + struct se_node_acl *, struct se_session *); +static void core_scsi3_put_pr_reg(struct t10_pr_registration *); + +/* + * Setup in target_core_transport.c:transport_generic_cmd_sequencer() + * and called via struct se_cmd->transport_emulate_cdb() in TCM processing + * thread context. + */ +int core_scsi2_emulate_crh(struct se_cmd *cmd) +{ + struct se_session *se_sess = cmd->se_sess; + struct se_subsystem_dev *su_dev = cmd->se_dev->se_sub_dev; + struct t10_pr_registration *pr_reg; + struct t10_reservation_template *pr_tmpl = &su_dev->t10_reservation; + unsigned char *cdb = &T_TASK(cmd)->t_task_cdb[0]; + int crh = (T10_RES(su_dev)->res_type == SPC3_PERSISTENT_RESERVATIONS); + int conflict = 0; + + if (!(se_sess)) + return 0; + + if (!(crh)) + goto after_crh; + + pr_reg = core_scsi3_locate_pr_reg(cmd->se_dev, se_sess->se_node_acl, + se_sess); + if (pr_reg) { + /* + * From spc4r17 5.7.3 Exceptions to SPC-2 RESERVE and RELEASE + * behavior + * + * A RESERVE(6) or RESERVE(10) command shall complete with GOOD + * status, but no reservation shall be established and the + * persistent reservation shall not be changed, if the command + * is received from a) and b) below. + * + * A RELEASE(6) or RELEASE(10) command shall complete with GOOD + * status, but the persistent reservation shall not be released, + * if the command is received from a) and b) + * + * a) An I_T nexus that is a persistent reservation holder; or + * b) An I_T nexus that is registered if a registrants only or + * all registrants type persistent reservation is present. + * + * In all other cases, a RESERVE(6) command, RESERVE(10) command, + * RELEASE(6) command, or RELEASE(10) command shall be processed + * as defined in SPC-2. + */ + if (pr_reg->pr_res_holder) { + core_scsi3_put_pr_reg(pr_reg); + return 0; + } + if ((pr_reg->pr_res_type == PR_TYPE_WRITE_EXCLUSIVE_REGONLY) || + (pr_reg->pr_res_type == PR_TYPE_EXCLUSIVE_ACCESS_REGONLY) || + (pr_reg->pr_res_type == PR_TYPE_WRITE_EXCLUSIVE_ALLREG) || + (pr_reg->pr_res_type == PR_TYPE_EXCLUSIVE_ACCESS_ALLREG)) { + core_scsi3_put_pr_reg(pr_reg); + return 0; + } + core_scsi3_put_pr_reg(pr_reg); + conflict = 1; + } else { + /* + * Following spc2r20 5.5.1 Reservations overview: + * + * If a logical unit has executed a PERSISTENT RESERVE OUT + * command with the REGISTER or the REGISTER AND IGNORE + * EXISTING KEY service action and is still registered by any + * initiator, all RESERVE commands and all RELEASE commands + * regardless of initiator shall conflict and shall terminate + * with a RESERVATION CONFLICT status. + */ + spin_lock(&pr_tmpl->registration_lock); + conflict = (list_empty(&pr_tmpl->registration_list)) ? 0 : 1; + spin_unlock(&pr_tmpl->registration_lock); + } + + if (conflict) { + printk(KERN_ERR "Received legacy SPC-2 RESERVE/RELEASE" + " while active SPC-3 registrations exist," + " returning RESERVATION_CONFLICT\n"); + return PYX_TRANSPORT_RESERVATION_CONFLICT; + } + +after_crh: + if ((cdb[0] == RESERVE) || (cdb[0] == RESERVE_10)) + return core_scsi2_reservation_reserve(cmd); + else if ((cdb[0] == RELEASE) || (cdb[0] == RELEASE_10)) + return core_scsi2_reservation_release(cmd); + else + return PYX_TRANSPORT_INVALID_CDB_FIELD; +} + +/* + * Begin SPC-3/SPC-4 Persistent Reservations emulation support + * + * This function is called by those initiator ports who are *NOT* + * the active PR reservation holder when a reservation is present. + */ +static int core_scsi3_pr_seq_non_holder( + struct se_cmd *cmd, + unsigned char *cdb, + u32 pr_reg_type) +{ + struct se_dev_entry *se_deve; + struct se_session *se_sess = SE_SESS(cmd); + int other_cdb = 0, ignore_reg; + int registered_nexus = 0, ret = 1; /* Conflict by default */ + int all_reg = 0, reg_only = 0; /* ALL_REG, REG_ONLY */ + int we = 0; /* Write Exclusive */ + int legacy = 0; /* Act like a legacy device and return + * RESERVATION CONFLICT on some CDBs */ + /* + * A legacy SPC-2 reservation is being held. + */ + if (cmd->se_dev->dev_flags & DF_SPC2_RESERVATIONS) + return core_scsi2_reservation_seq_non_holder(cmd, + cdb, pr_reg_type); + + se_deve = &se_sess->se_node_acl->device_list[cmd->orig_fe_lun]; + /* + * Determine if the registration should be ignored due to + * non-matching ISIDs in core_scsi3_pr_reservation_check(). + */ + ignore_reg = (pr_reg_type & 0x80000000); + if (ignore_reg) + pr_reg_type &= ~0x80000000; + + switch (pr_reg_type) { + case PR_TYPE_WRITE_EXCLUSIVE: + we = 1; + case PR_TYPE_EXCLUSIVE_ACCESS: + /* + * Some commands are only allowed for the persistent reservation + * holder. + */ + if ((se_deve->def_pr_registered) && !(ignore_reg)) + registered_nexus = 1; + break; + case PR_TYPE_WRITE_EXCLUSIVE_REGONLY: + we = 1; + case PR_TYPE_EXCLUSIVE_ACCESS_REGONLY: + /* + * Some commands are only allowed for registered I_T Nexuses. + */ + reg_only = 1; + if ((se_deve->def_pr_registered) && !(ignore_reg)) + registered_nexus = 1; + break; + case PR_TYPE_WRITE_EXCLUSIVE_ALLREG: + we = 1; + case PR_TYPE_EXCLUSIVE_ACCESS_ALLREG: + /* + * Each registered I_T Nexus is a reservation holder. + */ + all_reg = 1; + if ((se_deve->def_pr_registered) && !(ignore_reg)) + registered_nexus = 1; + break; + default: + return -1; + } + /* + * Referenced from spc4r17 table 45 for *NON* PR holder access + */ + switch (cdb[0]) { + case SECURITY_PROTOCOL_IN: + if (registered_nexus) + return 0; + ret = (we) ? 0 : 1; + break; + case MODE_SENSE: + case MODE_SENSE_10: + case READ_ATTRIBUTE: + case READ_BUFFER: + case RECEIVE_DIAGNOSTIC: + if (legacy) { + ret = 1; + break; + } + if (registered_nexus) { + ret = 0; + break; + } + ret = (we) ? 0 : 1; /* Allowed Write Exclusive */ + break; + case PERSISTENT_RESERVE_OUT: + /* + * This follows PERSISTENT_RESERVE_OUT service actions that + * are allowed in the presence of various reservations. + * See spc4r17, table 46 + */ + switch (cdb[1] & 0x1f) { + case PRO_CLEAR: + case PRO_PREEMPT: + case PRO_PREEMPT_AND_ABORT: + ret = (registered_nexus) ? 0 : 1; + break; + case PRO_REGISTER: + case PRO_REGISTER_AND_IGNORE_EXISTING_KEY: + ret = 0; + break; + case PRO_REGISTER_AND_MOVE: + case PRO_RESERVE: + ret = 1; + break; + case PRO_RELEASE: + ret = (registered_nexus) ? 0 : 1; + break; + default: + printk(KERN_ERR "Unknown PERSISTENT_RESERVE_OUT service" + " action: 0x%02x\n", cdb[1] & 0x1f); + return -1; + } + break; + case RELEASE: + case RELEASE_10: + /* Handled by CRH=1 in core_scsi2_emulate_crh() */ + ret = 0; + break; + case RESERVE: + case RESERVE_10: + /* Handled by CRH=1 in core_scsi2_emulate_crh() */ + ret = 0; + break; + case TEST_UNIT_READY: + ret = (legacy) ? 1 : 0; /* Conflict for legacy */ + break; + case MAINTENANCE_IN: + switch (cdb[1] & 0x1f) { + case MI_MANAGEMENT_PROTOCOL_IN: + if (registered_nexus) { + ret = 0; + break; + } + ret = (we) ? 0 : 1; /* Allowed Write Exclusive */ + break; + case MI_REPORT_SUPPORTED_OPERATION_CODES: + case MI_REPORT_SUPPORTED_TASK_MANAGEMENT_FUNCTIONS: + if (legacy) { + ret = 1; + break; + } + if (registered_nexus) { + ret = 0; + break; + } + ret = (we) ? 0 : 1; /* Allowed Write Exclusive */ + break; + case MI_REPORT_ALIASES: + case MI_REPORT_IDENTIFYING_INFORMATION: + case MI_REPORT_PRIORITY: + case MI_REPORT_TARGET_PGS: + case MI_REPORT_TIMESTAMP: + ret = 0; /* Allowed */ + break; + default: + printk(KERN_ERR "Unknown MI Service Action: 0x%02x\n", + (cdb[1] & 0x1f)); + return -1; + } + break; + case ACCESS_CONTROL_IN: + case ACCESS_CONTROL_OUT: + case INQUIRY: + case LOG_SENSE: + case READ_MEDIA_SERIAL_NUMBER: + case REPORT_LUNS: + case REQUEST_SENSE: + ret = 0; /*/ Allowed CDBs */ + break; + default: + other_cdb = 1; + break; + } + /* + * Case where the CDB is explictly allowed in the above switch + * statement. + */ + if (!(ret) && !(other_cdb)) { +#if 0 + printk(KERN_INFO "Allowing explict CDB: 0x%02x for %s" + " reservation holder\n", cdb[0], + core_scsi3_pr_dump_type(pr_reg_type)); +#endif + return ret; + } + /* + * Check if write exclusive initiator ports *NOT* holding the + * WRITE_EXCLUSIVE_* reservation. + */ + if ((we) && !(registered_nexus)) { + if (cmd->data_direction == DMA_TO_DEVICE) { + /* + * Conflict for write exclusive + */ + printk(KERN_INFO "%s Conflict for unregistered nexus" + " %s CDB: 0x%02x to %s reservation\n", + transport_dump_cmd_direction(cmd), + se_sess->se_node_acl->initiatorname, cdb[0], + core_scsi3_pr_dump_type(pr_reg_type)); + return 1; + } else { + /* + * Allow non WRITE CDBs for all Write Exclusive + * PR TYPEs to pass for registered and + * non-registered_nexuxes NOT holding the reservation. + * + * We only make noise for the unregisterd nexuses, + * as we expect registered non-reservation holding + * nexuses to issue CDBs. + */ +#if 0 + if (!(registered_nexus)) { + printk(KERN_INFO "Allowing implict CDB: 0x%02x" + " for %s reservation on unregistered" + " nexus\n", cdb[0], + core_scsi3_pr_dump_type(pr_reg_type)); + } +#endif + return 0; + } + } else if ((reg_only) || (all_reg)) { + if (registered_nexus) { + /* + * For PR_*_REG_ONLY and PR_*_ALL_REG reservations, + * allow commands from registered nexuses. + */ +#if 0 + printk(KERN_INFO "Allowing implict CDB: 0x%02x for %s" + " reservation\n", cdb[0], + core_scsi3_pr_dump_type(pr_reg_type)); +#endif + return 0; + } + } + printk(KERN_INFO "%s Conflict for %sregistered nexus %s CDB: 0x%2x" + " for %s reservation\n", transport_dump_cmd_direction(cmd), + (registered_nexus) ? "" : "un", + se_sess->se_node_acl->initiatorname, cdb[0], + core_scsi3_pr_dump_type(pr_reg_type)); + + return 1; /* Conflict by default */ +} + +static u32 core_scsi3_pr_generation(struct se_device *dev) +{ + struct se_subsystem_dev *su_dev = SU_DEV(dev); + u32 prg; + /* + * PRGeneration field shall contain the value of a 32-bit wrapping + * counter mainted by the device server. + * + * Note that this is done regardless of Active Persist across + * Target PowerLoss (APTPL) + * + * See spc4r17 section 6.3.12 READ_KEYS service action + */ + spin_lock(&dev->dev_reservation_lock); + prg = T10_RES(su_dev)->pr_generation++; + spin_unlock(&dev->dev_reservation_lock); + + return prg; +} + +static int core_scsi3_pr_reservation_check( + struct se_cmd *cmd, + u32 *pr_reg_type) +{ + struct se_device *dev = cmd->se_dev; + struct se_session *sess = cmd->se_sess; + int ret; + + if (!(sess)) + return 0; + /* + * A legacy SPC-2 reservation is being held. + */ + if (dev->dev_flags & DF_SPC2_RESERVATIONS) + return core_scsi2_reservation_check(cmd, pr_reg_type); + + spin_lock(&dev->dev_reservation_lock); + if (!(dev->dev_pr_res_holder)) { + spin_unlock(&dev->dev_reservation_lock); + return 0; + } + *pr_reg_type = dev->dev_pr_res_holder->pr_res_type; + cmd->pr_res_key = dev->dev_pr_res_holder->pr_res_key; + if (dev->dev_pr_res_holder->pr_reg_nacl != sess->se_node_acl) { + spin_unlock(&dev->dev_reservation_lock); + return -1; + } + if (!(dev->dev_pr_res_holder->isid_present_at_reg)) { + spin_unlock(&dev->dev_reservation_lock); + return 0; + } + ret = (dev->dev_pr_res_holder->pr_reg_bin_isid == + sess->sess_bin_isid) ? 0 : -1; + /* + * Use bit in *pr_reg_type to notify ISID mismatch in + * core_scsi3_pr_seq_non_holder(). + */ + if (ret != 0) + *pr_reg_type |= 0x80000000; + spin_unlock(&dev->dev_reservation_lock); + + return ret; +} + +static struct t10_pr_registration *__core_scsi3_do_alloc_registration( + struct se_device *dev, + struct se_node_acl *nacl, + struct se_dev_entry *deve, + unsigned char *isid, + u64 sa_res_key, + int all_tg_pt, + int aptpl) +{ + struct se_subsystem_dev *su_dev = SU_DEV(dev); + struct t10_pr_registration *pr_reg; + + pr_reg = kmem_cache_zalloc(t10_pr_reg_cache, GFP_ATOMIC); + if (!(pr_reg)) { + printk(KERN_ERR "Unable to allocate struct t10_pr_registration\n"); + return NULL; + } + + pr_reg->pr_aptpl_buf = kzalloc(T10_RES(su_dev)->pr_aptpl_buf_len, + GFP_ATOMIC); + if (!(pr_reg->pr_aptpl_buf)) { + printk(KERN_ERR "Unable to allocate pr_reg->pr_aptpl_buf\n"); + kmem_cache_free(t10_pr_reg_cache, pr_reg); + return NULL; + } + + INIT_LIST_HEAD(&pr_reg->pr_reg_list); + INIT_LIST_HEAD(&pr_reg->pr_reg_abort_list); + INIT_LIST_HEAD(&pr_reg->pr_reg_aptpl_list); + INIT_LIST_HEAD(&pr_reg->pr_reg_atp_list); + INIT_LIST_HEAD(&pr_reg->pr_reg_atp_mem_list); + atomic_set(&pr_reg->pr_res_holders, 0); + pr_reg->pr_reg_nacl = nacl; + pr_reg->pr_reg_deve = deve; + pr_reg->pr_res_mapped_lun = deve->mapped_lun; + pr_reg->pr_aptpl_target_lun = deve->se_lun->unpacked_lun; + pr_reg->pr_res_key = sa_res_key; + pr_reg->pr_reg_all_tg_pt = all_tg_pt; + pr_reg->pr_reg_aptpl = aptpl; + pr_reg->pr_reg_tg_pt_lun = deve->se_lun; + /* + * If an ISID value for this SCSI Initiator Port exists, + * save it to the registration now. + */ + if (isid != NULL) { + pr_reg->pr_reg_bin_isid = get_unaligned_be64(isid); + snprintf(pr_reg->pr_reg_isid, PR_REG_ISID_LEN, "%s", isid); + pr_reg->isid_present_at_reg = 1; + } + + return pr_reg; +} + +static int core_scsi3_lunacl_depend_item(struct se_dev_entry *); +static void core_scsi3_lunacl_undepend_item(struct se_dev_entry *); + +/* + * Function used for handling PR registrations for ALL_TG_PT=1 and ALL_TG_PT=0 + * modes. + */ +static struct t10_pr_registration *__core_scsi3_alloc_registration( + struct se_device *dev, + struct se_node_acl *nacl, + struct se_dev_entry *deve, + unsigned char *isid, + u64 sa_res_key, + int all_tg_pt, + int aptpl) +{ + struct se_dev_entry *deve_tmp; + struct se_node_acl *nacl_tmp; + struct se_port *port, *port_tmp; + struct target_core_fabric_ops *tfo = nacl->se_tpg->se_tpg_tfo; + struct t10_pr_registration *pr_reg, *pr_reg_atp, *pr_reg_tmp, *pr_reg_tmp_safe; + int ret; + /* + * Create a registration for the I_T Nexus upon which the + * PROUT REGISTER was received. + */ + pr_reg = __core_scsi3_do_alloc_registration(dev, nacl, deve, isid, + sa_res_key, all_tg_pt, aptpl); + if (!(pr_reg)) + return NULL; + /* + * Return pointer to pr_reg for ALL_TG_PT=0 + */ + if (!(all_tg_pt)) + return pr_reg; + /* + * Create list of matching SCSI Initiator Port registrations + * for ALL_TG_PT=1 + */ + spin_lock(&dev->se_port_lock); + list_for_each_entry_safe(port, port_tmp, &dev->dev_sep_list, sep_list) { + atomic_inc(&port->sep_tg_pt_ref_cnt); + smp_mb__after_atomic_inc(); + spin_unlock(&dev->se_port_lock); + + spin_lock_bh(&port->sep_alua_lock); + list_for_each_entry(deve_tmp, &port->sep_alua_list, + alua_port_list) { + /* + * This pointer will be NULL for demo mode MappedLUNs + * that have not been make explict via a ConfigFS + * MappedLUN group for the SCSI Initiator Node ACL. + */ + if (!(deve_tmp->se_lun_acl)) + continue; + + nacl_tmp = deve_tmp->se_lun_acl->se_lun_nacl; + /* + * Skip the matching struct se_node_acl that is allocated + * above.. + */ + if (nacl == nacl_tmp) + continue; + /* + * Only perform PR registrations for target ports on + * the same fabric module as the REGISTER w/ ALL_TG_PT=1 + * arrived. + */ + if (tfo != nacl_tmp->se_tpg->se_tpg_tfo) + continue; + /* + * Look for a matching Initiator Node ACL in ASCII format + */ + if (strcmp(nacl->initiatorname, nacl_tmp->initiatorname)) + continue; + + atomic_inc(&deve_tmp->pr_ref_count); + smp_mb__after_atomic_inc(); + spin_unlock_bh(&port->sep_alua_lock); + /* + * Grab a configfs group dependency that is released + * for the exception path at label out: below, or upon + * completion of adding ALL_TG_PT=1 registrations in + * __core_scsi3_add_registration() + */ + ret = core_scsi3_lunacl_depend_item(deve_tmp); + if (ret < 0) { + printk(KERN_ERR "core_scsi3_lunacl_depend" + "_item() failed\n"); + atomic_dec(&port->sep_tg_pt_ref_cnt); + smp_mb__after_atomic_dec(); + atomic_dec(&deve_tmp->pr_ref_count); + smp_mb__after_atomic_dec(); + goto out; + } + /* + * Located a matching SCSI Initiator Port on a different + * port, allocate the pr_reg_atp and attach it to the + * pr_reg->pr_reg_atp_list that will be processed once + * the original *pr_reg is processed in + * __core_scsi3_add_registration() + */ + pr_reg_atp = __core_scsi3_do_alloc_registration(dev, + nacl_tmp, deve_tmp, NULL, + sa_res_key, all_tg_pt, aptpl); + if (!(pr_reg_atp)) { + atomic_dec(&port->sep_tg_pt_ref_cnt); + smp_mb__after_atomic_dec(); + atomic_dec(&deve_tmp->pr_ref_count); + smp_mb__after_atomic_dec(); + core_scsi3_lunacl_undepend_item(deve_tmp); + goto out; + } + + list_add_tail(&pr_reg_atp->pr_reg_atp_mem_list, + &pr_reg->pr_reg_atp_list); + spin_lock_bh(&port->sep_alua_lock); + } + spin_unlock_bh(&port->sep_alua_lock); + + spin_lock(&dev->se_port_lock); + atomic_dec(&port->sep_tg_pt_ref_cnt); + smp_mb__after_atomic_dec(); + } + spin_unlock(&dev->se_port_lock); + + return pr_reg; +out: + list_for_each_entry_safe(pr_reg_tmp, pr_reg_tmp_safe, + &pr_reg->pr_reg_atp_list, pr_reg_atp_mem_list) { + list_del(&pr_reg_tmp->pr_reg_atp_mem_list); + core_scsi3_lunacl_undepend_item(pr_reg_tmp->pr_reg_deve); + kmem_cache_free(t10_pr_reg_cache, pr_reg_tmp); + } + kmem_cache_free(t10_pr_reg_cache, pr_reg); + return NULL; +} + +int core_scsi3_alloc_aptpl_registration( + struct t10_reservation_template *pr_tmpl, + u64 sa_res_key, + unsigned char *i_port, + unsigned char *isid, + u32 mapped_lun, + unsigned char *t_port, + u16 tpgt, + u32 target_lun, + int res_holder, + int all_tg_pt, + u8 type) +{ + struct t10_pr_registration *pr_reg; + + if (!(i_port) || !(t_port) || !(sa_res_key)) { + printk(KERN_ERR "Illegal parameters for APTPL registration\n"); + return -1; + } + + pr_reg = kmem_cache_zalloc(t10_pr_reg_cache, GFP_KERNEL); + if (!(pr_reg)) { + printk(KERN_ERR "Unable to allocate struct t10_pr_registration\n"); + return -1; + } + pr_reg->pr_aptpl_buf = kzalloc(pr_tmpl->pr_aptpl_buf_len, GFP_KERNEL); + + INIT_LIST_HEAD(&pr_reg->pr_reg_list); + INIT_LIST_HEAD(&pr_reg->pr_reg_abort_list); + INIT_LIST_HEAD(&pr_reg->pr_reg_aptpl_list); + INIT_LIST_HEAD(&pr_reg->pr_reg_atp_list); + INIT_LIST_HEAD(&pr_reg->pr_reg_atp_mem_list); + atomic_set(&pr_reg->pr_res_holders, 0); + pr_reg->pr_reg_nacl = NULL; + pr_reg->pr_reg_deve = NULL; + pr_reg->pr_res_mapped_lun = mapped_lun; + pr_reg->pr_aptpl_target_lun = target_lun; + pr_reg->pr_res_key = sa_res_key; + pr_reg->pr_reg_all_tg_pt = all_tg_pt; + pr_reg->pr_reg_aptpl = 1; + pr_reg->pr_reg_tg_pt_lun = NULL; + pr_reg->pr_res_scope = 0; /* Always LUN_SCOPE */ + pr_reg->pr_res_type = type; + /* + * If an ISID value had been saved in APTPL metadata for this + * SCSI Initiator Port, restore it now. + */ + if (isid != NULL) { + pr_reg->pr_reg_bin_isid = get_unaligned_be64(isid); + snprintf(pr_reg->pr_reg_isid, PR_REG_ISID_LEN, "%s", isid); + pr_reg->isid_present_at_reg = 1; + } + /* + * Copy the i_port and t_port information from caller. + */ + snprintf(pr_reg->pr_iport, PR_APTPL_MAX_IPORT_LEN, "%s", i_port); + snprintf(pr_reg->pr_tport, PR_APTPL_MAX_TPORT_LEN, "%s", t_port); + pr_reg->pr_reg_tpgt = tpgt; + /* + * Set pr_res_holder from caller, the pr_reg who is the reservation + * holder will get it's pointer set in core_scsi3_aptpl_reserve() once + * the Initiator Node LUN ACL from the fabric module is created for + * this registration. + */ + pr_reg->pr_res_holder = res_holder; + + list_add_tail(&pr_reg->pr_reg_aptpl_list, &pr_tmpl->aptpl_reg_list); + printk(KERN_INFO "SPC-3 PR APTPL Successfully added registration%s from" + " metadata\n", (res_holder) ? "+reservation" : ""); + return 0; +} + +static void core_scsi3_aptpl_reserve( + struct se_device *dev, + struct se_portal_group *tpg, + struct se_node_acl *node_acl, + struct t10_pr_registration *pr_reg) +{ + char i_buf[PR_REG_ISID_ID_LEN]; + int prf_isid; + + memset(i_buf, 0, PR_REG_ISID_ID_LEN); + prf_isid = core_pr_dump_initiator_port(pr_reg, &i_buf[0], + PR_REG_ISID_ID_LEN); + + spin_lock(&dev->dev_reservation_lock); + dev->dev_pr_res_holder = pr_reg; + spin_unlock(&dev->dev_reservation_lock); + + printk(KERN_INFO "SPC-3 PR [%s] Service Action: APTPL RESERVE created" + " new reservation holder TYPE: %s ALL_TG_PT: %d\n", + TPG_TFO(tpg)->get_fabric_name(), + core_scsi3_pr_dump_type(pr_reg->pr_res_type), + (pr_reg->pr_reg_all_tg_pt) ? 1 : 0); + printk(KERN_INFO "SPC-3 PR [%s] RESERVE Node: %s%s\n", + TPG_TFO(tpg)->get_fabric_name(), node_acl->initiatorname, + (prf_isid) ? &i_buf[0] : ""); +} + +static void __core_scsi3_add_registration(struct se_device *, struct se_node_acl *, + struct t10_pr_registration *, int, int); + +static int __core_scsi3_check_aptpl_registration( + struct se_device *dev, + struct se_portal_group *tpg, + struct se_lun *lun, + u32 target_lun, + struct se_node_acl *nacl, + struct se_dev_entry *deve) +{ + struct t10_pr_registration *pr_reg, *pr_reg_tmp; + struct t10_reservation_template *pr_tmpl = &SU_DEV(dev)->t10_reservation; + unsigned char i_port[PR_APTPL_MAX_IPORT_LEN]; + unsigned char t_port[PR_APTPL_MAX_TPORT_LEN]; + u16 tpgt; + + memset(i_port, 0, PR_APTPL_MAX_IPORT_LEN); + memset(t_port, 0, PR_APTPL_MAX_TPORT_LEN); + /* + * Copy Initiator Port information from struct se_node_acl + */ + snprintf(i_port, PR_APTPL_MAX_IPORT_LEN, "%s", nacl->initiatorname); + snprintf(t_port, PR_APTPL_MAX_TPORT_LEN, "%s", + TPG_TFO(tpg)->tpg_get_wwn(tpg)); + tpgt = TPG_TFO(tpg)->tpg_get_tag(tpg); + /* + * Look for the matching registrations+reservation from those + * created from APTPL metadata. Note that multiple registrations + * may exist for fabrics that use ISIDs in their SCSI Initiator Port + * TransportIDs. + */ + spin_lock(&pr_tmpl->aptpl_reg_lock); + list_for_each_entry_safe(pr_reg, pr_reg_tmp, &pr_tmpl->aptpl_reg_list, + pr_reg_aptpl_list) { + if (!(strcmp(pr_reg->pr_iport, i_port)) && + (pr_reg->pr_res_mapped_lun == deve->mapped_lun) && + !(strcmp(pr_reg->pr_tport, t_port)) && + (pr_reg->pr_reg_tpgt == tpgt) && + (pr_reg->pr_aptpl_target_lun == target_lun)) { + + pr_reg->pr_reg_nacl = nacl; + pr_reg->pr_reg_deve = deve; + pr_reg->pr_reg_tg_pt_lun = lun; + + list_del(&pr_reg->pr_reg_aptpl_list); + spin_unlock(&pr_tmpl->aptpl_reg_lock); + /* + * At this point all of the pointers in *pr_reg will + * be setup, so go ahead and add the registration. + */ + + __core_scsi3_add_registration(dev, nacl, pr_reg, 0, 0); + /* + * If this registration is the reservation holder, + * make that happen now.. + */ + if (pr_reg->pr_res_holder) + core_scsi3_aptpl_reserve(dev, tpg, + nacl, pr_reg); + /* + * Reenable pr_aptpl_active to accept new metadata + * updates once the SCSI device is active again.. + */ + spin_lock(&pr_tmpl->aptpl_reg_lock); + pr_tmpl->pr_aptpl_active = 1; + } + } + spin_unlock(&pr_tmpl->aptpl_reg_lock); + + return 0; +} + +int core_scsi3_check_aptpl_registration( + struct se_device *dev, + struct se_portal_group *tpg, + struct se_lun *lun, + struct se_lun_acl *lun_acl) +{ + struct se_subsystem_dev *su_dev = SU_DEV(dev); + struct se_node_acl *nacl = lun_acl->se_lun_nacl; + struct se_dev_entry *deve = &nacl->device_list[lun_acl->mapped_lun]; + + if (T10_RES(su_dev)->res_type != SPC3_PERSISTENT_RESERVATIONS) + return 0; + + return __core_scsi3_check_aptpl_registration(dev, tpg, lun, + lun->unpacked_lun, nacl, deve); +} + +static void __core_scsi3_dump_registration( + struct target_core_fabric_ops *tfo, + struct se_device *dev, + struct se_node_acl *nacl, + struct t10_pr_registration *pr_reg, + int register_type) +{ + struct se_portal_group *se_tpg = nacl->se_tpg; + char i_buf[PR_REG_ISID_ID_LEN]; + int prf_isid; + + memset(&i_buf[0], 0, PR_REG_ISID_ID_LEN); + prf_isid = core_pr_dump_initiator_port(pr_reg, &i_buf[0], + PR_REG_ISID_ID_LEN); + + printk(KERN_INFO "SPC-3 PR [%s] Service Action: REGISTER%s Initiator" + " Node: %s%s\n", tfo->get_fabric_name(), (register_type == 2) ? + "_AND_MOVE" : (register_type == 1) ? + "_AND_IGNORE_EXISTING_KEY" : "", nacl->initiatorname, + (prf_isid) ? i_buf : ""); + printk(KERN_INFO "SPC-3 PR [%s] registration on Target Port: %s,0x%04x\n", + tfo->get_fabric_name(), tfo->tpg_get_wwn(se_tpg), + tfo->tpg_get_tag(se_tpg)); + printk(KERN_INFO "SPC-3 PR [%s] for %s TCM Subsystem %s Object Target" + " Port(s)\n", tfo->get_fabric_name(), + (pr_reg->pr_reg_all_tg_pt) ? "ALL" : "SINGLE", + TRANSPORT(dev)->name); + printk(KERN_INFO "SPC-3 PR [%s] SA Res Key: 0x%016Lx PRgeneration:" + " 0x%08x APTPL: %d\n", tfo->get_fabric_name(), + pr_reg->pr_res_key, pr_reg->pr_res_generation, + pr_reg->pr_reg_aptpl); +} + +/* + * this function can be called with struct se_device->dev_reservation_lock + * when register_move = 1 + */ +static void __core_scsi3_add_registration( + struct se_device *dev, + struct se_node_acl *nacl, + struct t10_pr_registration *pr_reg, + int register_type, + int register_move) +{ + struct se_subsystem_dev *su_dev = SU_DEV(dev); + struct target_core_fabric_ops *tfo = nacl->se_tpg->se_tpg_tfo; + struct t10_pr_registration *pr_reg_tmp, *pr_reg_tmp_safe; + struct t10_reservation_template *pr_tmpl = &SU_DEV(dev)->t10_reservation; + + /* + * Increment PRgeneration counter for struct se_device upon a successful + * REGISTER, see spc4r17 section 6.3.2 READ_KEYS service action + * + * Also, when register_move = 1 for PROUT REGISTER_AND_MOVE service + * action, the struct se_device->dev_reservation_lock will already be held, + * so we do not call core_scsi3_pr_generation() which grabs the lock + * for the REGISTER. + */ + pr_reg->pr_res_generation = (register_move) ? + T10_RES(su_dev)->pr_generation++ : + core_scsi3_pr_generation(dev); + + spin_lock(&pr_tmpl->registration_lock); + list_add_tail(&pr_reg->pr_reg_list, &pr_tmpl->registration_list); + pr_reg->pr_reg_deve->def_pr_registered = 1; + + __core_scsi3_dump_registration(tfo, dev, nacl, pr_reg, register_type); + spin_unlock(&pr_tmpl->registration_lock); + /* + * Skip extra processing for ALL_TG_PT=0 or REGISTER_AND_MOVE. + */ + if (!(pr_reg->pr_reg_all_tg_pt) || (register_move)) + return; + /* + * Walk pr_reg->pr_reg_atp_list and add registrations for ALL_TG_PT=1 + * allocated in __core_scsi3_alloc_registration() + */ + list_for_each_entry_safe(pr_reg_tmp, pr_reg_tmp_safe, + &pr_reg->pr_reg_atp_list, pr_reg_atp_mem_list) { + list_del(&pr_reg_tmp->pr_reg_atp_mem_list); + + pr_reg_tmp->pr_res_generation = core_scsi3_pr_generation(dev); + + spin_lock(&pr_tmpl->registration_lock); + list_add_tail(&pr_reg_tmp->pr_reg_list, + &pr_tmpl->registration_list); + pr_reg_tmp->pr_reg_deve->def_pr_registered = 1; + + __core_scsi3_dump_registration(tfo, dev, + pr_reg_tmp->pr_reg_nacl, pr_reg_tmp, + register_type); + spin_unlock(&pr_tmpl->registration_lock); + /* + * Drop configfs group dependency reference from + * __core_scsi3_alloc_registration() + */ + core_scsi3_lunacl_undepend_item(pr_reg_tmp->pr_reg_deve); + } +} + +static int core_scsi3_alloc_registration( + struct se_device *dev, + struct se_node_acl *nacl, + struct se_dev_entry *deve, + unsigned char *isid, + u64 sa_res_key, + int all_tg_pt, + int aptpl, + int register_type, + int register_move) +{ + struct t10_pr_registration *pr_reg; + + pr_reg = __core_scsi3_alloc_registration(dev, nacl, deve, isid, + sa_res_key, all_tg_pt, aptpl); + if (!(pr_reg)) + return -1; + + __core_scsi3_add_registration(dev, nacl, pr_reg, + register_type, register_move); + return 0; +} + +static struct t10_pr_registration *__core_scsi3_locate_pr_reg( + struct se_device *dev, + struct se_node_acl *nacl, + unsigned char *isid) +{ + struct t10_reservation_template *pr_tmpl = &SU_DEV(dev)->t10_reservation; + struct t10_pr_registration *pr_reg, *pr_reg_tmp; + struct se_portal_group *tpg; + + spin_lock(&pr_tmpl->registration_lock); + list_for_each_entry_safe(pr_reg, pr_reg_tmp, + &pr_tmpl->registration_list, pr_reg_list) { + /* + * First look for a matching struct se_node_acl + */ + if (pr_reg->pr_reg_nacl != nacl) + continue; + + tpg = pr_reg->pr_reg_nacl->se_tpg; + /* + * If this registration does NOT contain a fabric provided + * ISID, then we have found a match. + */ + if (!(pr_reg->isid_present_at_reg)) { + /* + * Determine if this SCSI device server requires that + * SCSI Intiatior TransportID w/ ISIDs is enforced + * for fabric modules (iSCSI) requiring them. + */ + if (TPG_TFO(tpg)->sess_get_initiator_sid != NULL) { + if (DEV_ATTRIB(dev)->enforce_pr_isids) + continue; + } + atomic_inc(&pr_reg->pr_res_holders); + smp_mb__after_atomic_inc(); + spin_unlock(&pr_tmpl->registration_lock); + return pr_reg; + } + /* + * If the *pr_reg contains a fabric defined ISID for multi-value + * SCSI Initiator Port TransportIDs, then we expect a valid + * matching ISID to be provided by the local SCSI Initiator Port. + */ + if (!(isid)) + continue; + if (strcmp(isid, pr_reg->pr_reg_isid)) + continue; + + atomic_inc(&pr_reg->pr_res_holders); + smp_mb__after_atomic_inc(); + spin_unlock(&pr_tmpl->registration_lock); + return pr_reg; + } + spin_unlock(&pr_tmpl->registration_lock); + + return NULL; +} + +static struct t10_pr_registration *core_scsi3_locate_pr_reg( + struct se_device *dev, + struct se_node_acl *nacl, + struct se_session *sess) +{ + struct se_portal_group *tpg = nacl->se_tpg; + unsigned char buf[PR_REG_ISID_LEN], *isid_ptr = NULL; + + if (TPG_TFO(tpg)->sess_get_initiator_sid != NULL) { + memset(&buf[0], 0, PR_REG_ISID_LEN); + TPG_TFO(tpg)->sess_get_initiator_sid(sess, &buf[0], + PR_REG_ISID_LEN); + isid_ptr = &buf[0]; + } + + return __core_scsi3_locate_pr_reg(dev, nacl, isid_ptr); +} + +static void core_scsi3_put_pr_reg(struct t10_pr_registration *pr_reg) +{ + atomic_dec(&pr_reg->pr_res_holders); + smp_mb__after_atomic_dec(); +} + +static int core_scsi3_check_implict_release( + struct se_device *dev, + struct t10_pr_registration *pr_reg) +{ + struct se_node_acl *nacl = pr_reg->pr_reg_nacl; + struct t10_pr_registration *pr_res_holder; + int ret = 0; + + spin_lock(&dev->dev_reservation_lock); + pr_res_holder = dev->dev_pr_res_holder; + if (!(pr_res_holder)) { + spin_unlock(&dev->dev_reservation_lock); + return ret; + } + if (pr_res_holder == pr_reg) { + /* + * Perform an implict RELEASE if the registration that + * is being released is holding the reservation. + * + * From spc4r17, section 5.7.11.1: + * + * e) If the I_T nexus is the persistent reservation holder + * and the persistent reservation is not an all registrants + * type, then a PERSISTENT RESERVE OUT command with REGISTER + * service action or REGISTER AND IGNORE EXISTING KEY + * service action with the SERVICE ACTION RESERVATION KEY + * field set to zero (see 5.7.11.3). + */ + __core_scsi3_complete_pro_release(dev, nacl, pr_reg, 0); + ret = 1; + /* + * For 'All Registrants' reservation types, all existing + * registrations are still processed as reservation holders + * in core_scsi3_pr_seq_non_holder() after the initial + * reservation holder is implictly released here. + */ + } else if (pr_reg->pr_reg_all_tg_pt && + (!strcmp(pr_res_holder->pr_reg_nacl->initiatorname, + pr_reg->pr_reg_nacl->initiatorname)) && + (pr_res_holder->pr_res_key == pr_reg->pr_res_key)) { + printk(KERN_ERR "SPC-3 PR: Unable to perform ALL_TG_PT=1" + " UNREGISTER while existing reservation with matching" + " key 0x%016Lx is present from another SCSI Initiator" + " Port\n", pr_reg->pr_res_key); + ret = -1; + } + spin_unlock(&dev->dev_reservation_lock); + + return ret; +} + +/* + * Called with struct t10_reservation_template->registration_lock held. + */ +static void __core_scsi3_free_registration( + struct se_device *dev, + struct t10_pr_registration *pr_reg, + struct list_head *preempt_and_abort_list, + int dec_holders) +{ + struct target_core_fabric_ops *tfo = + pr_reg->pr_reg_nacl->se_tpg->se_tpg_tfo; + struct t10_reservation_template *pr_tmpl = &SU_DEV(dev)->t10_reservation; + char i_buf[PR_REG_ISID_ID_LEN]; + int prf_isid; + + memset(i_buf, 0, PR_REG_ISID_ID_LEN); + prf_isid = core_pr_dump_initiator_port(pr_reg, &i_buf[0], + PR_REG_ISID_ID_LEN); + + pr_reg->pr_reg_deve->def_pr_registered = 0; + pr_reg->pr_reg_deve->pr_res_key = 0; + list_del(&pr_reg->pr_reg_list); + /* + * Caller accessing *pr_reg using core_scsi3_locate_pr_reg(), + * so call core_scsi3_put_pr_reg() to decrement our reference. + */ + if (dec_holders) + core_scsi3_put_pr_reg(pr_reg); + /* + * Wait until all reference from any other I_T nexuses for this + * *pr_reg have been released. Because list_del() is called above, + * the last core_scsi3_put_pr_reg(pr_reg) will release this reference + * count back to zero, and we release *pr_reg. + */ + while (atomic_read(&pr_reg->pr_res_holders) != 0) { + spin_unlock(&pr_tmpl->registration_lock); + printk("SPC-3 PR [%s] waiting for pr_res_holders\n", + tfo->get_fabric_name()); + cpu_relax(); + spin_lock(&pr_tmpl->registration_lock); + } + + printk(KERN_INFO "SPC-3 PR [%s] Service Action: UNREGISTER Initiator" + " Node: %s%s\n", tfo->get_fabric_name(), + pr_reg->pr_reg_nacl->initiatorname, + (prf_isid) ? &i_buf[0] : ""); + printk(KERN_INFO "SPC-3 PR [%s] for %s TCM Subsystem %s Object Target" + " Port(s)\n", tfo->get_fabric_name(), + (pr_reg->pr_reg_all_tg_pt) ? "ALL" : "SINGLE", + TRANSPORT(dev)->name); + printk(KERN_INFO "SPC-3 PR [%s] SA Res Key: 0x%016Lx PRgeneration:" + " 0x%08x\n", tfo->get_fabric_name(), pr_reg->pr_res_key, + pr_reg->pr_res_generation); + + if (!(preempt_and_abort_list)) { + pr_reg->pr_reg_deve = NULL; + pr_reg->pr_reg_nacl = NULL; + kfree(pr_reg->pr_aptpl_buf); + kmem_cache_free(t10_pr_reg_cache, pr_reg); + return; + } + /* + * For PREEMPT_AND_ABORT, the list of *pr_reg in preempt_and_abort_list + * are released once the ABORT_TASK_SET has completed.. + */ + list_add_tail(&pr_reg->pr_reg_abort_list, preempt_and_abort_list); +} + +void core_scsi3_free_pr_reg_from_nacl( + struct se_device *dev, + struct se_node_acl *nacl) +{ + struct t10_reservation_template *pr_tmpl = &SU_DEV(dev)->t10_reservation; + struct t10_pr_registration *pr_reg, *pr_reg_tmp, *pr_res_holder; + /* + * If the passed se_node_acl matches the reservation holder, + * release the reservation. + */ + spin_lock(&dev->dev_reservation_lock); + pr_res_holder = dev->dev_pr_res_holder; + if ((pr_res_holder != NULL) && + (pr_res_holder->pr_reg_nacl == nacl)) + __core_scsi3_complete_pro_release(dev, nacl, pr_res_holder, 0); + spin_unlock(&dev->dev_reservation_lock); + /* + * Release any registration associated with the struct se_node_acl. + */ + spin_lock(&pr_tmpl->registration_lock); + list_for_each_entry_safe(pr_reg, pr_reg_tmp, + &pr_tmpl->registration_list, pr_reg_list) { + + if (pr_reg->pr_reg_nacl != nacl) + continue; + + __core_scsi3_free_registration(dev, pr_reg, NULL, 0); + } + spin_unlock(&pr_tmpl->registration_lock); +} + +void core_scsi3_free_all_registrations( + struct se_device *dev) +{ + struct t10_reservation_template *pr_tmpl = &SU_DEV(dev)->t10_reservation; + struct t10_pr_registration *pr_reg, *pr_reg_tmp, *pr_res_holder; + + spin_lock(&dev->dev_reservation_lock); + pr_res_holder = dev->dev_pr_res_holder; + if (pr_res_holder != NULL) { + struct se_node_acl *pr_res_nacl = pr_res_holder->pr_reg_nacl; + __core_scsi3_complete_pro_release(dev, pr_res_nacl, + pr_res_holder, 0); + } + spin_unlock(&dev->dev_reservation_lock); + + spin_lock(&pr_tmpl->registration_lock); + list_for_each_entry_safe(pr_reg, pr_reg_tmp, + &pr_tmpl->registration_list, pr_reg_list) { + + __core_scsi3_free_registration(dev, pr_reg, NULL, 0); + } + spin_unlock(&pr_tmpl->registration_lock); + + spin_lock(&pr_tmpl->aptpl_reg_lock); + list_for_each_entry_safe(pr_reg, pr_reg_tmp, &pr_tmpl->aptpl_reg_list, + pr_reg_aptpl_list) { + list_del(&pr_reg->pr_reg_aptpl_list); + kfree(pr_reg->pr_aptpl_buf); + kmem_cache_free(t10_pr_reg_cache, pr_reg); + } + spin_unlock(&pr_tmpl->aptpl_reg_lock); +} + +static int core_scsi3_tpg_depend_item(struct se_portal_group *tpg) +{ + return configfs_depend_item(TPG_TFO(tpg)->tf_subsys, + &tpg->tpg_group.cg_item); +} + +static void core_scsi3_tpg_undepend_item(struct se_portal_group *tpg) +{ + configfs_undepend_item(TPG_TFO(tpg)->tf_subsys, + &tpg->tpg_group.cg_item); + + atomic_dec(&tpg->tpg_pr_ref_count); + smp_mb__after_atomic_dec(); +} + +static int core_scsi3_nodeacl_depend_item(struct se_node_acl *nacl) +{ + struct se_portal_group *tpg = nacl->se_tpg; + + if (nacl->dynamic_node_acl) + return 0; + + return configfs_depend_item(TPG_TFO(tpg)->tf_subsys, + &nacl->acl_group.cg_item); +} + +static void core_scsi3_nodeacl_undepend_item(struct se_node_acl *nacl) +{ + struct se_portal_group *tpg = nacl->se_tpg; + + if (nacl->dynamic_node_acl) { + atomic_dec(&nacl->acl_pr_ref_count); + smp_mb__after_atomic_dec(); + return; + } + + configfs_undepend_item(TPG_TFO(tpg)->tf_subsys, + &nacl->acl_group.cg_item); + + atomic_dec(&nacl->acl_pr_ref_count); + smp_mb__after_atomic_dec(); +} + +static int core_scsi3_lunacl_depend_item(struct se_dev_entry *se_deve) +{ + struct se_lun_acl *lun_acl = se_deve->se_lun_acl; + struct se_node_acl *nacl; + struct se_portal_group *tpg; + /* + * For nacl->dynamic_node_acl=1 + */ + if (!(lun_acl)) + return 0; + + nacl = lun_acl->se_lun_nacl; + tpg = nacl->se_tpg; + + return configfs_depend_item(TPG_TFO(tpg)->tf_subsys, + &lun_acl->se_lun_group.cg_item); +} + +static void core_scsi3_lunacl_undepend_item(struct se_dev_entry *se_deve) +{ + struct se_lun_acl *lun_acl = se_deve->se_lun_acl; + struct se_node_acl *nacl; + struct se_portal_group *tpg; + /* + * For nacl->dynamic_node_acl=1 + */ + if (!(lun_acl)) { + atomic_dec(&se_deve->pr_ref_count); + smp_mb__after_atomic_dec(); + return; + } + nacl = lun_acl->se_lun_nacl; + tpg = nacl->se_tpg; + + configfs_undepend_item(TPG_TFO(tpg)->tf_subsys, + &lun_acl->se_lun_group.cg_item); + + atomic_dec(&se_deve->pr_ref_count); + smp_mb__after_atomic_dec(); +} + +static int core_scsi3_decode_spec_i_port( + struct se_cmd *cmd, + struct se_portal_group *tpg, + unsigned char *l_isid, + u64 sa_res_key, + int all_tg_pt, + int aptpl) +{ + struct se_device *dev = SE_DEV(cmd); + struct se_port *tmp_port; + struct se_portal_group *dest_tpg = NULL, *tmp_tpg; + struct se_session *se_sess = SE_SESS(cmd); + struct se_node_acl *dest_node_acl = NULL; + struct se_dev_entry *dest_se_deve = NULL, *local_se_deve; + struct t10_pr_registration *dest_pr_reg, *local_pr_reg, *pr_reg_e; + struct t10_pr_registration *pr_reg_tmp, *pr_reg_tmp_safe; + struct list_head tid_dest_list; + struct pr_transport_id_holder *tidh_new, *tidh, *tidh_tmp; + struct target_core_fabric_ops *tmp_tf_ops; + unsigned char *buf = (unsigned char *)T_TASK(cmd)->t_task_buf; + unsigned char *ptr, *i_str = NULL, proto_ident, tmp_proto_ident; + char *iport_ptr = NULL, dest_iport[64], i_buf[PR_REG_ISID_ID_LEN]; + u32 tpdl, tid_len = 0; + int ret, dest_local_nexus, prf_isid; + u32 dest_rtpi = 0; + + memset(dest_iport, 0, 64); + INIT_LIST_HEAD(&tid_dest_list); + + local_se_deve = &se_sess->se_node_acl->device_list[cmd->orig_fe_lun]; + /* + * Allocate a struct pr_transport_id_holder and setup the + * local_node_acl and local_se_deve pointers and add to + * struct list_head tid_dest_list for add registration + * processing in the loop of tid_dest_list below. + */ + tidh_new = kzalloc(sizeof(struct pr_transport_id_holder), GFP_KERNEL); + if (!(tidh_new)) { + printk(KERN_ERR "Unable to allocate tidh_new\n"); + return PYX_TRANSPORT_LU_COMM_FAILURE; + } + INIT_LIST_HEAD(&tidh_new->dest_list); + tidh_new->dest_tpg = tpg; + tidh_new->dest_node_acl = se_sess->se_node_acl; + tidh_new->dest_se_deve = local_se_deve; + + local_pr_reg = __core_scsi3_alloc_registration(SE_DEV(cmd), + se_sess->se_node_acl, local_se_deve, l_isid, + sa_res_key, all_tg_pt, aptpl); + if (!(local_pr_reg)) { + kfree(tidh_new); + return PYX_TRANSPORT_LU_COMM_FAILURE; + } + tidh_new->dest_pr_reg = local_pr_reg; + /* + * The local I_T nexus does not hold any configfs dependances, + * so we set tid_h->dest_local_nexus=1 to prevent the + * configfs_undepend_item() calls in the tid_dest_list loops below. + */ + tidh_new->dest_local_nexus = 1; + list_add_tail(&tidh_new->dest_list, &tid_dest_list); + /* + * For a PERSISTENT RESERVE OUT specify initiator ports payload, + * first extract TransportID Parameter Data Length, and make sure + * the value matches up to the SCSI expected data transfer length. + */ + tpdl = (buf[24] & 0xff) << 24; + tpdl |= (buf[25] & 0xff) << 16; + tpdl |= (buf[26] & 0xff) << 8; + tpdl |= buf[27] & 0xff; + + if ((tpdl + 28) != cmd->data_length) { + printk(KERN_ERR "SPC-3 PR: Illegal tpdl: %u + 28 byte header" + " does not equal CDB data_length: %u\n", tpdl, + cmd->data_length); + ret = PYX_TRANSPORT_INVALID_PARAMETER_LIST; + goto out; + } + /* + * Start processing the received transport IDs using the + * receiving I_T Nexus portal's fabric dependent methods to + * obtain the SCSI Initiator Port/Device Identifiers. + */ + ptr = &buf[28]; + + while (tpdl > 0) { + proto_ident = (ptr[0] & 0x0f); + dest_tpg = NULL; + + spin_lock(&dev->se_port_lock); + list_for_each_entry(tmp_port, &dev->dev_sep_list, sep_list) { + tmp_tpg = tmp_port->sep_tpg; + if (!(tmp_tpg)) + continue; + tmp_tf_ops = TPG_TFO(tmp_tpg); + if (!(tmp_tf_ops)) + continue; + if (!(tmp_tf_ops->get_fabric_proto_ident) || + !(tmp_tf_ops->tpg_parse_pr_out_transport_id)) + continue; + /* + * Look for the matching proto_ident provided by + * the received TransportID + */ + tmp_proto_ident = tmp_tf_ops->get_fabric_proto_ident(tmp_tpg); + if (tmp_proto_ident != proto_ident) + continue; + dest_rtpi = tmp_port->sep_rtpi; + + i_str = tmp_tf_ops->tpg_parse_pr_out_transport_id( + tmp_tpg, (const char *)ptr, &tid_len, + &iport_ptr); + if (!(i_str)) + continue; + + atomic_inc(&tmp_tpg->tpg_pr_ref_count); + smp_mb__after_atomic_inc(); + spin_unlock(&dev->se_port_lock); + + ret = core_scsi3_tpg_depend_item(tmp_tpg); + if (ret != 0) { + printk(KERN_ERR " core_scsi3_tpg_depend_item()" + " for tmp_tpg\n"); + atomic_dec(&tmp_tpg->tpg_pr_ref_count); + smp_mb__after_atomic_dec(); + ret = PYX_TRANSPORT_LU_COMM_FAILURE; + goto out; + } + /* + * Locate the desination initiator ACL to be registered + * from the decoded fabric module specific TransportID + * at *i_str. + */ + spin_lock_bh(&tmp_tpg->acl_node_lock); + dest_node_acl = __core_tpg_get_initiator_node_acl( + tmp_tpg, i_str); + if (dest_node_acl) { + atomic_inc(&dest_node_acl->acl_pr_ref_count); + smp_mb__after_atomic_inc(); + } + spin_unlock_bh(&tmp_tpg->acl_node_lock); + + if (!(dest_node_acl)) { + core_scsi3_tpg_undepend_item(tmp_tpg); + spin_lock(&dev->se_port_lock); + continue; + } + + ret = core_scsi3_nodeacl_depend_item(dest_node_acl); + if (ret != 0) { + printk(KERN_ERR "configfs_depend_item() failed" + " for dest_node_acl->acl_group\n"); + atomic_dec(&dest_node_acl->acl_pr_ref_count); + smp_mb__after_atomic_dec(); + core_scsi3_tpg_undepend_item(tmp_tpg); + ret = PYX_TRANSPORT_LU_COMM_FAILURE; + goto out; + } + + dest_tpg = tmp_tpg; + printk(KERN_INFO "SPC-3 PR SPEC_I_PT: Located %s Node:" + " %s Port RTPI: %hu\n", + TPG_TFO(dest_tpg)->get_fabric_name(), + dest_node_acl->initiatorname, dest_rtpi); + + spin_lock(&dev->se_port_lock); + break; + } + spin_unlock(&dev->se_port_lock); + + if (!(dest_tpg)) { + printk(KERN_ERR "SPC-3 PR SPEC_I_PT: Unable to locate" + " dest_tpg\n"); + ret = PYX_TRANSPORT_INVALID_PARAMETER_LIST; + goto out; + } +#if 0 + printk("SPC-3 PR SPEC_I_PT: Got %s data_length: %u tpdl: %u" + " tid_len: %d for %s + %s\n", + TPG_TFO(dest_tpg)->get_fabric_name(), cmd->data_length, + tpdl, tid_len, i_str, iport_ptr); +#endif + if (tid_len > tpdl) { + printk(KERN_ERR "SPC-3 PR SPEC_I_PT: Illegal tid_len:" + " %u for Transport ID: %s\n", tid_len, ptr); + core_scsi3_nodeacl_undepend_item(dest_node_acl); + core_scsi3_tpg_undepend_item(dest_tpg); + ret = PYX_TRANSPORT_INVALID_PARAMETER_LIST; + goto out; + } + /* + * Locate the desintation struct se_dev_entry pointer for matching + * RELATIVE TARGET PORT IDENTIFIER on the receiving I_T Nexus + * Target Port. + */ + dest_se_deve = core_get_se_deve_from_rtpi(dest_node_acl, + dest_rtpi); + if (!(dest_se_deve)) { + printk(KERN_ERR "Unable to locate %s dest_se_deve" + " from destination RTPI: %hu\n", + TPG_TFO(dest_tpg)->get_fabric_name(), + dest_rtpi); + + core_scsi3_nodeacl_undepend_item(dest_node_acl); + core_scsi3_tpg_undepend_item(dest_tpg); + ret = PYX_TRANSPORT_INVALID_PARAMETER_LIST; + goto out; + } + + ret = core_scsi3_lunacl_depend_item(dest_se_deve); + if (ret < 0) { + printk(KERN_ERR "core_scsi3_lunacl_depend_item()" + " failed\n"); + atomic_dec(&dest_se_deve->pr_ref_count); + smp_mb__after_atomic_dec(); + core_scsi3_nodeacl_undepend_item(dest_node_acl); + core_scsi3_tpg_undepend_item(dest_tpg); + ret = PYX_TRANSPORT_LU_COMM_FAILURE; + goto out; + } +#if 0 + printk(KERN_INFO "SPC-3 PR SPEC_I_PT: Located %s Node: %s" + " dest_se_deve mapped_lun: %u\n", + TPG_TFO(dest_tpg)->get_fabric_name(), + dest_node_acl->initiatorname, dest_se_deve->mapped_lun); +#endif + /* + * Skip any TransportIDs that already have a registration for + * this target port. + */ + pr_reg_e = __core_scsi3_locate_pr_reg(dev, dest_node_acl, + iport_ptr); + if (pr_reg_e) { + core_scsi3_put_pr_reg(pr_reg_e); + core_scsi3_lunacl_undepend_item(dest_se_deve); + core_scsi3_nodeacl_undepend_item(dest_node_acl); + core_scsi3_tpg_undepend_item(dest_tpg); + ptr += tid_len; + tpdl -= tid_len; + tid_len = 0; + continue; + } + /* + * Allocate a struct pr_transport_id_holder and setup + * the dest_node_acl and dest_se_deve pointers for the + * loop below. + */ + tidh_new = kzalloc(sizeof(struct pr_transport_id_holder), + GFP_KERNEL); + if (!(tidh_new)) { + printk(KERN_ERR "Unable to allocate tidh_new\n"); + core_scsi3_lunacl_undepend_item(dest_se_deve); + core_scsi3_nodeacl_undepend_item(dest_node_acl); + core_scsi3_tpg_undepend_item(dest_tpg); + ret = PYX_TRANSPORT_LU_COMM_FAILURE; + goto out; + } + INIT_LIST_HEAD(&tidh_new->dest_list); + tidh_new->dest_tpg = dest_tpg; + tidh_new->dest_node_acl = dest_node_acl; + tidh_new->dest_se_deve = dest_se_deve; + + /* + * Allocate, but do NOT add the registration for the + * TransportID referenced SCSI Initiator port. This + * done because of the following from spc4r17 in section + * 6.14.3 wrt SPEC_I_PT: + * + * "If a registration fails for any initiator port (e.g., if th + * logical unit does not have enough resources available to + * hold the registration information), no registrations shall be + * made, and the command shall be terminated with + * CHECK CONDITION status." + * + * That means we call __core_scsi3_alloc_registration() here, + * and then call __core_scsi3_add_registration() in the + * 2nd loop which will never fail. + */ + dest_pr_reg = __core_scsi3_alloc_registration(SE_DEV(cmd), + dest_node_acl, dest_se_deve, iport_ptr, + sa_res_key, all_tg_pt, aptpl); + if (!(dest_pr_reg)) { + core_scsi3_lunacl_undepend_item(dest_se_deve); + core_scsi3_nodeacl_undepend_item(dest_node_acl); + core_scsi3_tpg_undepend_item(dest_tpg); + kfree(tidh_new); + ret = PYX_TRANSPORT_INVALID_PARAMETER_LIST; + goto out; + } + tidh_new->dest_pr_reg = dest_pr_reg; + list_add_tail(&tidh_new->dest_list, &tid_dest_list); + + ptr += tid_len; + tpdl -= tid_len; + tid_len = 0; + + } + /* + * Go ahead and create a registrations from tid_dest_list for the + * SPEC_I_PT provided TransportID for the *tidh referenced dest_node_acl + * and dest_se_deve. + * + * The SA Reservation Key from the PROUT is set for the + * registration, and ALL_TG_PT is also passed. ALL_TG_PT=1 + * means that the TransportID Initiator port will be + * registered on all of the target ports in the SCSI target device + * ALL_TG_PT=0 means the registration will only be for the + * SCSI target port the PROUT REGISTER with SPEC_I_PT=1 + * was received. + */ + list_for_each_entry_safe(tidh, tidh_tmp, &tid_dest_list, dest_list) { + dest_tpg = tidh->dest_tpg; + dest_node_acl = tidh->dest_node_acl; + dest_se_deve = tidh->dest_se_deve; + dest_pr_reg = tidh->dest_pr_reg; + dest_local_nexus = tidh->dest_local_nexus; + + list_del(&tidh->dest_list); + kfree(tidh); + + memset(i_buf, 0, PR_REG_ISID_ID_LEN); + prf_isid = core_pr_dump_initiator_port(dest_pr_reg, &i_buf[0], + PR_REG_ISID_ID_LEN); + + __core_scsi3_add_registration(SE_DEV(cmd), dest_node_acl, + dest_pr_reg, 0, 0); + + printk(KERN_INFO "SPC-3 PR [%s] SPEC_I_PT: Successfully" + " registered Transport ID for Node: %s%s Mapped LUN:" + " %u\n", TPG_TFO(dest_tpg)->get_fabric_name(), + dest_node_acl->initiatorname, (prf_isid) ? + &i_buf[0] : "", dest_se_deve->mapped_lun); + + if (dest_local_nexus) + continue; + + core_scsi3_lunacl_undepend_item(dest_se_deve); + core_scsi3_nodeacl_undepend_item(dest_node_acl); + core_scsi3_tpg_undepend_item(dest_tpg); + } + + return 0; +out: + /* + * For the failure case, release everything from tid_dest_list + * including *dest_pr_reg and the configfs dependances.. + */ + list_for_each_entry_safe(tidh, tidh_tmp, &tid_dest_list, dest_list) { + dest_tpg = tidh->dest_tpg; + dest_node_acl = tidh->dest_node_acl; + dest_se_deve = tidh->dest_se_deve; + dest_pr_reg = tidh->dest_pr_reg; + dest_local_nexus = tidh->dest_local_nexus; + + list_del(&tidh->dest_list); + kfree(tidh); + /* + * Release any extra ALL_TG_PT=1 registrations for + * the SPEC_I_PT=1 case. + */ + list_for_each_entry_safe(pr_reg_tmp, pr_reg_tmp_safe, + &dest_pr_reg->pr_reg_atp_list, + pr_reg_atp_mem_list) { + list_del(&pr_reg_tmp->pr_reg_atp_mem_list); + core_scsi3_lunacl_undepend_item(pr_reg_tmp->pr_reg_deve); + kmem_cache_free(t10_pr_reg_cache, pr_reg_tmp); + } + + kfree(dest_pr_reg->pr_aptpl_buf); + kmem_cache_free(t10_pr_reg_cache, dest_pr_reg); + + if (dest_local_nexus) + continue; + + core_scsi3_lunacl_undepend_item(dest_se_deve); + core_scsi3_nodeacl_undepend_item(dest_node_acl); + core_scsi3_tpg_undepend_item(dest_tpg); + } + return ret; +} + +/* + * Called with struct se_device->dev_reservation_lock held + */ +static int __core_scsi3_update_aptpl_buf( + struct se_device *dev, + unsigned char *buf, + u32 pr_aptpl_buf_len, + int clear_aptpl_metadata) +{ + struct se_lun *lun; + struct se_portal_group *tpg; + struct se_subsystem_dev *su_dev = SU_DEV(dev); + struct t10_pr_registration *pr_reg; + unsigned char tmp[512], isid_buf[32]; + ssize_t len = 0; + int reg_count = 0; + + memset(buf, 0, pr_aptpl_buf_len); + /* + * Called to clear metadata once APTPL has been deactivated. + */ + if (clear_aptpl_metadata) { + snprintf(buf, pr_aptpl_buf_len, + "No Registrations or Reservations\n"); + return 0; + } + /* + * Walk the registration list.. + */ + spin_lock(&T10_RES(su_dev)->registration_lock); + list_for_each_entry(pr_reg, &T10_RES(su_dev)->registration_list, + pr_reg_list) { + + tmp[0] = '\0'; + isid_buf[0] = '\0'; + tpg = pr_reg->pr_reg_nacl->se_tpg; + lun = pr_reg->pr_reg_tg_pt_lun; + /* + * Write out any ISID value to APTPL metadata that was included + * in the original registration. + */ + if (pr_reg->isid_present_at_reg) + snprintf(isid_buf, 32, "initiator_sid=%s\n", + pr_reg->pr_reg_isid); + /* + * Include special metadata if the pr_reg matches the + * reservation holder. + */ + if (dev->dev_pr_res_holder == pr_reg) { + snprintf(tmp, 512, "PR_REG_START: %d" + "\ninitiator_fabric=%s\n" + "initiator_node=%s\n%s" + "sa_res_key=%llu\n" + "res_holder=1\nres_type=%02x\n" + "res_scope=%02x\nres_all_tg_pt=%d\n" + "mapped_lun=%u\n", reg_count, + TPG_TFO(tpg)->get_fabric_name(), + pr_reg->pr_reg_nacl->initiatorname, isid_buf, + pr_reg->pr_res_key, pr_reg->pr_res_type, + pr_reg->pr_res_scope, pr_reg->pr_reg_all_tg_pt, + pr_reg->pr_res_mapped_lun); + } else { + snprintf(tmp, 512, "PR_REG_START: %d\n" + "initiator_fabric=%s\ninitiator_node=%s\n%s" + "sa_res_key=%llu\nres_holder=0\n" + "res_all_tg_pt=%d\nmapped_lun=%u\n", + reg_count, TPG_TFO(tpg)->get_fabric_name(), + pr_reg->pr_reg_nacl->initiatorname, isid_buf, + pr_reg->pr_res_key, pr_reg->pr_reg_all_tg_pt, + pr_reg->pr_res_mapped_lun); + } + + if ((len + strlen(tmp) > pr_aptpl_buf_len)) { + printk(KERN_ERR "Unable to update renaming" + " APTPL metadata\n"); + spin_unlock(&T10_RES(su_dev)->registration_lock); + return -1; + } + len += sprintf(buf+len, "%s", tmp); + + /* + * Include information about the associated SCSI target port. + */ + snprintf(tmp, 512, "target_fabric=%s\ntarget_node=%s\n" + "tpgt=%hu\nport_rtpi=%hu\ntarget_lun=%u\nPR_REG_END:" + " %d\n", TPG_TFO(tpg)->get_fabric_name(), + TPG_TFO(tpg)->tpg_get_wwn(tpg), + TPG_TFO(tpg)->tpg_get_tag(tpg), + lun->lun_sep->sep_rtpi, lun->unpacked_lun, reg_count); + + if ((len + strlen(tmp) > pr_aptpl_buf_len)) { + printk(KERN_ERR "Unable to update renaming" + " APTPL metadata\n"); + spin_unlock(&T10_RES(su_dev)->registration_lock); + return -1; + } + len += sprintf(buf+len, "%s", tmp); + reg_count++; + } + spin_unlock(&T10_RES(su_dev)->registration_lock); + + if (!(reg_count)) + len += sprintf(buf+len, "No Registrations or Reservations"); + + return 0; +} + +static int core_scsi3_update_aptpl_buf( + struct se_device *dev, + unsigned char *buf, + u32 pr_aptpl_buf_len, + int clear_aptpl_metadata) +{ + int ret; + + spin_lock(&dev->dev_reservation_lock); + ret = __core_scsi3_update_aptpl_buf(dev, buf, pr_aptpl_buf_len, + clear_aptpl_metadata); + spin_unlock(&dev->dev_reservation_lock); + + return ret; +} + +/* + * Called with struct se_device->aptpl_file_mutex held + */ +static int __core_scsi3_write_aptpl_to_file( + struct se_device *dev, + unsigned char *buf, + u32 pr_aptpl_buf_len) +{ + struct t10_wwn *wwn = &SU_DEV(dev)->t10_wwn; + struct file *file; + struct iovec iov[1]; + mm_segment_t old_fs; + int flags = O_RDWR | O_CREAT | O_TRUNC; + char path[512]; + int ret; + + memset(iov, 0, sizeof(struct iovec)); + memset(path, 0, 512); + + if (strlen(&wwn->unit_serial[0]) > 512) { + printk(KERN_ERR "WWN value for struct se_device does not fit" + " into path buffer\n"); + return -1; + } + + snprintf(path, 512, "/var/target/pr/aptpl_%s", &wwn->unit_serial[0]); + file = filp_open(path, flags, 0600); + if (IS_ERR(file) || !file || !file->f_dentry) { + printk(KERN_ERR "filp_open(%s) for APTPL metadata" + " failed\n", path); + return -1; + } + + iov[0].iov_base = &buf[0]; + if (!(pr_aptpl_buf_len)) + iov[0].iov_len = (strlen(&buf[0]) + 1); /* Add extra for NULL */ + else + iov[0].iov_len = pr_aptpl_buf_len; + + old_fs = get_fs(); + set_fs(get_ds()); + ret = vfs_writev(file, &iov[0], 1, &file->f_pos); + set_fs(old_fs); + + if (ret < 0) { + printk("Error writing APTPL metadata file: %s\n", path); + filp_close(file, NULL); + return -1; + } + filp_close(file, NULL); + + return 0; +} + +static int core_scsi3_update_and_write_aptpl( + struct se_device *dev, + unsigned char *in_buf, + u32 in_pr_aptpl_buf_len) +{ + unsigned char null_buf[64], *buf; + u32 pr_aptpl_buf_len; + int ret, clear_aptpl_metadata = 0; + /* + * Can be called with a NULL pointer from PROUT service action CLEAR + */ + if (!(in_buf)) { + memset(null_buf, 0, 64); + buf = &null_buf[0]; + /* + * This will clear the APTPL metadata to: + * "No Registrations or Reservations" status + */ + pr_aptpl_buf_len = 64; + clear_aptpl_metadata = 1; + } else { + buf = in_buf; + pr_aptpl_buf_len = in_pr_aptpl_buf_len; + } + + ret = core_scsi3_update_aptpl_buf(dev, buf, pr_aptpl_buf_len, + clear_aptpl_metadata); + if (ret != 0) + return -1; + /* + * __core_scsi3_write_aptpl_to_file() will call strlen() + * on the passed buf to determine pr_aptpl_buf_len. + */ + ret = __core_scsi3_write_aptpl_to_file(dev, buf, 0); + if (ret != 0) + return -1; + + return ret; +} + +static int core_scsi3_emulate_pro_register( + struct se_cmd *cmd, + u64 res_key, + u64 sa_res_key, + int aptpl, + int all_tg_pt, + int spec_i_pt, + int ignore_key) +{ + struct se_session *se_sess = SE_SESS(cmd); + struct se_device *dev = SE_DEV(cmd); + struct se_dev_entry *se_deve; + struct se_lun *se_lun = SE_LUN(cmd); + struct se_portal_group *se_tpg; + struct t10_pr_registration *pr_reg, *pr_reg_p, *pr_reg_tmp, *pr_reg_e; + struct t10_reservation_template *pr_tmpl = &SU_DEV(dev)->t10_reservation; + /* Used for APTPL metadata w/ UNREGISTER */ + unsigned char *pr_aptpl_buf = NULL; + unsigned char isid_buf[PR_REG_ISID_LEN], *isid_ptr = NULL; + int pr_holder = 0, ret = 0, type; + + if (!(se_sess) || !(se_lun)) { + printk(KERN_ERR "SPC-3 PR: se_sess || struct se_lun is NULL!\n"); + return PYX_TRANSPORT_LU_COMM_FAILURE; + } + se_tpg = se_sess->se_tpg; + se_deve = &se_sess->se_node_acl->device_list[cmd->orig_fe_lun]; + + if (TPG_TFO(se_tpg)->sess_get_initiator_sid != NULL) { + memset(&isid_buf[0], 0, PR_REG_ISID_LEN); + TPG_TFO(se_tpg)->sess_get_initiator_sid(se_sess, &isid_buf[0], + PR_REG_ISID_LEN); + isid_ptr = &isid_buf[0]; + } + /* + * Follow logic from spc4r17 Section 5.7.7, Register Behaviors Table 47 + */ + pr_reg_e = core_scsi3_locate_pr_reg(dev, se_sess->se_node_acl, se_sess); + if (!(pr_reg_e)) { + if (res_key) { + printk(KERN_WARNING "SPC-3 PR: Reservation Key non-zero" + " for SA REGISTER, returning CONFLICT\n"); + return PYX_TRANSPORT_RESERVATION_CONFLICT; + } + /* + * Do nothing but return GOOD status. + */ + if (!(sa_res_key)) + return PYX_TRANSPORT_SENT_TO_TRANSPORT; + + if (!(spec_i_pt)) { + /* + * Perform the Service Action REGISTER on the Initiator + * Port Endpoint that the PRO was received from on the + * Logical Unit of the SCSI device server. + */ + ret = core_scsi3_alloc_registration(SE_DEV(cmd), + se_sess->se_node_acl, se_deve, isid_ptr, + sa_res_key, all_tg_pt, aptpl, + ignore_key, 0); + if (ret != 0) { + printk(KERN_ERR "Unable to allocate" + " struct t10_pr_registration\n"); + return PYX_TRANSPORT_INVALID_PARAMETER_LIST; + } + } else { + /* + * Register both the Initiator port that received + * PROUT SA REGISTER + SPEC_I_PT=1 and extract SCSI + * TransportID from Parameter list and loop through + * fabric dependent parameter list while calling + * logic from of core_scsi3_alloc_registration() for + * each TransportID provided SCSI Initiator Port/Device + */ + ret = core_scsi3_decode_spec_i_port(cmd, se_tpg, + isid_ptr, sa_res_key, all_tg_pt, aptpl); + if (ret != 0) + return ret; + } + /* + * Nothing left to do for the APTPL=0 case. + */ + if (!(aptpl)) { + pr_tmpl->pr_aptpl_active = 0; + core_scsi3_update_and_write_aptpl(SE_DEV(cmd), NULL, 0); + printk("SPC-3 PR: Set APTPL Bit Deactivated for" + " REGISTER\n"); + return 0; + } + /* + * Locate the newly allocated local I_T Nexus *pr_reg, and + * update the APTPL metadata information using its + * preallocated *pr_reg->pr_aptpl_buf. + */ + pr_reg = core_scsi3_locate_pr_reg(SE_DEV(cmd), + se_sess->se_node_acl, se_sess); + + ret = core_scsi3_update_and_write_aptpl(SE_DEV(cmd), + &pr_reg->pr_aptpl_buf[0], + pr_tmpl->pr_aptpl_buf_len); + if (!(ret)) { + pr_tmpl->pr_aptpl_active = 1; + printk("SPC-3 PR: Set APTPL Bit Activated for REGISTER\n"); + } + + core_scsi3_put_pr_reg(pr_reg); + return ret; + } else { + /* + * Locate the existing *pr_reg via struct se_node_acl pointers + */ + pr_reg = pr_reg_e; + type = pr_reg->pr_res_type; + + if (!(ignore_key)) { + if (res_key != pr_reg->pr_res_key) { + printk(KERN_ERR "SPC-3 PR REGISTER: Received" + " res_key: 0x%016Lx does not match" + " existing SA REGISTER res_key:" + " 0x%016Lx\n", res_key, + pr_reg->pr_res_key); + core_scsi3_put_pr_reg(pr_reg); + return PYX_TRANSPORT_RESERVATION_CONFLICT; + } + } + if (spec_i_pt) { + printk(KERN_ERR "SPC-3 PR UNREGISTER: SPEC_I_PT" + " set while sa_res_key=0\n"); + core_scsi3_put_pr_reg(pr_reg); + return PYX_TRANSPORT_INVALID_PARAMETER_LIST; + } + /* + * An existing ALL_TG_PT=1 registration being released + * must also set ALL_TG_PT=1 in the incoming PROUT. + */ + if (pr_reg->pr_reg_all_tg_pt && !(all_tg_pt)) { + printk(KERN_ERR "SPC-3 PR UNREGISTER: ALL_TG_PT=1" + " registration exists, but ALL_TG_PT=1 bit not" + " present in received PROUT\n"); + core_scsi3_put_pr_reg(pr_reg); + return PYX_TRANSPORT_INVALID_CDB_FIELD; + } + /* + * Allocate APTPL metadata buffer used for UNREGISTER ops + */ + if (aptpl) { + pr_aptpl_buf = kzalloc(pr_tmpl->pr_aptpl_buf_len, + GFP_KERNEL); + if (!(pr_aptpl_buf)) { + printk(KERN_ERR "Unable to allocate" + " pr_aptpl_buf\n"); + core_scsi3_put_pr_reg(pr_reg); + return PYX_TRANSPORT_LU_COMM_FAILURE; + } + } + /* + * sa_res_key=0 Unregister Reservation Key for registered I_T + * Nexus sa_res_key=1 Change Reservation Key for registered I_T + * Nexus. + */ + if (!(sa_res_key)) { + pr_holder = core_scsi3_check_implict_release( + SE_DEV(cmd), pr_reg); + if (pr_holder < 0) { + kfree(pr_aptpl_buf); + core_scsi3_put_pr_reg(pr_reg); + return PYX_TRANSPORT_RESERVATION_CONFLICT; + } + + spin_lock(&pr_tmpl->registration_lock); + /* + * Release all ALL_TG_PT=1 for the matching SCSI Initiator Port + * and matching pr_res_key. + */ + if (pr_reg->pr_reg_all_tg_pt) { + list_for_each_entry_safe(pr_reg_p, pr_reg_tmp, + &pr_tmpl->registration_list, + pr_reg_list) { + + if (!(pr_reg_p->pr_reg_all_tg_pt)) + continue; + + if (pr_reg_p->pr_res_key != res_key) + continue; + + if (pr_reg == pr_reg_p) + continue; + + if (strcmp(pr_reg->pr_reg_nacl->initiatorname, + pr_reg_p->pr_reg_nacl->initiatorname)) + continue; + + __core_scsi3_free_registration(dev, + pr_reg_p, NULL, 0); + } + } + /* + * Release the calling I_T Nexus registration now.. + */ + __core_scsi3_free_registration(SE_DEV(cmd), pr_reg, + NULL, 1); + /* + * From spc4r17, section 5.7.11.3 Unregistering + * + * If the persistent reservation is a registrants only + * type, the device server shall establish a unit + * attention condition for the initiator port associated + * with every registered I_T nexus except for the I_T + * nexus on which the PERSISTENT RESERVE OUT command was + * received, with the additional sense code set to + * RESERVATIONS RELEASED. + */ + if (pr_holder && + ((type == PR_TYPE_WRITE_EXCLUSIVE_REGONLY) || + (type == PR_TYPE_EXCLUSIVE_ACCESS_REGONLY))) { + list_for_each_entry(pr_reg_p, + &pr_tmpl->registration_list, + pr_reg_list) { + + core_scsi3_ua_allocate( + pr_reg_p->pr_reg_nacl, + pr_reg_p->pr_res_mapped_lun, + 0x2A, + ASCQ_2AH_RESERVATIONS_RELEASED); + } + } + spin_unlock(&pr_tmpl->registration_lock); + + if (!(aptpl)) { + pr_tmpl->pr_aptpl_active = 0; + core_scsi3_update_and_write_aptpl(dev, NULL, 0); + printk("SPC-3 PR: Set APTPL Bit Deactivated" + " for UNREGISTER\n"); + return 0; + } + + ret = core_scsi3_update_and_write_aptpl(dev, + &pr_aptpl_buf[0], + pr_tmpl->pr_aptpl_buf_len); + if (!(ret)) { + pr_tmpl->pr_aptpl_active = 1; + printk("SPC-3 PR: Set APTPL Bit Activated" + " for UNREGISTER\n"); + } + + kfree(pr_aptpl_buf); + return ret; + } else { + /* + * Increment PRgeneration counter for struct se_device" + * upon a successful REGISTER, see spc4r17 section 6.3.2 + * READ_KEYS service action. + */ + pr_reg->pr_res_generation = core_scsi3_pr_generation( + SE_DEV(cmd)); + pr_reg->pr_res_key = sa_res_key; + printk("SPC-3 PR [%s] REGISTER%s: Changed Reservation" + " Key for %s to: 0x%016Lx PRgeneration:" + " 0x%08x\n", CMD_TFO(cmd)->get_fabric_name(), + (ignore_key) ? "_AND_IGNORE_EXISTING_KEY" : "", + pr_reg->pr_reg_nacl->initiatorname, + pr_reg->pr_res_key, pr_reg->pr_res_generation); + + if (!(aptpl)) { + pr_tmpl->pr_aptpl_active = 0; + core_scsi3_update_and_write_aptpl(dev, NULL, 0); + core_scsi3_put_pr_reg(pr_reg); + printk("SPC-3 PR: Set APTPL Bit Deactivated" + " for REGISTER\n"); + return 0; + } + + ret = core_scsi3_update_and_write_aptpl(dev, + &pr_aptpl_buf[0], + pr_tmpl->pr_aptpl_buf_len); + if (!(ret)) { + pr_tmpl->pr_aptpl_active = 1; + printk("SPC-3 PR: Set APTPL Bit Activated" + " for REGISTER\n"); + } + + kfree(pr_aptpl_buf); + core_scsi3_put_pr_reg(pr_reg); + } + } + return 0; +} + +unsigned char *core_scsi3_pr_dump_type(int type) +{ + switch (type) { + case PR_TYPE_WRITE_EXCLUSIVE: + return "Write Exclusive Access"; + case PR_TYPE_EXCLUSIVE_ACCESS: + return "Exclusive Access"; + case PR_TYPE_WRITE_EXCLUSIVE_REGONLY: + return "Write Exclusive Access, Registrants Only"; + case PR_TYPE_EXCLUSIVE_ACCESS_REGONLY: + return "Exclusive Access, Registrants Only"; + case PR_TYPE_WRITE_EXCLUSIVE_ALLREG: + return "Write Exclusive Access, All Registrants"; + case PR_TYPE_EXCLUSIVE_ACCESS_ALLREG: + return "Exclusive Access, All Registrants"; + default: + break; + } + + return "Unknown SPC-3 PR Type"; +} + +static int core_scsi3_pro_reserve( + struct se_cmd *cmd, + struct se_device *dev, + int type, + int scope, + u64 res_key) +{ + struct se_session *se_sess = SE_SESS(cmd); + struct se_dev_entry *se_deve; + struct se_lun *se_lun = SE_LUN(cmd); + struct se_portal_group *se_tpg; + struct t10_pr_registration *pr_reg, *pr_res_holder; + struct t10_reservation_template *pr_tmpl = &SU_DEV(dev)->t10_reservation; + char i_buf[PR_REG_ISID_ID_LEN]; + int ret, prf_isid; + + memset(i_buf, 0, PR_REG_ISID_ID_LEN); + + if (!(se_sess) || !(se_lun)) { + printk(KERN_ERR "SPC-3 PR: se_sess || struct se_lun is NULL!\n"); + return PYX_TRANSPORT_LU_COMM_FAILURE; + } + se_tpg = se_sess->se_tpg; + se_deve = &se_sess->se_node_acl->device_list[cmd->orig_fe_lun]; + /* + * Locate the existing *pr_reg via struct se_node_acl pointers + */ + pr_reg = core_scsi3_locate_pr_reg(SE_DEV(cmd), se_sess->se_node_acl, + se_sess); + if (!(pr_reg)) { + printk(KERN_ERR "SPC-3 PR: Unable to locate" + " PR_REGISTERED *pr_reg for RESERVE\n"); + return PYX_TRANSPORT_LU_COMM_FAILURE; + } + /* + * From spc4r17 Section 5.7.9: Reserving: + * + * An application client creates a persistent reservation by issuing + * a PERSISTENT RESERVE OUT command with RESERVE service action through + * a registered I_T nexus with the following parameters: + * a) RESERVATION KEY set to the value of the reservation key that is + * registered with the logical unit for the I_T nexus; and + */ + if (res_key != pr_reg->pr_res_key) { + printk(KERN_ERR "SPC-3 PR RESERVE: Received res_key: 0x%016Lx" + " does not match existing SA REGISTER res_key:" + " 0x%016Lx\n", res_key, pr_reg->pr_res_key); + core_scsi3_put_pr_reg(pr_reg); + return PYX_TRANSPORT_RESERVATION_CONFLICT; + } + /* + * From spc4r17 Section 5.7.9: Reserving: + * + * From above: + * b) TYPE field and SCOPE field set to the persistent reservation + * being created. + * + * Only one persistent reservation is allowed at a time per logical unit + * and that persistent reservation has a scope of LU_SCOPE. + */ + if (scope != PR_SCOPE_LU_SCOPE) { + printk(KERN_ERR "SPC-3 PR: Illegal SCOPE: 0x%02x\n", scope); + core_scsi3_put_pr_reg(pr_reg); + return PYX_TRANSPORT_INVALID_PARAMETER_LIST; + } + /* + * See if we have an existing PR reservation holder pointer at + * struct se_device->dev_pr_res_holder in the form struct t10_pr_registration + * *pr_res_holder. + */ + spin_lock(&dev->dev_reservation_lock); + pr_res_holder = dev->dev_pr_res_holder; + if ((pr_res_holder)) { + /* + * From spc4r17 Section 5.7.9: Reserving: + * + * If the device server receives a PERSISTENT RESERVE OUT + * command from an I_T nexus other than a persistent reservation + * holder (see 5.7.10) that attempts to create a persistent + * reservation when a persistent reservation already exists for + * the logical unit, then the command shall be completed with + * RESERVATION CONFLICT status. + */ + if (pr_res_holder != pr_reg) { + struct se_node_acl *pr_res_nacl = pr_res_holder->pr_reg_nacl; + printk(KERN_ERR "SPC-3 PR: Attempted RESERVE from" + " [%s]: %s while reservation already held by" + " [%s]: %s, returning RESERVATION_CONFLICT\n", + CMD_TFO(cmd)->get_fabric_name(), + se_sess->se_node_acl->initiatorname, + TPG_TFO(pr_res_nacl->se_tpg)->get_fabric_name(), + pr_res_holder->pr_reg_nacl->initiatorname); + + spin_unlock(&dev->dev_reservation_lock); + core_scsi3_put_pr_reg(pr_reg); + return PYX_TRANSPORT_RESERVATION_CONFLICT; + } + /* + * From spc4r17 Section 5.7.9: Reserving: + * + * If a persistent reservation holder attempts to modify the + * type or scope of an existing persistent reservation, the + * command shall be completed with RESERVATION CONFLICT status. + */ + if ((pr_res_holder->pr_res_type != type) || + (pr_res_holder->pr_res_scope != scope)) { + struct se_node_acl *pr_res_nacl = pr_res_holder->pr_reg_nacl; + printk(KERN_ERR "SPC-3 PR: Attempted RESERVE from" + " [%s]: %s trying to change TYPE and/or SCOPE," + " while reservation already held by [%s]: %s," + " returning RESERVATION_CONFLICT\n", + CMD_TFO(cmd)->get_fabric_name(), + se_sess->se_node_acl->initiatorname, + TPG_TFO(pr_res_nacl->se_tpg)->get_fabric_name(), + pr_res_holder->pr_reg_nacl->initiatorname); + + spin_unlock(&dev->dev_reservation_lock); + core_scsi3_put_pr_reg(pr_reg); + return PYX_TRANSPORT_RESERVATION_CONFLICT; + } + /* + * From spc4r17 Section 5.7.9: Reserving: + * + * If the device server receives a PERSISTENT RESERVE OUT + * command with RESERVE service action where the TYPE field and + * the SCOPE field contain the same values as the existing type + * and scope from a persistent reservation holder, it shall not + * make any change to the existing persistent reservation and + * shall completethe command with GOOD status. + */ + spin_unlock(&dev->dev_reservation_lock); + core_scsi3_put_pr_reg(pr_reg); + return PYX_TRANSPORT_SENT_TO_TRANSPORT; + } + /* + * Otherwise, our *pr_reg becomes the PR reservation holder for said + * TYPE/SCOPE. Also set the received scope and type in *pr_reg. + */ + pr_reg->pr_res_scope = scope; + pr_reg->pr_res_type = type; + pr_reg->pr_res_holder = 1; + dev->dev_pr_res_holder = pr_reg; + prf_isid = core_pr_dump_initiator_port(pr_reg, &i_buf[0], + PR_REG_ISID_ID_LEN); + + printk(KERN_INFO "SPC-3 PR [%s] Service Action: RESERVE created new" + " reservation holder TYPE: %s ALL_TG_PT: %d\n", + CMD_TFO(cmd)->get_fabric_name(), core_scsi3_pr_dump_type(type), + (pr_reg->pr_reg_all_tg_pt) ? 1 : 0); + printk(KERN_INFO "SPC-3 PR [%s] RESERVE Node: %s%s\n", + CMD_TFO(cmd)->get_fabric_name(), + se_sess->se_node_acl->initiatorname, + (prf_isid) ? &i_buf[0] : ""); + spin_unlock(&dev->dev_reservation_lock); + + if (pr_tmpl->pr_aptpl_active) { + ret = core_scsi3_update_and_write_aptpl(SE_DEV(cmd), + &pr_reg->pr_aptpl_buf[0], + pr_tmpl->pr_aptpl_buf_len); + if (!(ret)) + printk(KERN_INFO "SPC-3 PR: Updated APTPL metadata" + " for RESERVE\n"); + } + + core_scsi3_put_pr_reg(pr_reg); + return 0; +} + +static int core_scsi3_emulate_pro_reserve( + struct se_cmd *cmd, + int type, + int scope, + u64 res_key) +{ + struct se_device *dev = cmd->se_dev; + int ret = 0; + + switch (type) { + case PR_TYPE_WRITE_EXCLUSIVE: + case PR_TYPE_EXCLUSIVE_ACCESS: + case PR_TYPE_WRITE_EXCLUSIVE_REGONLY: + case PR_TYPE_EXCLUSIVE_ACCESS_REGONLY: + case PR_TYPE_WRITE_EXCLUSIVE_ALLREG: + case PR_TYPE_EXCLUSIVE_ACCESS_ALLREG: + ret = core_scsi3_pro_reserve(cmd, dev, type, scope, res_key); + break; + default: + printk(KERN_ERR "SPC-3 PR: Unknown Service Action RESERVE Type:" + " 0x%02x\n", type); + return PYX_TRANSPORT_INVALID_CDB_FIELD; + } + + return ret; +} + +/* + * Called with struct se_device->dev_reservation_lock held. + */ +static void __core_scsi3_complete_pro_release( + struct se_device *dev, + struct se_node_acl *se_nacl, + struct t10_pr_registration *pr_reg, + int explict) +{ + struct target_core_fabric_ops *tfo = se_nacl->se_tpg->se_tpg_tfo; + char i_buf[PR_REG_ISID_ID_LEN]; + int prf_isid; + + memset(i_buf, 0, PR_REG_ISID_ID_LEN); + prf_isid = core_pr_dump_initiator_port(pr_reg, &i_buf[0], + PR_REG_ISID_ID_LEN); + /* + * Go ahead and release the current PR reservation holder. + */ + dev->dev_pr_res_holder = NULL; + + printk(KERN_INFO "SPC-3 PR [%s] Service Action: %s RELEASE cleared" + " reservation holder TYPE: %s ALL_TG_PT: %d\n", + tfo->get_fabric_name(), (explict) ? "explict" : "implict", + core_scsi3_pr_dump_type(pr_reg->pr_res_type), + (pr_reg->pr_reg_all_tg_pt) ? 1 : 0); + printk(KERN_INFO "SPC-3 PR [%s] RELEASE Node: %s%s\n", + tfo->get_fabric_name(), se_nacl->initiatorname, + (prf_isid) ? &i_buf[0] : ""); + /* + * Clear TYPE and SCOPE for the next PROUT Service Action: RESERVE + */ + pr_reg->pr_res_holder = pr_reg->pr_res_type = pr_reg->pr_res_scope = 0; +} + +static int core_scsi3_emulate_pro_release( + struct se_cmd *cmd, + int type, + int scope, + u64 res_key) +{ + struct se_device *dev = cmd->se_dev; + struct se_session *se_sess = SE_SESS(cmd); + struct se_lun *se_lun = SE_LUN(cmd); + struct t10_pr_registration *pr_reg, *pr_reg_p, *pr_res_holder; + struct t10_reservation_template *pr_tmpl = &SU_DEV(dev)->t10_reservation; + int ret, all_reg = 0; + + if (!(se_sess) || !(se_lun)) { + printk(KERN_ERR "SPC-3 PR: se_sess || struct se_lun is NULL!\n"); + return PYX_TRANSPORT_LU_COMM_FAILURE; + } + /* + * Locate the existing *pr_reg via struct se_node_acl pointers + */ + pr_reg = core_scsi3_locate_pr_reg(dev, se_sess->se_node_acl, se_sess); + if (!(pr_reg)) { + printk(KERN_ERR "SPC-3 PR: Unable to locate" + " PR_REGISTERED *pr_reg for RELEASE\n"); + return PYX_TRANSPORT_LU_COMM_FAILURE; + } + /* + * From spc4r17 Section 5.7.11.2 Releasing: + * + * If there is no persistent reservation or in response to a persistent + * reservation release request from a registered I_T nexus that is not a + * persistent reservation holder (see 5.7.10), the device server shall + * do the following: + * + * a) Not release the persistent reservation, if any; + * b) Not remove any registrations; and + * c) Complete the command with GOOD status. + */ + spin_lock(&dev->dev_reservation_lock); + pr_res_holder = dev->dev_pr_res_holder; + if (!(pr_res_holder)) { + /* + * No persistent reservation, return GOOD status. + */ + spin_unlock(&dev->dev_reservation_lock); + core_scsi3_put_pr_reg(pr_reg); + return PYX_TRANSPORT_SENT_TO_TRANSPORT; + } + if ((pr_res_holder->pr_res_type == PR_TYPE_WRITE_EXCLUSIVE_ALLREG) || + (pr_res_holder->pr_res_type == PR_TYPE_EXCLUSIVE_ACCESS_ALLREG)) + all_reg = 1; + + if ((all_reg == 0) && (pr_res_holder != pr_reg)) { + /* + * Non 'All Registrants' PR Type cases.. + * Release request from a registered I_T nexus that is not a + * persistent reservation holder. return GOOD status. + */ + spin_unlock(&dev->dev_reservation_lock); + core_scsi3_put_pr_reg(pr_reg); + return PYX_TRANSPORT_SENT_TO_TRANSPORT; + } + /* + * From spc4r17 Section 5.7.11.2 Releasing: + * + * Only the persistent reservation holder (see 5.7.10) is allowed to + * release a persistent reservation. + * + * An application client releases the persistent reservation by issuing + * a PERSISTENT RESERVE OUT command with RELEASE service action through + * an I_T nexus that is a persistent reservation holder with the + * following parameters: + * + * a) RESERVATION KEY field set to the value of the reservation key + * that is registered with the logical unit for the I_T nexus; + */ + if (res_key != pr_reg->pr_res_key) { + printk(KERN_ERR "SPC-3 PR RELEASE: Received res_key: 0x%016Lx" + " does not match existing SA REGISTER res_key:" + " 0x%016Lx\n", res_key, pr_reg->pr_res_key); + spin_unlock(&dev->dev_reservation_lock); + core_scsi3_put_pr_reg(pr_reg); + return PYX_TRANSPORT_RESERVATION_CONFLICT; + } + /* + * From spc4r17 Section 5.7.11.2 Releasing and above: + * + * b) TYPE field and SCOPE field set to match the persistent + * reservation being released. + */ + if ((pr_res_holder->pr_res_type != type) || + (pr_res_holder->pr_res_scope != scope)) { + struct se_node_acl *pr_res_nacl = pr_res_holder->pr_reg_nacl; + printk(KERN_ERR "SPC-3 PR RELEASE: Attempted to release" + " reservation from [%s]: %s with different TYPE " + "and/or SCOPE while reservation already held by" + " [%s]: %s, returning RESERVATION_CONFLICT\n", + CMD_TFO(cmd)->get_fabric_name(), + se_sess->se_node_acl->initiatorname, + TPG_TFO(pr_res_nacl->se_tpg)->get_fabric_name(), + pr_res_holder->pr_reg_nacl->initiatorname); + + spin_unlock(&dev->dev_reservation_lock); + core_scsi3_put_pr_reg(pr_reg); + return PYX_TRANSPORT_RESERVATION_CONFLICT; + } + /* + * In response to a persistent reservation release request from the + * persistent reservation holder the device server shall perform a + * release by doing the following as an uninterrupted series of actions: + * a) Release the persistent reservation; + * b) Not remove any registration(s); + * c) If the released persistent reservation is a registrants only type + * or all registrants type persistent reservation, + * the device server shall establish a unit attention condition for + * the initiator port associated with every regis- + * tered I_T nexus other than I_T nexus on which the PERSISTENT + * RESERVE OUT command with RELEASE service action was received, + * with the additional sense code set to RESERVATIONS RELEASED; and + * d) If the persistent reservation is of any other type, the device + * server shall not establish a unit attention condition. + */ + __core_scsi3_complete_pro_release(dev, se_sess->se_node_acl, + pr_reg, 1); + + spin_unlock(&dev->dev_reservation_lock); + + if ((type != PR_TYPE_WRITE_EXCLUSIVE_REGONLY) && + (type != PR_TYPE_EXCLUSIVE_ACCESS_REGONLY) && + (type != PR_TYPE_WRITE_EXCLUSIVE_ALLREG) && + (type != PR_TYPE_EXCLUSIVE_ACCESS_ALLREG)) { + /* + * If no UNIT ATTENTION conditions will be established for + * PR_TYPE_WRITE_EXCLUSIVE or PR_TYPE_EXCLUSIVE_ACCESS + * go ahead and check for APTPL=1 update+write below + */ + goto write_aptpl; + } + + spin_lock(&pr_tmpl->registration_lock); + list_for_each_entry(pr_reg_p, &pr_tmpl->registration_list, + pr_reg_list) { + /* + * Do not establish a UNIT ATTENTION condition + * for the calling I_T Nexus + */ + if (pr_reg_p == pr_reg) + continue; + + core_scsi3_ua_allocate(pr_reg_p->pr_reg_nacl, + pr_reg_p->pr_res_mapped_lun, + 0x2A, ASCQ_2AH_RESERVATIONS_RELEASED); + } + spin_unlock(&pr_tmpl->registration_lock); + +write_aptpl: + if (pr_tmpl->pr_aptpl_active) { + ret = core_scsi3_update_and_write_aptpl(SE_DEV(cmd), + &pr_reg->pr_aptpl_buf[0], + pr_tmpl->pr_aptpl_buf_len); + if (!(ret)) + printk("SPC-3 PR: Updated APTPL metadata for RELEASE\n"); + } + + core_scsi3_put_pr_reg(pr_reg); + return 0; +} + +static int core_scsi3_emulate_pro_clear( + struct se_cmd *cmd, + u64 res_key) +{ + struct se_device *dev = cmd->se_dev; + struct se_node_acl *pr_reg_nacl; + struct se_session *se_sess = SE_SESS(cmd); + struct t10_reservation_template *pr_tmpl = &SU_DEV(dev)->t10_reservation; + struct t10_pr_registration *pr_reg, *pr_reg_tmp, *pr_reg_n, *pr_res_holder; + u32 pr_res_mapped_lun = 0; + int calling_it_nexus = 0; + /* + * Locate the existing *pr_reg via struct se_node_acl pointers + */ + pr_reg_n = core_scsi3_locate_pr_reg(SE_DEV(cmd), + se_sess->se_node_acl, se_sess); + if (!(pr_reg_n)) { + printk(KERN_ERR "SPC-3 PR: Unable to locate" + " PR_REGISTERED *pr_reg for CLEAR\n"); + return PYX_TRANSPORT_LU_COMM_FAILURE; + } + /* + * From spc4r17 section 5.7.11.6, Clearing: + * + * Any application client may release the persistent reservation and + * remove all registrations from a device server by issuing a + * PERSISTENT RESERVE OUT command with CLEAR service action through a + * registered I_T nexus with the following parameter: + * + * a) RESERVATION KEY field set to the value of the reservation key + * that is registered with the logical unit for the I_T nexus. + */ + if (res_key != pr_reg_n->pr_res_key) { + printk(KERN_ERR "SPC-3 PR REGISTER: Received" + " res_key: 0x%016Lx does not match" + " existing SA REGISTER res_key:" + " 0x%016Lx\n", res_key, pr_reg_n->pr_res_key); + core_scsi3_put_pr_reg(pr_reg_n); + return PYX_TRANSPORT_RESERVATION_CONFLICT; + } + /* + * a) Release the persistent reservation, if any; + */ + spin_lock(&dev->dev_reservation_lock); + pr_res_holder = dev->dev_pr_res_holder; + if (pr_res_holder) { + struct se_node_acl *pr_res_nacl = pr_res_holder->pr_reg_nacl; + __core_scsi3_complete_pro_release(dev, pr_res_nacl, + pr_res_holder, 0); + } + spin_unlock(&dev->dev_reservation_lock); + /* + * b) Remove all registration(s) (see spc4r17 5.7.7); + */ + spin_lock(&pr_tmpl->registration_lock); + list_for_each_entry_safe(pr_reg, pr_reg_tmp, + &pr_tmpl->registration_list, pr_reg_list) { + + calling_it_nexus = (pr_reg_n == pr_reg) ? 1 : 0; + pr_reg_nacl = pr_reg->pr_reg_nacl; + pr_res_mapped_lun = pr_reg->pr_res_mapped_lun; + __core_scsi3_free_registration(dev, pr_reg, NULL, + calling_it_nexus); + /* + * e) Establish a unit attention condition for the initiator + * port associated with every registered I_T nexus other + * than the I_T nexus on which the PERSISTENT RESERVE OUT + * command with CLEAR service action was received, with the + * additional sense code set to RESERVATIONS PREEMPTED. + */ + if (!(calling_it_nexus)) + core_scsi3_ua_allocate(pr_reg_nacl, pr_res_mapped_lun, + 0x2A, ASCQ_2AH_RESERVATIONS_PREEMPTED); + } + spin_unlock(&pr_tmpl->registration_lock); + + printk(KERN_INFO "SPC-3 PR [%s] Service Action: CLEAR complete\n", + CMD_TFO(cmd)->get_fabric_name()); + + if (pr_tmpl->pr_aptpl_active) { + core_scsi3_update_and_write_aptpl(SE_DEV(cmd), NULL, 0); + printk(KERN_INFO "SPC-3 PR: Updated APTPL metadata" + " for CLEAR\n"); + } + + core_scsi3_pr_generation(dev); + return 0; +} + +/* + * Called with struct se_device->dev_reservation_lock held. + */ +static void __core_scsi3_complete_pro_preempt( + struct se_device *dev, + struct t10_pr_registration *pr_reg, + struct list_head *preempt_and_abort_list, + int type, + int scope, + int abort) +{ + struct se_node_acl *nacl = pr_reg->pr_reg_nacl; + struct target_core_fabric_ops *tfo = nacl->se_tpg->se_tpg_tfo; + char i_buf[PR_REG_ISID_ID_LEN]; + int prf_isid; + + memset(i_buf, 0, PR_REG_ISID_ID_LEN); + prf_isid = core_pr_dump_initiator_port(pr_reg, &i_buf[0], + PR_REG_ISID_ID_LEN); + /* + * Do an implict RELEASE of the existing reservation. + */ + if (dev->dev_pr_res_holder) + __core_scsi3_complete_pro_release(dev, nacl, + dev->dev_pr_res_holder, 0); + + dev->dev_pr_res_holder = pr_reg; + pr_reg->pr_res_holder = 1; + pr_reg->pr_res_type = type; + pr_reg->pr_res_scope = scope; + + printk(KERN_INFO "SPC-3 PR [%s] Service Action: PREEMPT%s created new" + " reservation holder TYPE: %s ALL_TG_PT: %d\n", + tfo->get_fabric_name(), (abort) ? "_AND_ABORT" : "", + core_scsi3_pr_dump_type(type), + (pr_reg->pr_reg_all_tg_pt) ? 1 : 0); + printk(KERN_INFO "SPC-3 PR [%s] PREEMPT%s from Node: %s%s\n", + tfo->get_fabric_name(), (abort) ? "_AND_ABORT" : "", + nacl->initiatorname, (prf_isid) ? &i_buf[0] : ""); + /* + * For PREEMPT_AND_ABORT, add the preempting reservation's + * struct t10_pr_registration to the list that will be compared + * against received CDBs.. + */ + if (preempt_and_abort_list) + list_add_tail(&pr_reg->pr_reg_abort_list, + preempt_and_abort_list); +} + +static void core_scsi3_release_preempt_and_abort( + struct list_head *preempt_and_abort_list, + struct t10_pr_registration *pr_reg_holder) +{ + struct t10_pr_registration *pr_reg, *pr_reg_tmp; + + list_for_each_entry_safe(pr_reg, pr_reg_tmp, preempt_and_abort_list, + pr_reg_abort_list) { + + list_del(&pr_reg->pr_reg_abort_list); + if (pr_reg_holder == pr_reg) + continue; + if (pr_reg->pr_res_holder) { + printk(KERN_WARNING "pr_reg->pr_res_holder still set\n"); + continue; + } + + pr_reg->pr_reg_deve = NULL; + pr_reg->pr_reg_nacl = NULL; + kfree(pr_reg->pr_aptpl_buf); + kmem_cache_free(t10_pr_reg_cache, pr_reg); + } +} + +int core_scsi3_check_cdb_abort_and_preempt( + struct list_head *preempt_and_abort_list, + struct se_cmd *cmd) +{ + struct t10_pr_registration *pr_reg, *pr_reg_tmp; + + list_for_each_entry_safe(pr_reg, pr_reg_tmp, preempt_and_abort_list, + pr_reg_abort_list) { + if (pr_reg->pr_res_key == cmd->pr_res_key) + return 0; + } + + return 1; +} + +static int core_scsi3_pro_preempt( + struct se_cmd *cmd, + int type, + int scope, + u64 res_key, + u64 sa_res_key, + int abort) +{ + struct se_device *dev = SE_DEV(cmd); + struct se_dev_entry *se_deve; + struct se_node_acl *pr_reg_nacl; + struct se_session *se_sess = SE_SESS(cmd); + struct list_head preempt_and_abort_list; + struct t10_pr_registration *pr_reg, *pr_reg_tmp, *pr_reg_n, *pr_res_holder; + struct t10_reservation_template *pr_tmpl = &SU_DEV(dev)->t10_reservation; + u32 pr_res_mapped_lun = 0; + int all_reg = 0, calling_it_nexus = 0, released_regs = 0; + int prh_type = 0, prh_scope = 0, ret; + + if (!(se_sess)) + return PYX_TRANSPORT_LU_COMM_FAILURE; + + se_deve = &se_sess->se_node_acl->device_list[cmd->orig_fe_lun]; + pr_reg_n = core_scsi3_locate_pr_reg(SE_DEV(cmd), se_sess->se_node_acl, + se_sess); + if (!(pr_reg_n)) { + printk(KERN_ERR "SPC-3 PR: Unable to locate" + " PR_REGISTERED *pr_reg for PREEMPT%s\n", + (abort) ? "_AND_ABORT" : ""); + return PYX_TRANSPORT_RESERVATION_CONFLICT; + } + if (pr_reg_n->pr_res_key != res_key) { + core_scsi3_put_pr_reg(pr_reg_n); + return PYX_TRANSPORT_RESERVATION_CONFLICT; + } + if (scope != PR_SCOPE_LU_SCOPE) { + printk(KERN_ERR "SPC-3 PR: Illegal SCOPE: 0x%02x\n", scope); + core_scsi3_put_pr_reg(pr_reg_n); + return PYX_TRANSPORT_INVALID_PARAMETER_LIST; + } + INIT_LIST_HEAD(&preempt_and_abort_list); + + spin_lock(&dev->dev_reservation_lock); + pr_res_holder = dev->dev_pr_res_holder; + if (pr_res_holder && + ((pr_res_holder->pr_res_type == PR_TYPE_WRITE_EXCLUSIVE_ALLREG) || + (pr_res_holder->pr_res_type == PR_TYPE_EXCLUSIVE_ACCESS_ALLREG))) + all_reg = 1; + + if (!(all_reg) && !(sa_res_key)) { + spin_unlock(&dev->dev_reservation_lock); + core_scsi3_put_pr_reg(pr_reg_n); + return PYX_TRANSPORT_INVALID_PARAMETER_LIST; + } + /* + * From spc4r17, section 5.7.11.4.4 Removing Registrations: + * + * If the SERVICE ACTION RESERVATION KEY field does not identify a + * persistent reservation holder or there is no persistent reservation + * holder (i.e., there is no persistent reservation), then the device + * server shall perform a preempt by doing the following in an + * uninterrupted series of actions. (See below..) + */ + if (!(pr_res_holder) || (pr_res_holder->pr_res_key != sa_res_key)) { + /* + * No existing or SA Reservation Key matching reservations.. + * + * PROUT SA PREEMPT with All Registrant type reservations are + * allowed to be processed without a matching SA Reservation Key + */ + spin_lock(&pr_tmpl->registration_lock); + list_for_each_entry_safe(pr_reg, pr_reg_tmp, + &pr_tmpl->registration_list, pr_reg_list) { + /* + * Removing of registrations in non all registrants + * type reservations without a matching SA reservation + * key. + * + * a) Remove the registrations for all I_T nexuses + * specified by the SERVICE ACTION RESERVATION KEY + * field; + * b) Ignore the contents of the SCOPE and TYPE fields; + * c) Process tasks as defined in 5.7.1; and + * d) Establish a unit attention condition for the + * initiator port associated with every I_T nexus + * that lost its registration other than the I_T + * nexus on which the PERSISTENT RESERVE OUT command + * was received, with the additional sense code set + * to REGISTRATIONS PREEMPTED. + */ + if (!(all_reg)) { + if (pr_reg->pr_res_key != sa_res_key) + continue; + + calling_it_nexus = (pr_reg_n == pr_reg) ? 1 : 0; + pr_reg_nacl = pr_reg->pr_reg_nacl; + pr_res_mapped_lun = pr_reg->pr_res_mapped_lun; + __core_scsi3_free_registration(dev, pr_reg, + (abort) ? &preempt_and_abort_list : + NULL, calling_it_nexus); + released_regs++; + } else { + /* + * Case for any existing all registrants type + * reservation, follow logic in spc4r17 section + * 5.7.11.4 Preempting, Table 52 and Figure 7. + * + * For a ZERO SA Reservation key, release + * all other registrations and do an implict + * release of active persistent reservation. + * + * For a non-ZERO SA Reservation key, only + * release the matching reservation key from + * registrations. + */ + if ((sa_res_key) && + (pr_reg->pr_res_key != sa_res_key)) + continue; + + calling_it_nexus = (pr_reg_n == pr_reg) ? 1 : 0; + if (calling_it_nexus) + continue; + + pr_reg_nacl = pr_reg->pr_reg_nacl; + pr_res_mapped_lun = pr_reg->pr_res_mapped_lun; + __core_scsi3_free_registration(dev, pr_reg, + (abort) ? &preempt_and_abort_list : + NULL, 0); + released_regs++; + } + if (!(calling_it_nexus)) + core_scsi3_ua_allocate(pr_reg_nacl, + pr_res_mapped_lun, 0x2A, + ASCQ_2AH_RESERVATIONS_PREEMPTED); + } + spin_unlock(&pr_tmpl->registration_lock); + /* + * If a PERSISTENT RESERVE OUT with a PREEMPT service action or + * a PREEMPT AND ABORT service action sets the SERVICE ACTION + * RESERVATION KEY field to a value that does not match any + * registered reservation key, then the device server shall + * complete the command with RESERVATION CONFLICT status. + */ + if (!(released_regs)) { + spin_unlock(&dev->dev_reservation_lock); + core_scsi3_put_pr_reg(pr_reg_n); + return PYX_TRANSPORT_RESERVATION_CONFLICT; + } + /* + * For an existing all registrants type reservation + * with a zero SA rservation key, preempt the existing + * reservation with the new PR type and scope. + */ + if (pr_res_holder && all_reg && !(sa_res_key)) { + __core_scsi3_complete_pro_preempt(dev, pr_reg_n, + (abort) ? &preempt_and_abort_list : NULL, + type, scope, abort); + + if (abort) + core_scsi3_release_preempt_and_abort( + &preempt_and_abort_list, pr_reg_n); + } + spin_unlock(&dev->dev_reservation_lock); + + if (pr_tmpl->pr_aptpl_active) { + ret = core_scsi3_update_and_write_aptpl(SE_DEV(cmd), + &pr_reg_n->pr_aptpl_buf[0], + pr_tmpl->pr_aptpl_buf_len); + if (!(ret)) + printk(KERN_INFO "SPC-3 PR: Updated APTPL" + " metadata for PREEMPT%s\n", (abort) ? + "_AND_ABORT" : ""); + } + + core_scsi3_put_pr_reg(pr_reg_n); + core_scsi3_pr_generation(SE_DEV(cmd)); + return 0; + } + /* + * The PREEMPTing SA reservation key matches that of the + * existing persistent reservation, first, we check if + * we are preempting our own reservation. + * From spc4r17, section 5.7.11.4.3 Preempting + * persistent reservations and registration handling + * + * If an all registrants persistent reservation is not + * present, it is not an error for the persistent + * reservation holder to preempt itself (i.e., a + * PERSISTENT RESERVE OUT with a PREEMPT service action + * or a PREEMPT AND ABORT service action with the + * SERVICE ACTION RESERVATION KEY value equal to the + * persistent reservation holder's reservation key that + * is received from the persistent reservation holder). + * In that case, the device server shall establish the + * new persistent reservation and maintain the + * registration. + */ + prh_type = pr_res_holder->pr_res_type; + prh_scope = pr_res_holder->pr_res_scope; + /* + * If the SERVICE ACTION RESERVATION KEY field identifies a + * persistent reservation holder (see 5.7.10), the device + * server shall perform a preempt by doing the following as + * an uninterrupted series of actions: + * + * a) Release the persistent reservation for the holder + * identified by the SERVICE ACTION RESERVATION KEY field; + */ + if (pr_reg_n != pr_res_holder) + __core_scsi3_complete_pro_release(dev, + pr_res_holder->pr_reg_nacl, + dev->dev_pr_res_holder, 0); + /* + * b) Remove the registrations for all I_T nexuses identified + * by the SERVICE ACTION RESERVATION KEY field, except the + * I_T nexus that is being used for the PERSISTENT RESERVE + * OUT command. If an all registrants persistent reservation + * is present and the SERVICE ACTION RESERVATION KEY field + * is set to zero, then all registrations shall be removed + * except for that of the I_T nexus that is being used for + * the PERSISTENT RESERVE OUT command; + */ + spin_lock(&pr_tmpl->registration_lock); + list_for_each_entry_safe(pr_reg, pr_reg_tmp, + &pr_tmpl->registration_list, pr_reg_list) { + + calling_it_nexus = (pr_reg_n == pr_reg) ? 1 : 0; + if (calling_it_nexus) + continue; + + if (pr_reg->pr_res_key != sa_res_key) + continue; + + pr_reg_nacl = pr_reg->pr_reg_nacl; + pr_res_mapped_lun = pr_reg->pr_res_mapped_lun; + __core_scsi3_free_registration(dev, pr_reg, + (abort) ? &preempt_and_abort_list : NULL, + calling_it_nexus); + /* + * e) Establish a unit attention condition for the initiator + * port associated with every I_T nexus that lost its + * persistent reservation and/or registration, with the + * additional sense code set to REGISTRATIONS PREEMPTED; + */ + core_scsi3_ua_allocate(pr_reg_nacl, pr_res_mapped_lun, 0x2A, + ASCQ_2AH_RESERVATIONS_PREEMPTED); + } + spin_unlock(&pr_tmpl->registration_lock); + /* + * c) Establish a persistent reservation for the preempting + * I_T nexus using the contents of the SCOPE and TYPE fields; + */ + __core_scsi3_complete_pro_preempt(dev, pr_reg_n, + (abort) ? &preempt_and_abort_list : NULL, + type, scope, abort); + /* + * d) Process tasks as defined in 5.7.1; + * e) See above.. + * f) If the type or scope has changed, then for every I_T nexus + * whose reservation key was not removed, except for the I_T + * nexus on which the PERSISTENT RESERVE OUT command was + * received, the device server shall establish a unit + * attention condition for the initiator port associated with + * that I_T nexus, with the additional sense code set to + * RESERVATIONS RELEASED. If the type or scope have not + * changed, then no unit attention condition(s) shall be + * established for this reason. + */ + if ((prh_type != type) || (prh_scope != scope)) { + spin_lock(&pr_tmpl->registration_lock); + list_for_each_entry_safe(pr_reg, pr_reg_tmp, + &pr_tmpl->registration_list, pr_reg_list) { + + calling_it_nexus = (pr_reg_n == pr_reg) ? 1 : 0; + if (calling_it_nexus) + continue; + + core_scsi3_ua_allocate(pr_reg->pr_reg_nacl, + pr_reg->pr_res_mapped_lun, 0x2A, + ASCQ_2AH_RESERVATIONS_RELEASED); + } + spin_unlock(&pr_tmpl->registration_lock); + } + spin_unlock(&dev->dev_reservation_lock); + /* + * Call LUN_RESET logic upon list of struct t10_pr_registration, + * All received CDBs for the matching existing reservation and + * registrations undergo ABORT_TASK logic. + * + * From there, core_scsi3_release_preempt_and_abort() will + * release every registration in the list (which have already + * been removed from the primary pr_reg list), except the + * new persistent reservation holder, the calling Initiator Port. + */ + if (abort) { + core_tmr_lun_reset(dev, NULL, &preempt_and_abort_list, cmd); + core_scsi3_release_preempt_and_abort(&preempt_and_abort_list, + pr_reg_n); + } + + if (pr_tmpl->pr_aptpl_active) { + ret = core_scsi3_update_and_write_aptpl(SE_DEV(cmd), + &pr_reg_n->pr_aptpl_buf[0], + pr_tmpl->pr_aptpl_buf_len); + if (!(ret)) + printk("SPC-3 PR: Updated APTPL metadata for PREEMPT" + "%s\n", (abort) ? "_AND_ABORT" : ""); + } + + core_scsi3_put_pr_reg(pr_reg_n); + core_scsi3_pr_generation(SE_DEV(cmd)); + return 0; +} + +static int core_scsi3_emulate_pro_preempt( + struct se_cmd *cmd, + int type, + int scope, + u64 res_key, + u64 sa_res_key, + int abort) +{ + int ret = 0; + + switch (type) { + case PR_TYPE_WRITE_EXCLUSIVE: + case PR_TYPE_EXCLUSIVE_ACCESS: + case PR_TYPE_WRITE_EXCLUSIVE_REGONLY: + case PR_TYPE_EXCLUSIVE_ACCESS_REGONLY: + case PR_TYPE_WRITE_EXCLUSIVE_ALLREG: + case PR_TYPE_EXCLUSIVE_ACCESS_ALLREG: + ret = core_scsi3_pro_preempt(cmd, type, scope, + res_key, sa_res_key, abort); + break; + default: + printk(KERN_ERR "SPC-3 PR: Unknown Service Action PREEMPT%s" + " Type: 0x%02x\n", (abort) ? "_AND_ABORT" : "", type); + return PYX_TRANSPORT_INVALID_CDB_FIELD; + } + + return ret; +} + + +static int core_scsi3_emulate_pro_register_and_move( + struct se_cmd *cmd, + u64 res_key, + u64 sa_res_key, + int aptpl, + int unreg) +{ + struct se_session *se_sess = SE_SESS(cmd); + struct se_device *dev = SE_DEV(cmd); + struct se_dev_entry *se_deve, *dest_se_deve = NULL; + struct se_lun *se_lun = SE_LUN(cmd); + struct se_node_acl *pr_res_nacl, *pr_reg_nacl, *dest_node_acl = NULL; + struct se_port *se_port; + struct se_portal_group *se_tpg, *dest_se_tpg = NULL; + struct target_core_fabric_ops *dest_tf_ops = NULL, *tf_ops; + struct t10_pr_registration *pr_reg, *pr_res_holder, *dest_pr_reg; + struct t10_reservation_template *pr_tmpl = &SU_DEV(dev)->t10_reservation; + unsigned char *buf = (unsigned char *)T_TASK(cmd)->t_task_buf; + unsigned char *initiator_str; + char *iport_ptr = NULL, dest_iport[64], i_buf[PR_REG_ISID_ID_LEN]; + u32 tid_len, tmp_tid_len; + int new_reg = 0, type, scope, ret, matching_iname, prf_isid; + unsigned short rtpi; + unsigned char proto_ident; + + if (!(se_sess) || !(se_lun)) { + printk(KERN_ERR "SPC-3 PR: se_sess || struct se_lun is NULL!\n"); + return PYX_TRANSPORT_LU_COMM_FAILURE; + } + memset(dest_iport, 0, 64); + memset(i_buf, 0, PR_REG_ISID_ID_LEN); + se_tpg = se_sess->se_tpg; + tf_ops = TPG_TFO(se_tpg); + se_deve = &se_sess->se_node_acl->device_list[cmd->orig_fe_lun]; + /* + * Follow logic from spc4r17 Section 5.7.8, Table 50 -- + * Register behaviors for a REGISTER AND MOVE service action + * + * Locate the existing *pr_reg via struct se_node_acl pointers + */ + pr_reg = core_scsi3_locate_pr_reg(SE_DEV(cmd), se_sess->se_node_acl, + se_sess); + if (!(pr_reg)) { + printk(KERN_ERR "SPC-3 PR: Unable to locate PR_REGISTERED" + " *pr_reg for REGISTER_AND_MOVE\n"); + return PYX_TRANSPORT_LU_COMM_FAILURE; + } + /* + * The provided reservation key much match the existing reservation key + * provided during this initiator's I_T nexus registration. + */ + if (res_key != pr_reg->pr_res_key) { + printk(KERN_WARNING "SPC-3 PR REGISTER_AND_MOVE: Received" + " res_key: 0x%016Lx does not match existing SA REGISTER" + " res_key: 0x%016Lx\n", res_key, pr_reg->pr_res_key); + core_scsi3_put_pr_reg(pr_reg); + return PYX_TRANSPORT_RESERVATION_CONFLICT; + } + /* + * The service active reservation key needs to be non zero + */ + if (!(sa_res_key)) { + printk(KERN_WARNING "SPC-3 PR REGISTER_AND_MOVE: Received zero" + " sa_res_key\n"); + core_scsi3_put_pr_reg(pr_reg); + return PYX_TRANSPORT_INVALID_PARAMETER_LIST; + } + /* + * Determine the Relative Target Port Identifier where the reservation + * will be moved to for the TransportID containing SCSI initiator WWN + * information. + */ + rtpi = (buf[18] & 0xff) << 8; + rtpi |= buf[19] & 0xff; + tid_len = (buf[20] & 0xff) << 24; + tid_len |= (buf[21] & 0xff) << 16; + tid_len |= (buf[22] & 0xff) << 8; + tid_len |= buf[23] & 0xff; + + if ((tid_len + 24) != cmd->data_length) { + printk(KERN_ERR "SPC-3 PR: Illegal tid_len: %u + 24 byte header" + " does not equal CDB data_length: %u\n", tid_len, + cmd->data_length); + core_scsi3_put_pr_reg(pr_reg); + return PYX_TRANSPORT_INVALID_PARAMETER_LIST; + } + + spin_lock(&dev->se_port_lock); + list_for_each_entry(se_port, &dev->dev_sep_list, sep_list) { + if (se_port->sep_rtpi != rtpi) + continue; + dest_se_tpg = se_port->sep_tpg; + if (!(dest_se_tpg)) + continue; + dest_tf_ops = TPG_TFO(dest_se_tpg); + if (!(dest_tf_ops)) + continue; + + atomic_inc(&dest_se_tpg->tpg_pr_ref_count); + smp_mb__after_atomic_inc(); + spin_unlock(&dev->se_port_lock); + + ret = core_scsi3_tpg_depend_item(dest_se_tpg); + if (ret != 0) { + printk(KERN_ERR "core_scsi3_tpg_depend_item() failed" + " for dest_se_tpg\n"); + atomic_dec(&dest_se_tpg->tpg_pr_ref_count); + smp_mb__after_atomic_dec(); + core_scsi3_put_pr_reg(pr_reg); + return PYX_TRANSPORT_LU_COMM_FAILURE; + } + + spin_lock(&dev->se_port_lock); + break; + } + spin_unlock(&dev->se_port_lock); + + if (!(dest_se_tpg) || (!dest_tf_ops)) { + printk(KERN_ERR "SPC-3 PR REGISTER_AND_MOVE: Unable to locate" + " fabric ops from Relative Target Port Identifier:" + " %hu\n", rtpi); + core_scsi3_put_pr_reg(pr_reg); + return PYX_TRANSPORT_INVALID_PARAMETER_LIST; + } + proto_ident = (buf[24] & 0x0f); +#if 0 + printk("SPC-3 PR REGISTER_AND_MOVE: Extracted Protocol Identifier:" + " 0x%02x\n", proto_ident); +#endif + if (proto_ident != dest_tf_ops->get_fabric_proto_ident(dest_se_tpg)) { + printk(KERN_ERR "SPC-3 PR REGISTER_AND_MOVE: Received" + " proto_ident: 0x%02x does not match ident: 0x%02x" + " from fabric: %s\n", proto_ident, + dest_tf_ops->get_fabric_proto_ident(dest_se_tpg), + dest_tf_ops->get_fabric_name()); + ret = PYX_TRANSPORT_INVALID_PARAMETER_LIST; + goto out; + } + if (dest_tf_ops->tpg_parse_pr_out_transport_id == NULL) { + printk(KERN_ERR "SPC-3 PR REGISTER_AND_MOVE: Fabric does not" + " containg a valid tpg_parse_pr_out_transport_id" + " function pointer\n"); + ret = PYX_TRANSPORT_LU_COMM_FAILURE; + goto out; + } + initiator_str = dest_tf_ops->tpg_parse_pr_out_transport_id(dest_se_tpg, + (const char *)&buf[24], &tmp_tid_len, &iport_ptr); + if (!(initiator_str)) { + printk(KERN_ERR "SPC-3 PR REGISTER_AND_MOVE: Unable to locate" + " initiator_str from Transport ID\n"); + ret = PYX_TRANSPORT_INVALID_PARAMETER_LIST; + goto out; + } + + printk(KERN_INFO "SPC-3 PR [%s] Extracted initiator %s identifier: %s" + " %s\n", dest_tf_ops->get_fabric_name(), (iport_ptr != NULL) ? + "port" : "device", initiator_str, (iport_ptr != NULL) ? + iport_ptr : ""); + /* + * If a PERSISTENT RESERVE OUT command with a REGISTER AND MOVE service + * action specifies a TransportID that is the same as the initiator port + * of the I_T nexus for the command received, then the command shall + * be terminated with CHECK CONDITION status, with the sense key set to + * ILLEGAL REQUEST, and the additional sense code set to INVALID FIELD + * IN PARAMETER LIST. + */ + pr_reg_nacl = pr_reg->pr_reg_nacl; + matching_iname = (!strcmp(initiator_str, + pr_reg_nacl->initiatorname)) ? 1 : 0; + if (!(matching_iname)) + goto after_iport_check; + + if (!(iport_ptr) || !(pr_reg->isid_present_at_reg)) { + printk(KERN_ERR "SPC-3 PR REGISTER_AND_MOVE: TransportID: %s" + " matches: %s on received I_T Nexus\n", initiator_str, + pr_reg_nacl->initiatorname); + ret = PYX_TRANSPORT_INVALID_PARAMETER_LIST; + goto out; + } + if (!(strcmp(iport_ptr, pr_reg->pr_reg_isid))) { + printk(KERN_ERR "SPC-3 PR REGISTER_AND_MOVE: TransportID: %s %s" + " matches: %s %s on received I_T Nexus\n", + initiator_str, iport_ptr, pr_reg_nacl->initiatorname, + pr_reg->pr_reg_isid); + ret = PYX_TRANSPORT_INVALID_PARAMETER_LIST; + goto out; + } +after_iport_check: + /* + * Locate the destination struct se_node_acl from the received Transport ID + */ + spin_lock_bh(&dest_se_tpg->acl_node_lock); + dest_node_acl = __core_tpg_get_initiator_node_acl(dest_se_tpg, + initiator_str); + if (dest_node_acl) { + atomic_inc(&dest_node_acl->acl_pr_ref_count); + smp_mb__after_atomic_inc(); + } + spin_unlock_bh(&dest_se_tpg->acl_node_lock); + + if (!(dest_node_acl)) { + printk(KERN_ERR "Unable to locate %s dest_node_acl for" + " TransportID%s\n", dest_tf_ops->get_fabric_name(), + initiator_str); + ret = PYX_TRANSPORT_INVALID_PARAMETER_LIST; + goto out; + } + ret = core_scsi3_nodeacl_depend_item(dest_node_acl); + if (ret != 0) { + printk(KERN_ERR "core_scsi3_nodeacl_depend_item() for" + " dest_node_acl\n"); + atomic_dec(&dest_node_acl->acl_pr_ref_count); + smp_mb__after_atomic_dec(); + dest_node_acl = NULL; + ret = PYX_TRANSPORT_LU_COMM_FAILURE; + goto out; + } +#if 0 + printk(KERN_INFO "SPC-3 PR REGISTER_AND_MOVE: Found %s dest_node_acl:" + " %s from TransportID\n", dest_tf_ops->get_fabric_name(), + dest_node_acl->initiatorname); +#endif + /* + * Locate the struct se_dev_entry pointer for the matching RELATIVE TARGET + * PORT IDENTIFIER. + */ + dest_se_deve = core_get_se_deve_from_rtpi(dest_node_acl, rtpi); + if (!(dest_se_deve)) { + printk(KERN_ERR "Unable to locate %s dest_se_deve from RTPI:" + " %hu\n", dest_tf_ops->get_fabric_name(), rtpi); + ret = PYX_TRANSPORT_INVALID_PARAMETER_LIST; + goto out; + } + + ret = core_scsi3_lunacl_depend_item(dest_se_deve); + if (ret < 0) { + printk(KERN_ERR "core_scsi3_lunacl_depend_item() failed\n"); + atomic_dec(&dest_se_deve->pr_ref_count); + smp_mb__after_atomic_dec(); + dest_se_deve = NULL; + ret = PYX_TRANSPORT_LU_COMM_FAILURE; + goto out; + } +#if 0 + printk(KERN_INFO "SPC-3 PR REGISTER_AND_MOVE: Located %s node %s LUN" + " ACL for dest_se_deve->mapped_lun: %u\n", + dest_tf_ops->get_fabric_name(), dest_node_acl->initiatorname, + dest_se_deve->mapped_lun); +#endif + /* + * A persistent reservation needs to already existing in order to + * successfully complete the REGISTER_AND_MOVE service action.. + */ + spin_lock(&dev->dev_reservation_lock); + pr_res_holder = dev->dev_pr_res_holder; + if (!(pr_res_holder)) { + printk(KERN_WARNING "SPC-3 PR REGISTER_AND_MOVE: No reservation" + " currently held\n"); + spin_unlock(&dev->dev_reservation_lock); + ret = PYX_TRANSPORT_INVALID_CDB_FIELD; + goto out; + } + /* + * The received on I_T Nexus must be the reservation holder. + * + * From spc4r17 section 5.7.8 Table 50 -- + * Register behaviors for a REGISTER AND MOVE service action + */ + if (pr_res_holder != pr_reg) { + printk(KERN_WARNING "SPC-3 PR REGISTER_AND_MOVE: Calling I_T" + " Nexus is not reservation holder\n"); + spin_unlock(&dev->dev_reservation_lock); + ret = PYX_TRANSPORT_RESERVATION_CONFLICT; + goto out; + } + /* + * From spc4r17 section 5.7.8: registering and moving reservation + * + * If a PERSISTENT RESERVE OUT command with a REGISTER AND MOVE service + * action is received and the established persistent reservation is a + * Write Exclusive - All Registrants type or Exclusive Access - + * All Registrants type reservation, then the command shall be completed + * with RESERVATION CONFLICT status. + */ + if ((pr_res_holder->pr_res_type == PR_TYPE_WRITE_EXCLUSIVE_ALLREG) || + (pr_res_holder->pr_res_type == PR_TYPE_EXCLUSIVE_ACCESS_ALLREG)) { + printk(KERN_WARNING "SPC-3 PR REGISTER_AND_MOVE: Unable to move" + " reservation for type: %s\n", + core_scsi3_pr_dump_type(pr_res_holder->pr_res_type)); + spin_unlock(&dev->dev_reservation_lock); + ret = PYX_TRANSPORT_RESERVATION_CONFLICT; + goto out; + } + pr_res_nacl = pr_res_holder->pr_reg_nacl; + /* + * b) Ignore the contents of the (received) SCOPE and TYPE fields; + */ + type = pr_res_holder->pr_res_type; + scope = pr_res_holder->pr_res_type; + /* + * c) Associate the reservation key specified in the SERVICE ACTION + * RESERVATION KEY field with the I_T nexus specified as the + * destination of the register and move, where: + * A) The I_T nexus is specified by the TransportID and the + * RELATIVE TARGET PORT IDENTIFIER field (see 6.14.4); and + * B) Regardless of the TransportID format used, the association for + * the initiator port is based on either the initiator port name + * (see 3.1.71) on SCSI transport protocols where port names are + * required or the initiator port identifier (see 3.1.70) on SCSI + * transport protocols where port names are not required; + * d) Register the reservation key specified in the SERVICE ACTION + * RESERVATION KEY field; + * e) Retain the reservation key specified in the SERVICE ACTION + * RESERVATION KEY field and associated information; + * + * Also, It is not an error for a REGISTER AND MOVE service action to + * register an I_T nexus that is already registered with the same + * reservation key or a different reservation key. + */ + dest_pr_reg = __core_scsi3_locate_pr_reg(dev, dest_node_acl, + iport_ptr); + if (!(dest_pr_reg)) { + ret = core_scsi3_alloc_registration(SE_DEV(cmd), + dest_node_acl, dest_se_deve, iport_ptr, + sa_res_key, 0, aptpl, 2, 1); + if (ret != 0) { + spin_unlock(&dev->dev_reservation_lock); + ret = PYX_TRANSPORT_INVALID_PARAMETER_LIST; + goto out; + } + dest_pr_reg = __core_scsi3_locate_pr_reg(dev, dest_node_acl, + iport_ptr); + new_reg = 1; + } + /* + * f) Release the persistent reservation for the persistent reservation + * holder (i.e., the I_T nexus on which the + */ + __core_scsi3_complete_pro_release(dev, pr_res_nacl, + dev->dev_pr_res_holder, 0); + /* + * g) Move the persistent reservation to the specified I_T nexus using + * the same scope and type as the persistent reservation released in + * item f); and + */ + dev->dev_pr_res_holder = dest_pr_reg; + dest_pr_reg->pr_res_holder = 1; + dest_pr_reg->pr_res_type = type; + pr_reg->pr_res_scope = scope; + prf_isid = core_pr_dump_initiator_port(pr_reg, &i_buf[0], + PR_REG_ISID_ID_LEN); + /* + * Increment PRGeneration for existing registrations.. + */ + if (!(new_reg)) + dest_pr_reg->pr_res_generation = pr_tmpl->pr_generation++; + spin_unlock(&dev->dev_reservation_lock); + + printk(KERN_INFO "SPC-3 PR [%s] Service Action: REGISTER_AND_MOVE" + " created new reservation holder TYPE: %s on object RTPI:" + " %hu PRGeneration: 0x%08x\n", dest_tf_ops->get_fabric_name(), + core_scsi3_pr_dump_type(type), rtpi, + dest_pr_reg->pr_res_generation); + printk(KERN_INFO "SPC-3 PR Successfully moved reservation from" + " %s Fabric Node: %s%s -> %s Fabric Node: %s %s\n", + tf_ops->get_fabric_name(), pr_reg_nacl->initiatorname, + (prf_isid) ? &i_buf[0] : "", dest_tf_ops->get_fabric_name(), + dest_node_acl->initiatorname, (iport_ptr != NULL) ? + iport_ptr : ""); + /* + * It is now safe to release configfs group dependencies for destination + * of Transport ID Initiator Device/Port Identifier + */ + core_scsi3_lunacl_undepend_item(dest_se_deve); + core_scsi3_nodeacl_undepend_item(dest_node_acl); + core_scsi3_tpg_undepend_item(dest_se_tpg); + /* + * h) If the UNREG bit is set to one, unregister (see 5.7.11.3) the I_T + * nexus on which PERSISTENT RESERVE OUT command was received. + */ + if (unreg) { + spin_lock(&pr_tmpl->registration_lock); + __core_scsi3_free_registration(dev, pr_reg, NULL, 1); + spin_unlock(&pr_tmpl->registration_lock); + } else + core_scsi3_put_pr_reg(pr_reg); + + /* + * Clear the APTPL metadata if APTPL has been disabled, otherwise + * write out the updated metadata to struct file for this SCSI device. + */ + if (!(aptpl)) { + pr_tmpl->pr_aptpl_active = 0; + core_scsi3_update_and_write_aptpl(SE_DEV(cmd), NULL, 0); + printk("SPC-3 PR: Set APTPL Bit Deactivated for" + " REGISTER_AND_MOVE\n"); + } else { + pr_tmpl->pr_aptpl_active = 1; + ret = core_scsi3_update_and_write_aptpl(SE_DEV(cmd), + &dest_pr_reg->pr_aptpl_buf[0], + pr_tmpl->pr_aptpl_buf_len); + if (!(ret)) + printk("SPC-3 PR: Set APTPL Bit Activated for" + " REGISTER_AND_MOVE\n"); + } + + core_scsi3_put_pr_reg(dest_pr_reg); + return 0; +out: + if (dest_se_deve) + core_scsi3_lunacl_undepend_item(dest_se_deve); + if (dest_node_acl) + core_scsi3_nodeacl_undepend_item(dest_node_acl); + core_scsi3_tpg_undepend_item(dest_se_tpg); + core_scsi3_put_pr_reg(pr_reg); + return ret; +} + +static unsigned long long core_scsi3_extract_reservation_key(unsigned char *cdb) +{ + unsigned int __v1, __v2; + + __v1 = (cdb[0] << 24) | (cdb[1] << 16) | (cdb[2] << 8) | cdb[3]; + __v2 = (cdb[4] << 24) | (cdb[5] << 16) | (cdb[6] << 8) | cdb[7]; + + return ((unsigned long long)__v2) | (unsigned long long)__v1 << 32; +} + +/* + * See spc4r17 section 6.14 Table 170 + */ +static int core_scsi3_emulate_pr_out(struct se_cmd *cmd, unsigned char *cdb) +{ + unsigned char *buf = (unsigned char *)T_TASK(cmd)->t_task_buf; + u64 res_key, sa_res_key; + int sa, scope, type, aptpl; + int spec_i_pt = 0, all_tg_pt = 0, unreg = 0; + /* + * FIXME: A NULL struct se_session pointer means an this is not coming from + * a $FABRIC_MOD's nexus, but from internal passthrough ops. + */ + if (!(SE_SESS(cmd))) + return PYX_TRANSPORT_LU_COMM_FAILURE; + + if (cmd->data_length < 24) { + printk(KERN_WARNING "SPC-PR: Recieved PR OUT parameter list" + " length too small: %u\n", cmd->data_length); + return PYX_TRANSPORT_INVALID_PARAMETER_LIST; + } + /* + * From the PERSISTENT_RESERVE_OUT command descriptor block (CDB) + */ + sa = (cdb[1] & 0x1f); + scope = (cdb[2] & 0xf0); + type = (cdb[2] & 0x0f); + /* + * From PERSISTENT_RESERVE_OUT parameter list (payload) + */ + res_key = core_scsi3_extract_reservation_key(&buf[0]); + sa_res_key = core_scsi3_extract_reservation_key(&buf[8]); + /* + * REGISTER_AND_MOVE uses a different SA parameter list containing + * SCSI TransportIDs. + */ + if (sa != PRO_REGISTER_AND_MOVE) { + spec_i_pt = (buf[20] & 0x08); + all_tg_pt = (buf[20] & 0x04); + aptpl = (buf[20] & 0x01); + } else { + aptpl = (buf[17] & 0x01); + unreg = (buf[17] & 0x02); + } + /* + * SPEC_I_PT=1 is only valid for Service action: REGISTER + */ + if (spec_i_pt && ((cdb[1] & 0x1f) != PRO_REGISTER)) + return PYX_TRANSPORT_INVALID_PARAMETER_LIST; + /* + * From spc4r17 section 6.14: + * + * If the SPEC_I_PT bit is set to zero, the service action is not + * REGISTER AND MOVE, and the parameter list length is not 24, then + * the command shall be terminated with CHECK CONDITION status, with + * the sense key set to ILLEGAL REQUEST, and the additional sense + * code set to PARAMETER LIST LENGTH ERROR. + */ + if (!(spec_i_pt) && ((cdb[1] & 0x1f) != PRO_REGISTER_AND_MOVE) && + (cmd->data_length != 24)) { + printk(KERN_WARNING "SPC-PR: Recieved PR OUT illegal parameter" + " list length: %u\n", cmd->data_length); + return PYX_TRANSPORT_INVALID_PARAMETER_LIST; + } + /* + * (core_scsi3_emulate_pro_* function parameters + * are defined by spc4r17 Table 174: + * PERSISTENT_RESERVE_OUT service actions and valid parameters. + */ + switch (sa) { + case PRO_REGISTER: + return core_scsi3_emulate_pro_register(cmd, + res_key, sa_res_key, aptpl, all_tg_pt, spec_i_pt, 0); + case PRO_RESERVE: + return core_scsi3_emulate_pro_reserve(cmd, + type, scope, res_key); + case PRO_RELEASE: + return core_scsi3_emulate_pro_release(cmd, + type, scope, res_key); + case PRO_CLEAR: + return core_scsi3_emulate_pro_clear(cmd, res_key); + case PRO_PREEMPT: + return core_scsi3_emulate_pro_preempt(cmd, type, scope, + res_key, sa_res_key, 0); + case PRO_PREEMPT_AND_ABORT: + return core_scsi3_emulate_pro_preempt(cmd, type, scope, + res_key, sa_res_key, 1); + case PRO_REGISTER_AND_IGNORE_EXISTING_KEY: + return core_scsi3_emulate_pro_register(cmd, + 0, sa_res_key, aptpl, all_tg_pt, spec_i_pt, 1); + case PRO_REGISTER_AND_MOVE: + return core_scsi3_emulate_pro_register_and_move(cmd, res_key, + sa_res_key, aptpl, unreg); + default: + printk(KERN_ERR "Unknown PERSISTENT_RESERVE_OUT service" + " action: 0x%02x\n", cdb[1] & 0x1f); + return PYX_TRANSPORT_INVALID_CDB_FIELD; + } + + return PYX_TRANSPORT_INVALID_CDB_FIELD; +} + +/* + * PERSISTENT_RESERVE_IN Service Action READ_KEYS + * + * See spc4r17 section 5.7.6.2 and section 6.13.2, Table 160 + */ +static int core_scsi3_pri_read_keys(struct se_cmd *cmd) +{ + struct se_device *se_dev = SE_DEV(cmd); + struct se_subsystem_dev *su_dev = SU_DEV(se_dev); + struct t10_pr_registration *pr_reg; + unsigned char *buf = (unsigned char *)T_TASK(cmd)->t_task_buf; + u32 add_len = 0, off = 8; + + if (cmd->data_length < 8) { + printk(KERN_ERR "PRIN SA READ_KEYS SCSI Data Length: %u" + " too small\n", cmd->data_length); + return PYX_TRANSPORT_INVALID_CDB_FIELD; + } + + buf[0] = ((T10_RES(su_dev)->pr_generation >> 24) & 0xff); + buf[1] = ((T10_RES(su_dev)->pr_generation >> 16) & 0xff); + buf[2] = ((T10_RES(su_dev)->pr_generation >> 8) & 0xff); + buf[3] = (T10_RES(su_dev)->pr_generation & 0xff); + + spin_lock(&T10_RES(su_dev)->registration_lock); + list_for_each_entry(pr_reg, &T10_RES(su_dev)->registration_list, + pr_reg_list) { + /* + * Check for overflow of 8byte PRI READ_KEYS payload and + * next reservation key list descriptor. + */ + if ((add_len + 8) > (cmd->data_length - 8)) + break; + + buf[off++] = ((pr_reg->pr_res_key >> 56) & 0xff); + buf[off++] = ((pr_reg->pr_res_key >> 48) & 0xff); + buf[off++] = ((pr_reg->pr_res_key >> 40) & 0xff); + buf[off++] = ((pr_reg->pr_res_key >> 32) & 0xff); + buf[off++] = ((pr_reg->pr_res_key >> 24) & 0xff); + buf[off++] = ((pr_reg->pr_res_key >> 16) & 0xff); + buf[off++] = ((pr_reg->pr_res_key >> 8) & 0xff); + buf[off++] = (pr_reg->pr_res_key & 0xff); + + add_len += 8; + } + spin_unlock(&T10_RES(su_dev)->registration_lock); + + buf[4] = ((add_len >> 24) & 0xff); + buf[5] = ((add_len >> 16) & 0xff); + buf[6] = ((add_len >> 8) & 0xff); + buf[7] = (add_len & 0xff); + + return 0; +} + +/* + * PERSISTENT_RESERVE_IN Service Action READ_RESERVATION + * + * See spc4r17 section 5.7.6.3 and section 6.13.3.2 Table 161 and 162 + */ +static int core_scsi3_pri_read_reservation(struct se_cmd *cmd) +{ + struct se_device *se_dev = SE_DEV(cmd); + struct se_subsystem_dev *su_dev = SU_DEV(se_dev); + struct t10_pr_registration *pr_reg; + unsigned char *buf = (unsigned char *)T_TASK(cmd)->t_task_buf; + u64 pr_res_key; + u32 add_len = 16; /* Hardcoded to 16 when a reservation is held. */ + + if (cmd->data_length < 8) { + printk(KERN_ERR "PRIN SA READ_RESERVATIONS SCSI Data Length: %u" + " too small\n", cmd->data_length); + return PYX_TRANSPORT_INVALID_CDB_FIELD; + } + + buf[0] = ((T10_RES(su_dev)->pr_generation >> 24) & 0xff); + buf[1] = ((T10_RES(su_dev)->pr_generation >> 16) & 0xff); + buf[2] = ((T10_RES(su_dev)->pr_generation >> 8) & 0xff); + buf[3] = (T10_RES(su_dev)->pr_generation & 0xff); + + spin_lock(&se_dev->dev_reservation_lock); + pr_reg = se_dev->dev_pr_res_holder; + if ((pr_reg)) { + /* + * Set the hardcoded Additional Length + */ + buf[4] = ((add_len >> 24) & 0xff); + buf[5] = ((add_len >> 16) & 0xff); + buf[6] = ((add_len >> 8) & 0xff); + buf[7] = (add_len & 0xff); + + if (cmd->data_length < 22) { + spin_unlock(&se_dev->dev_reservation_lock); + return 0; + } + /* + * Set the Reservation key. + * + * From spc4r17, section 5.7.10: + * A persistent reservation holder has its reservation key + * returned in the parameter data from a PERSISTENT + * RESERVE IN command with READ RESERVATION service action as + * follows: + * a) For a persistent reservation of the type Write Exclusive + * - All Registrants or Exclusive Access ­ All Regitrants, + * the reservation key shall be set to zero; or + * b) For all other persistent reservation types, the + * reservation key shall be set to the registered + * reservation key for the I_T nexus that holds the + * persistent reservation. + */ + if ((pr_reg->pr_res_type == PR_TYPE_WRITE_EXCLUSIVE_ALLREG) || + (pr_reg->pr_res_type == PR_TYPE_EXCLUSIVE_ACCESS_ALLREG)) + pr_res_key = 0; + else + pr_res_key = pr_reg->pr_res_key; + + buf[8] = ((pr_res_key >> 56) & 0xff); + buf[9] = ((pr_res_key >> 48) & 0xff); + buf[10] = ((pr_res_key >> 40) & 0xff); + buf[11] = ((pr_res_key >> 32) & 0xff); + buf[12] = ((pr_res_key >> 24) & 0xff); + buf[13] = ((pr_res_key >> 16) & 0xff); + buf[14] = ((pr_res_key >> 8) & 0xff); + buf[15] = (pr_res_key & 0xff); + /* + * Set the SCOPE and TYPE + */ + buf[21] = (pr_reg->pr_res_scope & 0xf0) | + (pr_reg->pr_res_type & 0x0f); + } + spin_unlock(&se_dev->dev_reservation_lock); + + return 0; +} + +/* + * PERSISTENT_RESERVE_IN Service Action REPORT_CAPABILITIES + * + * See spc4r17 section 6.13.4 Table 165 + */ +static int core_scsi3_pri_report_capabilities(struct se_cmd *cmd) +{ + struct se_device *dev = SE_DEV(cmd); + struct t10_reservation_template *pr_tmpl = &SU_DEV(dev)->t10_reservation; + unsigned char *buf = (unsigned char *)T_TASK(cmd)->t_task_buf; + u16 add_len = 8; /* Hardcoded to 8. */ + + if (cmd->data_length < 6) { + printk(KERN_ERR "PRIN SA REPORT_CAPABILITIES SCSI Data Length:" + " %u too small\n", cmd->data_length); + return PYX_TRANSPORT_INVALID_CDB_FIELD; + } + + buf[0] = ((add_len << 8) & 0xff); + buf[1] = (add_len & 0xff); + buf[2] |= 0x10; /* CRH: Compatible Reservation Hanlding bit. */ + buf[2] |= 0x08; /* SIP_C: Specify Initiator Ports Capable bit */ + buf[2] |= 0x04; /* ATP_C: All Target Ports Capable bit */ + buf[2] |= 0x01; /* PTPL_C: Persistence across Target Power Loss bit */ + /* + * We are filling in the PERSISTENT RESERVATION TYPE MASK below, so + * set the TMV: Task Mask Valid bit. + */ + buf[3] |= 0x80; + /* + * Change ALLOW COMMANDs to 0x20 or 0x40 later from Table 166 + */ + buf[3] |= 0x10; /* ALLOW COMMANDs field 001b */ + /* + * PTPL_A: Persistence across Target Power Loss Active bit + */ + if (pr_tmpl->pr_aptpl_active) + buf[3] |= 0x01; + /* + * Setup the PERSISTENT RESERVATION TYPE MASK from Table 167 + */ + buf[4] |= 0x80; /* PR_TYPE_EXCLUSIVE_ACCESS_ALLREG */ + buf[4] |= 0x40; /* PR_TYPE_EXCLUSIVE_ACCESS_REGONLY */ + buf[4] |= 0x20; /* PR_TYPE_WRITE_EXCLUSIVE_REGONLY */ + buf[4] |= 0x08; /* PR_TYPE_EXCLUSIVE_ACCESS */ + buf[4] |= 0x02; /* PR_TYPE_WRITE_EXCLUSIVE */ + buf[5] |= 0x01; /* PR_TYPE_EXCLUSIVE_ACCESS_ALLREG */ + + return 0; +} + +/* + * PERSISTENT_RESERVE_IN Service Action READ_FULL_STATUS + * + * See spc4r17 section 6.13.5 Table 168 and 169 + */ +static int core_scsi3_pri_read_full_status(struct se_cmd *cmd) +{ + struct se_device *se_dev = SE_DEV(cmd); + struct se_node_acl *se_nacl; + struct se_subsystem_dev *su_dev = SU_DEV(se_dev); + struct se_portal_group *se_tpg; + struct t10_pr_registration *pr_reg, *pr_reg_tmp; + struct t10_reservation_template *pr_tmpl = &SU_DEV(se_dev)->t10_reservation; + unsigned char *buf = (unsigned char *)T_TASK(cmd)->t_task_buf; + u32 add_desc_len = 0, add_len = 0, desc_len, exp_desc_len; + u32 off = 8; /* off into first Full Status descriptor */ + int format_code = 0; + + if (cmd->data_length < 8) { + printk(KERN_ERR "PRIN SA READ_FULL_STATUS SCSI Data Length: %u" + " too small\n", cmd->data_length); + return PYX_TRANSPORT_INVALID_CDB_FIELD; + } + + buf[0] = ((T10_RES(su_dev)->pr_generation >> 24) & 0xff); + buf[1] = ((T10_RES(su_dev)->pr_generation >> 16) & 0xff); + buf[2] = ((T10_RES(su_dev)->pr_generation >> 8) & 0xff); + buf[3] = (T10_RES(su_dev)->pr_generation & 0xff); + + spin_lock(&pr_tmpl->registration_lock); + list_for_each_entry_safe(pr_reg, pr_reg_tmp, + &pr_tmpl->registration_list, pr_reg_list) { + + se_nacl = pr_reg->pr_reg_nacl; + se_tpg = pr_reg->pr_reg_nacl->se_tpg; + add_desc_len = 0; + + atomic_inc(&pr_reg->pr_res_holders); + smp_mb__after_atomic_inc(); + spin_unlock(&pr_tmpl->registration_lock); + /* + * Determine expected length of $FABRIC_MOD specific + * TransportID full status descriptor.. + */ + exp_desc_len = TPG_TFO(se_tpg)->tpg_get_pr_transport_id_len( + se_tpg, se_nacl, pr_reg, &format_code); + + if ((exp_desc_len + add_len) > cmd->data_length) { + printk(KERN_WARNING "SPC-3 PRIN READ_FULL_STATUS ran" + " out of buffer: %d\n", cmd->data_length); + spin_lock(&pr_tmpl->registration_lock); + atomic_dec(&pr_reg->pr_res_holders); + smp_mb__after_atomic_dec(); + break; + } + /* + * Set RESERVATION KEY + */ + buf[off++] = ((pr_reg->pr_res_key >> 56) & 0xff); + buf[off++] = ((pr_reg->pr_res_key >> 48) & 0xff); + buf[off++] = ((pr_reg->pr_res_key >> 40) & 0xff); + buf[off++] = ((pr_reg->pr_res_key >> 32) & 0xff); + buf[off++] = ((pr_reg->pr_res_key >> 24) & 0xff); + buf[off++] = ((pr_reg->pr_res_key >> 16) & 0xff); + buf[off++] = ((pr_reg->pr_res_key >> 8) & 0xff); + buf[off++] = (pr_reg->pr_res_key & 0xff); + off += 4; /* Skip Over Reserved area */ + + /* + * Set ALL_TG_PT bit if PROUT SA REGISTER had this set. + */ + if (pr_reg->pr_reg_all_tg_pt) + buf[off] = 0x02; + /* + * The struct se_lun pointer will be present for the + * reservation holder for PR_HOLDER bit. + * + * Also, if this registration is the reservation + * holder, fill in SCOPE and TYPE in the next byte. + */ + if (pr_reg->pr_res_holder) { + buf[off++] |= 0x01; + buf[off++] = (pr_reg->pr_res_scope & 0xf0) | + (pr_reg->pr_res_type & 0x0f); + } else + off += 2; + + off += 4; /* Skip over reserved area */ + /* + * From spc4r17 6.3.15: + * + * If the ALL_TG_PT bit set to zero, the RELATIVE TARGET PORT + * IDENTIFIER field contains the relative port identifier (see + * 3.1.120) of the target port that is part of the I_T nexus + * described by this full status descriptor. If the ALL_TG_PT + * bit is set to one, the contents of the RELATIVE TARGET PORT + * IDENTIFIER field are not defined by this standard. + */ + if (!(pr_reg->pr_reg_all_tg_pt)) { + struct se_port *port = pr_reg->pr_reg_tg_pt_lun->lun_sep; + + buf[off++] = ((port->sep_rtpi >> 8) & 0xff); + buf[off++] = (port->sep_rtpi & 0xff); + } else + off += 2; /* Skip over RELATIVE TARGET PORT IDENTIFER */ + + /* + * Now, have the $FABRIC_MOD fill in the protocol identifier + */ + desc_len = TPG_TFO(se_tpg)->tpg_get_pr_transport_id(se_tpg, + se_nacl, pr_reg, &format_code, &buf[off+4]); + + spin_lock(&pr_tmpl->registration_lock); + atomic_dec(&pr_reg->pr_res_holders); + smp_mb__after_atomic_dec(); + /* + * Set the ADDITIONAL DESCRIPTOR LENGTH + */ + buf[off++] = ((desc_len >> 24) & 0xff); + buf[off++] = ((desc_len >> 16) & 0xff); + buf[off++] = ((desc_len >> 8) & 0xff); + buf[off++] = (desc_len & 0xff); + /* + * Size of full desctipor header minus TransportID + * containing $FABRIC_MOD specific) initiator device/port + * WWN information. + * + * See spc4r17 Section 6.13.5 Table 169 + */ + add_desc_len = (24 + desc_len); + + off += desc_len; + add_len += add_desc_len; + } + spin_unlock(&pr_tmpl->registration_lock); + /* + * Set ADDITIONAL_LENGTH + */ + buf[4] = ((add_len >> 24) & 0xff); + buf[5] = ((add_len >> 16) & 0xff); + buf[6] = ((add_len >> 8) & 0xff); + buf[7] = (add_len & 0xff); + + return 0; +} + +static int core_scsi3_emulate_pr_in(struct se_cmd *cmd, unsigned char *cdb) +{ + switch (cdb[1] & 0x1f) { + case PRI_READ_KEYS: + return core_scsi3_pri_read_keys(cmd); + case PRI_READ_RESERVATION: + return core_scsi3_pri_read_reservation(cmd); + case PRI_REPORT_CAPABILITIES: + return core_scsi3_pri_report_capabilities(cmd); + case PRI_READ_FULL_STATUS: + return core_scsi3_pri_read_full_status(cmd); + default: + printk(KERN_ERR "Unknown PERSISTENT_RESERVE_IN service" + " action: 0x%02x\n", cdb[1] & 0x1f); + return PYX_TRANSPORT_INVALID_CDB_FIELD; + } + +} + +int core_scsi3_emulate_pr(struct se_cmd *cmd) +{ + unsigned char *cdb = &T_TASK(cmd)->t_task_cdb[0]; + struct se_device *dev = cmd->se_dev; + /* + * Following spc2r20 5.5.1 Reservations overview: + * + * If a logical unit has been reserved by any RESERVE command and is + * still reserved by any initiator, all PERSISTENT RESERVE IN and all + * PERSISTENT RESERVE OUT commands shall conflict regardless of + * initiator or service action and shall terminate with a RESERVATION + * CONFLICT status. + */ + if (dev->dev_flags & DF_SPC2_RESERVATIONS) { + printk(KERN_ERR "Received PERSISTENT_RESERVE CDB while legacy" + " SPC-2 reservation is held, returning" + " RESERVATION_CONFLICT\n"); + return PYX_TRANSPORT_RESERVATION_CONFLICT; + } + + return (cdb[0] == PERSISTENT_RESERVE_OUT) ? + core_scsi3_emulate_pr_out(cmd, cdb) : + core_scsi3_emulate_pr_in(cmd, cdb); +} + +static int core_pt_reservation_check(struct se_cmd *cmd, u32 *pr_res_type) +{ + return 0; +} + +static int core_pt_seq_non_holder( + struct se_cmd *cmd, + unsigned char *cdb, + u32 pr_reg_type) +{ + return 0; +} + +int core_setup_reservations(struct se_device *dev, int force_pt) +{ + struct se_subsystem_dev *su_dev = dev->se_sub_dev; + struct t10_reservation_template *rest = &su_dev->t10_reservation; + /* + * If this device is from Target_Core_Mod/pSCSI, use the reservations + * of the Underlying SCSI hardware. In Linux/SCSI terms, this can + * cause a problem because libata and some SATA RAID HBAs appear + * under Linux/SCSI, but to emulate reservations themselves. + */ + if (((TRANSPORT(dev)->transport_type == TRANSPORT_PLUGIN_PHBA_PDEV) && + !(DEV_ATTRIB(dev)->emulate_reservations)) || force_pt) { + rest->res_type = SPC_PASSTHROUGH; + rest->pr_ops.t10_reservation_check = &core_pt_reservation_check; + rest->pr_ops.t10_seq_non_holder = &core_pt_seq_non_holder; + printk(KERN_INFO "%s: Using SPC_PASSTHROUGH, no reservation" + " emulation\n", TRANSPORT(dev)->name); + return 0; + } + /* + * If SPC-3 or above is reported by real or emulated struct se_device, + * use emulated Persistent Reservations. + */ + if (TRANSPORT(dev)->get_device_rev(dev) >= SCSI_3) { + rest->res_type = SPC3_PERSISTENT_RESERVATIONS; + rest->pr_ops.t10_reservation_check = &core_scsi3_pr_reservation_check; + rest->pr_ops.t10_seq_non_holder = &core_scsi3_pr_seq_non_holder; + printk(KERN_INFO "%s: Using SPC3_PERSISTENT_RESERVATIONS" + " emulation\n", TRANSPORT(dev)->name); + } else { + rest->res_type = SPC2_RESERVATIONS; + rest->pr_ops.t10_reservation_check = &core_scsi2_reservation_check; + rest->pr_ops.t10_seq_non_holder = + &core_scsi2_reservation_seq_non_holder; + printk(KERN_INFO "%s: Using SPC2_RESERVATIONS emulation\n", + TRANSPORT(dev)->name); + } + + return 0; +} diff --git a/drivers/target/target_core_pr.h b/drivers/target/target_core_pr.h new file mode 100644 index 000000000000..5603bcfd86d3 --- /dev/null +++ b/drivers/target/target_core_pr.h @@ -0,0 +1,67 @@ +#ifndef TARGET_CORE_PR_H +#define TARGET_CORE_PR_H +/* + * PERSISTENT_RESERVE_OUT service action codes + * + * spc4r17 section 6.14.2 Table 171 + */ +#define PRO_REGISTER 0x00 +#define PRO_RESERVE 0x01 +#define PRO_RELEASE 0x02 +#define PRO_CLEAR 0x03 +#define PRO_PREEMPT 0x04 +#define PRO_PREEMPT_AND_ABORT 0x05 +#define PRO_REGISTER_AND_IGNORE_EXISTING_KEY 0x06 +#define PRO_REGISTER_AND_MOVE 0x07 +/* + * PERSISTENT_RESERVE_IN service action codes + * + * spc4r17 section 6.13.1 Table 159 + */ +#define PRI_READ_KEYS 0x00 +#define PRI_READ_RESERVATION 0x01 +#define PRI_REPORT_CAPABILITIES 0x02 +#define PRI_READ_FULL_STATUS 0x03 +/* + * PERSISTENT_RESERVE_ SCOPE field + * + * spc4r17 section 6.13.3.3 Table 163 + */ +#define PR_SCOPE_LU_SCOPE 0x00 +/* + * PERSISTENT_RESERVE_* TYPE field + * + * spc4r17 section 6.13.3.4 Table 164 + */ +#define PR_TYPE_WRITE_EXCLUSIVE 0x01 +#define PR_TYPE_EXCLUSIVE_ACCESS 0x03 +#define PR_TYPE_WRITE_EXCLUSIVE_REGONLY 0x05 +#define PR_TYPE_EXCLUSIVE_ACCESS_REGONLY 0x06 +#define PR_TYPE_WRITE_EXCLUSIVE_ALLREG 0x07 +#define PR_TYPE_EXCLUSIVE_ACCESS_ALLREG 0x08 + +#define PR_APTPL_MAX_IPORT_LEN 256 +#define PR_APTPL_MAX_TPORT_LEN 256 + +extern struct kmem_cache *t10_pr_reg_cache; + +extern int core_pr_dump_initiator_port(struct t10_pr_registration *, + char *, u32); +extern int core_scsi2_emulate_crh(struct se_cmd *); +extern int core_scsi3_alloc_aptpl_registration( + struct t10_reservation_template *, u64, + unsigned char *, unsigned char *, u32, + unsigned char *, u16, u32, int, int, u8); +extern int core_scsi3_check_aptpl_registration(struct se_device *, + struct se_portal_group *, struct se_lun *, + struct se_lun_acl *); +extern void core_scsi3_free_pr_reg_from_nacl(struct se_device *, + struct se_node_acl *); +extern void core_scsi3_free_all_registrations(struct se_device *); +extern unsigned char *core_scsi3_pr_dump_type(int); +extern int core_scsi3_check_cdb_abort_and_preempt(struct list_head *, + struct se_cmd *); +extern int core_scsi3_emulate_pr(struct se_cmd *); +extern int core_setup_reservations(struct se_device *, int); + +#endif /* TARGET_CORE_PR_H */ diff --git a/drivers/target/target_core_pscsi.c b/drivers/target/target_core_pscsi.c new file mode 100644 index 000000000000..742d24609a9b --- /dev/null +++ b/drivers/target/target_core_pscsi.c @@ -0,0 +1,1470 @@ +/******************************************************************************* + * Filename: target_core_pscsi.c + * + * This file contains the generic target mode <-> Linux SCSI subsystem plugin. + * + * Copyright (c) 2003, 2004, 2005 PyX Technologies, Inc. + * Copyright (c) 2005, 2006, 2007 SBE, Inc. + * Copyright (c) 2007-2010 Rising Tide Systems + * Copyright (c) 2008-2010 Linux-iSCSI.org + * + * Nicholas A. Bellinger + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. + * + ******************************************************************************/ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include /* For TASK_ATTR_* */ + +#include +#include +#include + +#include "target_core_pscsi.h" + +#define ISPRINT(a) ((a >= ' ') && (a <= '~')) + +static struct se_subsystem_api pscsi_template; + +static void pscsi_req_done(struct request *, int); + +/* pscsi_get_sh(): + * + * + */ +static struct Scsi_Host *pscsi_get_sh(u32 host_no) +{ + struct Scsi_Host *sh = NULL; + + sh = scsi_host_lookup(host_no); + if (IS_ERR(sh)) { + printk(KERN_ERR "Unable to locate SCSI HBA with Host ID:" + " %u\n", host_no); + return NULL; + } + + return sh; +} + +/* pscsi_attach_hba(): + * + * pscsi_get_sh() used scsi_host_lookup() to locate struct Scsi_Host. + * from the passed SCSI Host ID. + */ +static int pscsi_attach_hba(struct se_hba *hba, u32 host_id) +{ + int hba_depth; + struct pscsi_hba_virt *phv; + + phv = kzalloc(sizeof(struct pscsi_hba_virt), GFP_KERNEL); + if (!(phv)) { + printk(KERN_ERR "Unable to allocate struct pscsi_hba_virt\n"); + return -1; + } + phv->phv_host_id = host_id; + phv->phv_mode = PHV_VIRUTAL_HOST_ID; + hba_depth = PSCSI_VIRTUAL_HBA_DEPTH; + atomic_set(&hba->left_queue_depth, hba_depth); + atomic_set(&hba->max_queue_depth, hba_depth); + + hba->hba_ptr = (void *)phv; + + printk(KERN_INFO "CORE_HBA[%d] - TCM SCSI HBA Driver %s on" + " Generic Target Core Stack %s\n", hba->hba_id, + PSCSI_VERSION, TARGET_CORE_MOD_VERSION); + printk(KERN_INFO "CORE_HBA[%d] - Attached SCSI HBA to Generic" + " Target Core with TCQ Depth: %d\n", hba->hba_id, + atomic_read(&hba->max_queue_depth)); + + return 0; +} + +static void pscsi_detach_hba(struct se_hba *hba) +{ + struct pscsi_hba_virt *phv = hba->hba_ptr; + struct Scsi_Host *scsi_host = phv->phv_lld_host; + + if (scsi_host) { + scsi_host_put(scsi_host); + + printk(KERN_INFO "CORE_HBA[%d] - Detached SCSI HBA: %s from" + " Generic Target Core\n", hba->hba_id, + (scsi_host->hostt->name) ? (scsi_host->hostt->name) : + "Unknown"); + } else + printk(KERN_INFO "CORE_HBA[%d] - Detached Virtual SCSI HBA" + " from Generic Target Core\n", hba->hba_id); + + kfree(phv); + hba->hba_ptr = NULL; +} + +static int pscsi_pmode_enable_hba(struct se_hba *hba, unsigned long mode_flag) +{ + struct pscsi_hba_virt *phv = (struct pscsi_hba_virt *)hba->hba_ptr; + struct Scsi_Host *sh = phv->phv_lld_host; + int hba_depth = PSCSI_VIRTUAL_HBA_DEPTH; + /* + * Release the struct Scsi_Host + */ + if (!(mode_flag)) { + if (!(sh)) + return 0; + + phv->phv_lld_host = NULL; + phv->phv_mode = PHV_VIRUTAL_HOST_ID; + atomic_set(&hba->left_queue_depth, hba_depth); + atomic_set(&hba->max_queue_depth, hba_depth); + + printk(KERN_INFO "CORE_HBA[%d] - Disabled pSCSI HBA Passthrough" + " %s\n", hba->hba_id, (sh->hostt->name) ? + (sh->hostt->name) : "Unknown"); + + scsi_host_put(sh); + return 0; + } + /* + * Otherwise, locate struct Scsi_Host from the original passed + * pSCSI Host ID and enable for phba mode + */ + sh = pscsi_get_sh(phv->phv_host_id); + if (!(sh)) { + printk(KERN_ERR "pSCSI: Unable to locate SCSI Host for" + " phv_host_id: %d\n", phv->phv_host_id); + return -1; + } + /* + * Usually the SCSI LLD will use the hostt->can_queue value to define + * its HBA TCQ depth. Some other drivers (like 2.6 megaraid) don't set + * this at all and set sh->can_queue at runtime. + */ + hba_depth = (sh->hostt->can_queue > sh->can_queue) ? + sh->hostt->can_queue : sh->can_queue; + + atomic_set(&hba->left_queue_depth, hba_depth); + atomic_set(&hba->max_queue_depth, hba_depth); + + phv->phv_lld_host = sh; + phv->phv_mode = PHV_LLD_SCSI_HOST_NO; + + printk(KERN_INFO "CORE_HBA[%d] - Enabled pSCSI HBA Passthrough %s\n", + hba->hba_id, (sh->hostt->name) ? (sh->hostt->name) : "Unknown"); + + return 1; +} + +static void pscsi_tape_read_blocksize(struct se_device *dev, + struct scsi_device *sdev) +{ + unsigned char cdb[MAX_COMMAND_SIZE], *buf; + int ret; + + buf = kzalloc(12, GFP_KERNEL); + if (!buf) + return; + + memset(cdb, 0, MAX_COMMAND_SIZE); + cdb[0] = MODE_SENSE; + cdb[4] = 0x0c; /* 12 bytes */ + + ret = scsi_execute_req(sdev, cdb, DMA_FROM_DEVICE, buf, 12, NULL, + HZ, 1, NULL); + if (ret) + goto out_free; + + /* + * If MODE_SENSE still returns zero, set the default value to 1024. + */ + sdev->sector_size = (buf[9] << 16) | (buf[10] << 8) | (buf[11]); + if (!sdev->sector_size) + sdev->sector_size = 1024; +out_free: + kfree(buf); +} + +static void +pscsi_set_inquiry_info(struct scsi_device *sdev, struct t10_wwn *wwn) +{ + unsigned char *buf; + + if (sdev->inquiry_len < INQUIRY_LEN) + return; + + buf = sdev->inquiry; + if (!buf) + return; + /* + * Use sdev->inquiry from drivers/scsi/scsi_scan.c:scsi_alloc_sdev() + */ + memcpy(&wwn->vendor[0], &buf[8], sizeof(wwn->vendor)); + memcpy(&wwn->model[0], &buf[16], sizeof(wwn->model)); + memcpy(&wwn->revision[0], &buf[32], sizeof(wwn->revision)); +} + +static int +pscsi_get_inquiry_vpd_serial(struct scsi_device *sdev, struct t10_wwn *wwn) +{ + unsigned char cdb[MAX_COMMAND_SIZE], *buf; + int ret; + + buf = kzalloc(INQUIRY_VPD_SERIAL_LEN, GFP_KERNEL); + if (!buf) + return -1; + + memset(cdb, 0, MAX_COMMAND_SIZE); + cdb[0] = INQUIRY; + cdb[1] = 0x01; /* Query VPD */ + cdb[2] = 0x80; /* Unit Serial Number */ + cdb[3] = (INQUIRY_VPD_SERIAL_LEN >> 8) & 0xff; + cdb[4] = (INQUIRY_VPD_SERIAL_LEN & 0xff); + + ret = scsi_execute_req(sdev, cdb, DMA_FROM_DEVICE, buf, + INQUIRY_VPD_SERIAL_LEN, NULL, HZ, 1, NULL); + if (ret) + goto out_free; + + snprintf(&wwn->unit_serial[0], INQUIRY_VPD_SERIAL_LEN, "%s", &buf[4]); + + wwn->t10_sub_dev->su_dev_flags |= SDF_FIRMWARE_VPD_UNIT_SERIAL; + + kfree(buf); + return 0; + +out_free: + kfree(buf); + return -1; +} + +static void +pscsi_get_inquiry_vpd_device_ident(struct scsi_device *sdev, + struct t10_wwn *wwn) +{ + unsigned char cdb[MAX_COMMAND_SIZE], *buf, *page_83; + int ident_len, page_len, off = 4, ret; + struct t10_vpd *vpd; + + buf = kzalloc(INQUIRY_VPD_SERIAL_LEN, GFP_KERNEL); + if (!buf) + return; + + memset(cdb, 0, MAX_COMMAND_SIZE); + cdb[0] = INQUIRY; + cdb[1] = 0x01; /* Query VPD */ + cdb[2] = 0x83; /* Device Identifier */ + cdb[3] = (INQUIRY_VPD_DEVICE_IDENTIFIER_LEN >> 8) & 0xff; + cdb[4] = (INQUIRY_VPD_DEVICE_IDENTIFIER_LEN & 0xff); + + ret = scsi_execute_req(sdev, cdb, DMA_FROM_DEVICE, buf, + INQUIRY_VPD_DEVICE_IDENTIFIER_LEN, + NULL, HZ, 1, NULL); + if (ret) + goto out; + + page_len = (buf[2] << 8) | buf[3]; + while (page_len > 0) { + /* Grab a pointer to the Identification descriptor */ + page_83 = &buf[off]; + ident_len = page_83[3]; + if (!ident_len) { + printk(KERN_ERR "page_83[3]: identifier" + " length zero!\n"); + break; + } + printk(KERN_INFO "T10 VPD Identifer Length: %d\n", ident_len); + + vpd = kzalloc(sizeof(struct t10_vpd), GFP_KERNEL); + if (!vpd) { + printk(KERN_ERR "Unable to allocate memory for" + " struct t10_vpd\n"); + goto out; + } + INIT_LIST_HEAD(&vpd->vpd_list); + + transport_set_vpd_proto_id(vpd, page_83); + transport_set_vpd_assoc(vpd, page_83); + + if (transport_set_vpd_ident_type(vpd, page_83) < 0) { + off += (ident_len + 4); + page_len -= (ident_len + 4); + kfree(vpd); + continue; + } + if (transport_set_vpd_ident(vpd, page_83) < 0) { + off += (ident_len + 4); + page_len -= (ident_len + 4); + kfree(vpd); + continue; + } + + list_add_tail(&vpd->vpd_list, &wwn->t10_vpd_list); + off += (ident_len + 4); + page_len -= (ident_len + 4); + } + +out: + kfree(buf); +} + +/* pscsi_add_device_to_list(): + * + * + */ +static struct se_device *pscsi_add_device_to_list( + struct se_hba *hba, + struct se_subsystem_dev *se_dev, + struct pscsi_dev_virt *pdv, + struct scsi_device *sd, + int dev_flags) +{ + struct se_device *dev; + struct se_dev_limits dev_limits; + struct request_queue *q; + struct queue_limits *limits; + + memset(&dev_limits, 0, sizeof(struct se_dev_limits)); + + if (!sd->queue_depth) { + sd->queue_depth = PSCSI_DEFAULT_QUEUEDEPTH; + + printk(KERN_ERR "Set broken SCSI Device %d:%d:%d" + " queue_depth to %d\n", sd->channel, sd->id, + sd->lun, sd->queue_depth); + } + /* + * Setup the local scope queue_limits from struct request_queue->limits + * to pass into transport_add_device_to_core_hba() as struct se_dev_limits. + */ + q = sd->request_queue; + limits = &dev_limits.limits; + limits->logical_block_size = sd->sector_size; + limits->max_hw_sectors = (sd->host->max_sectors > queue_max_hw_sectors(q)) ? + queue_max_hw_sectors(q) : sd->host->max_sectors; + limits->max_sectors = (sd->host->max_sectors > queue_max_sectors(q)) ? + queue_max_sectors(q) : sd->host->max_sectors; + dev_limits.hw_queue_depth = sd->queue_depth; + dev_limits.queue_depth = sd->queue_depth; + /* + * Setup our standard INQUIRY info into se_dev->t10_wwn + */ + pscsi_set_inquiry_info(sd, &se_dev->t10_wwn); + + /* + * Set the pointer pdv->pdv_sd to from passed struct scsi_device, + * which has already been referenced with Linux SCSI code with + * scsi_device_get() in this file's pscsi_create_virtdevice(). + * + * The passthrough operations called by the transport_add_device_* + * function below will require this pointer to be set for passthroug + * ops. + * + * For the shutdown case in pscsi_free_device(), this struct + * scsi_device reference is released with Linux SCSI code + * scsi_device_put() and the pdv->pdv_sd cleared. + */ + pdv->pdv_sd = sd; + + dev = transport_add_device_to_core_hba(hba, &pscsi_template, + se_dev, dev_flags, (void *)pdv, + &dev_limits, NULL, NULL); + if (!(dev)) { + pdv->pdv_sd = NULL; + return NULL; + } + + /* + * Locate VPD WWN Information used for various purposes within + * the Storage Engine. + */ + if (!pscsi_get_inquiry_vpd_serial(sd, &se_dev->t10_wwn)) { + /* + * If VPD Unit Serial returned GOOD status, try + * VPD Device Identification page (0x83). + */ + pscsi_get_inquiry_vpd_device_ident(sd, &se_dev->t10_wwn); + } + + /* + * For TYPE_TAPE, attempt to determine blocksize with MODE_SENSE. + */ + if (sd->type == TYPE_TAPE) + pscsi_tape_read_blocksize(dev, sd); + return dev; +} + +static void *pscsi_allocate_virtdevice(struct se_hba *hba, const char *name) +{ + struct pscsi_dev_virt *pdv; + + pdv = kzalloc(sizeof(struct pscsi_dev_virt), GFP_KERNEL); + if (!(pdv)) { + printk(KERN_ERR "Unable to allocate memory for struct pscsi_dev_virt\n"); + return NULL; + } + pdv->pdv_se_hba = hba; + + printk(KERN_INFO "PSCSI: Allocated pdv: %p for %s\n", pdv, name); + return (void *)pdv; +} + +/* + * Called with struct Scsi_Host->host_lock called. + */ +static struct se_device *pscsi_create_type_disk( + struct scsi_device *sd, + struct pscsi_dev_virt *pdv, + struct se_subsystem_dev *se_dev, + struct se_hba *hba) +{ + struct se_device *dev; + struct pscsi_hba_virt *phv = (struct pscsi_hba_virt *)pdv->pdv_se_hba->hba_ptr; + struct Scsi_Host *sh = sd->host; + struct block_device *bd; + u32 dev_flags = 0; + + if (scsi_device_get(sd)) { + printk(KERN_ERR "scsi_device_get() failed for %d:%d:%d:%d\n", + sh->host_no, sd->channel, sd->id, sd->lun); + spin_unlock_irq(sh->host_lock); + return NULL; + } + spin_unlock_irq(sh->host_lock); + /* + * Claim exclusive struct block_device access to struct scsi_device + * for TYPE_DISK using supplied udev_path + */ + bd = blkdev_get_by_path(se_dev->se_dev_udev_path, + FMODE_WRITE|FMODE_READ|FMODE_EXCL, pdv); + if (!(bd)) { + printk("pSCSI: blkdev_get_by_path() failed\n"); + scsi_device_put(sd); + return NULL; + } + pdv->pdv_bd = bd; + + dev = pscsi_add_device_to_list(hba, se_dev, pdv, sd, dev_flags); + if (!(dev)) { + blkdev_put(pdv->pdv_bd, FMODE_WRITE|FMODE_READ|FMODE_EXCL); + scsi_device_put(sd); + return NULL; + } + printk(KERN_INFO "CORE_PSCSI[%d] - Added TYPE_DISK for %d:%d:%d:%d\n", + phv->phv_host_id, sh->host_no, sd->channel, sd->id, sd->lun); + + return dev; +} + +/* + * Called with struct Scsi_Host->host_lock called. + */ +static struct se_device *pscsi_create_type_rom( + struct scsi_device *sd, + struct pscsi_dev_virt *pdv, + struct se_subsystem_dev *se_dev, + struct se_hba *hba) +{ + struct se_device *dev; + struct pscsi_hba_virt *phv = (struct pscsi_hba_virt *)pdv->pdv_se_hba->hba_ptr; + struct Scsi_Host *sh = sd->host; + u32 dev_flags = 0; + + if (scsi_device_get(sd)) { + printk(KERN_ERR "scsi_device_get() failed for %d:%d:%d:%d\n", + sh->host_no, sd->channel, sd->id, sd->lun); + spin_unlock_irq(sh->host_lock); + return NULL; + } + spin_unlock_irq(sh->host_lock); + + dev = pscsi_add_device_to_list(hba, se_dev, pdv, sd, dev_flags); + if (!(dev)) { + scsi_device_put(sd); + return NULL; + } + printk(KERN_INFO "CORE_PSCSI[%d] - Added Type: %s for %d:%d:%d:%d\n", + phv->phv_host_id, scsi_device_type(sd->type), sh->host_no, + sd->channel, sd->id, sd->lun); + + return dev; +} + +/* + *Called with struct Scsi_Host->host_lock called. + */ +static struct se_device *pscsi_create_type_other( + struct scsi_device *sd, + struct pscsi_dev_virt *pdv, + struct se_subsystem_dev *se_dev, + struct se_hba *hba) +{ + struct se_device *dev; + struct pscsi_hba_virt *phv = (struct pscsi_hba_virt *)pdv->pdv_se_hba->hba_ptr; + struct Scsi_Host *sh = sd->host; + u32 dev_flags = 0; + + spin_unlock_irq(sh->host_lock); + dev = pscsi_add_device_to_list(hba, se_dev, pdv, sd, dev_flags); + if (!(dev)) + return NULL; + + printk(KERN_INFO "CORE_PSCSI[%d] - Added Type: %s for %d:%d:%d:%d\n", + phv->phv_host_id, scsi_device_type(sd->type), sh->host_no, + sd->channel, sd->id, sd->lun); + + return dev; +} + +static struct se_device *pscsi_create_virtdevice( + struct se_hba *hba, + struct se_subsystem_dev *se_dev, + void *p) +{ + struct pscsi_dev_virt *pdv = (struct pscsi_dev_virt *)p; + struct se_device *dev; + struct scsi_device *sd; + struct pscsi_hba_virt *phv = (struct pscsi_hba_virt *)hba->hba_ptr; + struct Scsi_Host *sh = phv->phv_lld_host; + int legacy_mode_enable = 0; + + if (!(pdv)) { + printk(KERN_ERR "Unable to locate struct pscsi_dev_virt" + " parameter\n"); + return NULL; + } + /* + * If not running in PHV_LLD_SCSI_HOST_NO mode, locate the + * struct Scsi_Host we will need to bring the TCM/pSCSI object online + */ + if (!(sh)) { + if (phv->phv_mode == PHV_LLD_SCSI_HOST_NO) { + printk(KERN_ERR "pSCSI: Unable to locate struct" + " Scsi_Host for PHV_LLD_SCSI_HOST_NO\n"); + return NULL; + } + /* + * For the newer PHV_VIRUTAL_HOST_ID struct scsi_device + * reference, we enforce that udev_path has been set + */ + if (!(se_dev->su_dev_flags & SDF_USING_UDEV_PATH)) { + printk(KERN_ERR "pSCSI: udev_path attribute has not" + " been set before ENABLE=1\n"); + return NULL; + } + /* + * If no scsi_host_id= was passed for PHV_VIRUTAL_HOST_ID, + * use the original TCM hba ID to reference Linux/SCSI Host No + * and enable for PHV_LLD_SCSI_HOST_NO mode. + */ + if (!(pdv->pdv_flags & PDF_HAS_VIRT_HOST_ID)) { + spin_lock(&hba->device_lock); + if (!(list_empty(&hba->hba_dev_list))) { + printk(KERN_ERR "pSCSI: Unable to set hba_mode" + " with active devices\n"); + spin_unlock(&hba->device_lock); + return NULL; + } + spin_unlock(&hba->device_lock); + + if (pscsi_pmode_enable_hba(hba, 1) != 1) + return NULL; + + legacy_mode_enable = 1; + hba->hba_flags |= HBA_FLAGS_PSCSI_MODE; + sh = phv->phv_lld_host; + } else { + sh = pscsi_get_sh(pdv->pdv_host_id); + if (!(sh)) { + printk(KERN_ERR "pSCSI: Unable to locate" + " pdv_host_id: %d\n", pdv->pdv_host_id); + return NULL; + } + } + } else { + if (phv->phv_mode == PHV_VIRUTAL_HOST_ID) { + printk(KERN_ERR "pSCSI: PHV_VIRUTAL_HOST_ID set while" + " struct Scsi_Host exists\n"); + return NULL; + } + } + + spin_lock_irq(sh->host_lock); + list_for_each_entry(sd, &sh->__devices, siblings) { + if ((pdv->pdv_channel_id != sd->channel) || + (pdv->pdv_target_id != sd->id) || + (pdv->pdv_lun_id != sd->lun)) + continue; + /* + * Functions will release the held struct scsi_host->host_lock + * before calling calling pscsi_add_device_to_list() to register + * struct scsi_device with target_core_mod. + */ + switch (sd->type) { + case TYPE_DISK: + dev = pscsi_create_type_disk(sd, pdv, se_dev, hba); + break; + case TYPE_ROM: + dev = pscsi_create_type_rom(sd, pdv, se_dev, hba); + break; + default: + dev = pscsi_create_type_other(sd, pdv, se_dev, hba); + break; + } + + if (!(dev)) { + if (phv->phv_mode == PHV_VIRUTAL_HOST_ID) + scsi_host_put(sh); + else if (legacy_mode_enable) { + pscsi_pmode_enable_hba(hba, 0); + hba->hba_flags &= ~HBA_FLAGS_PSCSI_MODE; + } + pdv->pdv_sd = NULL; + return NULL; + } + return dev; + } + spin_unlock_irq(sh->host_lock); + + printk(KERN_ERR "pSCSI: Unable to locate %d:%d:%d:%d\n", sh->host_no, + pdv->pdv_channel_id, pdv->pdv_target_id, pdv->pdv_lun_id); + + if (phv->phv_mode == PHV_VIRUTAL_HOST_ID) + scsi_host_put(sh); + else if (legacy_mode_enable) { + pscsi_pmode_enable_hba(hba, 0); + hba->hba_flags &= ~HBA_FLAGS_PSCSI_MODE; + } + + return NULL; +} + +/* pscsi_free_device(): (Part of se_subsystem_api_t template) + * + * + */ +static void pscsi_free_device(void *p) +{ + struct pscsi_dev_virt *pdv = p; + struct pscsi_hba_virt *phv = pdv->pdv_se_hba->hba_ptr; + struct scsi_device *sd = pdv->pdv_sd; + + if (sd) { + /* + * Release exclusive pSCSI internal struct block_device claim for + * struct scsi_device with TYPE_DISK from pscsi_create_type_disk() + */ + if ((sd->type == TYPE_DISK) && pdv->pdv_bd) { + blkdev_put(pdv->pdv_bd, + FMODE_WRITE|FMODE_READ|FMODE_EXCL); + pdv->pdv_bd = NULL; + } + /* + * For HBA mode PHV_LLD_SCSI_HOST_NO, release the reference + * to struct Scsi_Host now. + */ + if ((phv->phv_mode == PHV_LLD_SCSI_HOST_NO) && + (phv->phv_lld_host != NULL)) + scsi_host_put(phv->phv_lld_host); + + if ((sd->type == TYPE_DISK) || (sd->type == TYPE_ROM)) + scsi_device_put(sd); + + pdv->pdv_sd = NULL; + } + + kfree(pdv); +} + +static inline struct pscsi_plugin_task *PSCSI_TASK(struct se_task *task) +{ + return container_of(task, struct pscsi_plugin_task, pscsi_task); +} + + +/* pscsi_transport_complete(): + * + * + */ +static int pscsi_transport_complete(struct se_task *task) +{ + struct pscsi_dev_virt *pdv = task->se_dev->dev_ptr; + struct scsi_device *sd = pdv->pdv_sd; + int result; + struct pscsi_plugin_task *pt = PSCSI_TASK(task); + unsigned char *cdb = &pt->pscsi_cdb[0]; + + result = pt->pscsi_result; + /* + * Hack to make sure that Write-Protect modepage is set if R/O mode is + * forced. + */ + if (((cdb[0] == MODE_SENSE) || (cdb[0] == MODE_SENSE_10)) && + (status_byte(result) << 1) == SAM_STAT_GOOD) { + if (!TASK_CMD(task)->se_deve) + goto after_mode_sense; + + if (TASK_CMD(task)->se_deve->lun_flags & + TRANSPORT_LUNFLAGS_READ_ONLY) { + unsigned char *buf = (unsigned char *) + T_TASK(task->task_se_cmd)->t_task_buf; + + if (cdb[0] == MODE_SENSE_10) { + if (!(buf[3] & 0x80)) + buf[3] |= 0x80; + } else { + if (!(buf[2] & 0x80)) + buf[2] |= 0x80; + } + } + } +after_mode_sense: + + if (sd->type != TYPE_TAPE) + goto after_mode_select; + + /* + * Hack to correctly obtain the initiator requested blocksize for + * TYPE_TAPE. Since this value is dependent upon each tape media, + * struct scsi_device->sector_size will not contain the correct value + * by default, so we go ahead and set it so + * TRANSPORT(dev)->get_blockdev() returns the correct value to the + * storage engine. + */ + if (((cdb[0] == MODE_SELECT) || (cdb[0] == MODE_SELECT_10)) && + (status_byte(result) << 1) == SAM_STAT_GOOD) { + unsigned char *buf; + struct scatterlist *sg = task->task_sg; + u16 bdl; + u32 blocksize; + + buf = sg_virt(&sg[0]); + if (!(buf)) { + printk(KERN_ERR "Unable to get buf for scatterlist\n"); + goto after_mode_select; + } + + if (cdb[0] == MODE_SELECT) + bdl = (buf[3]); + else + bdl = (buf[6] << 8) | (buf[7]); + + if (!bdl) + goto after_mode_select; + + if (cdb[0] == MODE_SELECT) + blocksize = (buf[9] << 16) | (buf[10] << 8) | + (buf[11]); + else + blocksize = (buf[13] << 16) | (buf[14] << 8) | + (buf[15]); + + sd->sector_size = blocksize; + } +after_mode_select: + + if (status_byte(result) & CHECK_CONDITION) + return 1; + + return 0; +} + +static struct se_task * +pscsi_alloc_task(struct se_cmd *cmd) +{ + struct pscsi_plugin_task *pt; + unsigned char *cdb = T_TASK(cmd)->t_task_cdb; + + pt = kzalloc(sizeof(struct pscsi_plugin_task), GFP_KERNEL); + if (!pt) { + printk(KERN_ERR "Unable to allocate struct pscsi_plugin_task\n"); + return NULL; + } + + /* + * If TCM Core is signaling a > TCM_MAX_COMMAND_SIZE allocation, + * allocate the extended CDB buffer for per struct se_task context + * pt->pscsi_cdb now. + */ + if (T_TASK(cmd)->t_task_cdb != T_TASK(cmd)->__t_task_cdb) { + + pt->pscsi_cdb = kzalloc(scsi_command_size(cdb), GFP_KERNEL); + if (!(pt->pscsi_cdb)) { + printk(KERN_ERR "pSCSI: Unable to allocate extended" + " pt->pscsi_cdb\n"); + return NULL; + } + } else + pt->pscsi_cdb = &pt->__pscsi_cdb[0]; + + return &pt->pscsi_task; +} + +static inline void pscsi_blk_init_request( + struct se_task *task, + struct pscsi_plugin_task *pt, + struct request *req, + int bidi_read) +{ + /* + * Defined as "scsi command" in include/linux/blkdev.h. + */ + req->cmd_type = REQ_TYPE_BLOCK_PC; + /* + * For the extra BIDI-COMMAND READ struct request we do not + * need to setup the remaining structure members + */ + if (bidi_read) + return; + /* + * Setup the done function pointer for struct request, + * also set the end_io_data pointer.to struct se_task. + */ + req->end_io = pscsi_req_done; + req->end_io_data = (void *)task; + /* + * Load the referenced struct se_task's SCSI CDB into + * include/linux/blkdev.h:struct request->cmd + */ + req->cmd_len = scsi_command_size(pt->pscsi_cdb); + req->cmd = &pt->pscsi_cdb[0]; + /* + * Setup pointer for outgoing sense data. + */ + req->sense = (void *)&pt->pscsi_sense[0]; + req->sense_len = 0; +} + +/* + * Used for pSCSI data payloads for all *NON* SCF_SCSI_DATA_SG_IO_CDB +*/ +static int pscsi_blk_get_request(struct se_task *task) +{ + struct pscsi_plugin_task *pt = PSCSI_TASK(task); + struct pscsi_dev_virt *pdv = task->se_dev->dev_ptr; + + pt->pscsi_req = blk_get_request(pdv->pdv_sd->request_queue, + (task->task_data_direction == DMA_TO_DEVICE), + GFP_KERNEL); + if (!(pt->pscsi_req) || IS_ERR(pt->pscsi_req)) { + printk(KERN_ERR "PSCSI: blk_get_request() failed: %ld\n", + IS_ERR(pt->pscsi_req)); + return PYX_TRANSPORT_LU_COMM_FAILURE; + } + /* + * Setup the newly allocated struct request for REQ_TYPE_BLOCK_PC, + * and setup rq callback, CDB and sense. + */ + pscsi_blk_init_request(task, pt, pt->pscsi_req, 0); + return 0; +} + +/* pscsi_do_task(): (Part of se_subsystem_api_t template) + * + * + */ +static int pscsi_do_task(struct se_task *task) +{ + struct pscsi_plugin_task *pt = PSCSI_TASK(task); + struct pscsi_dev_virt *pdv = task->se_dev->dev_ptr; + /* + * Set the struct request->timeout value based on peripheral + * device type from SCSI. + */ + if (pdv->pdv_sd->type == TYPE_DISK) + pt->pscsi_req->timeout = PS_TIMEOUT_DISK; + else + pt->pscsi_req->timeout = PS_TIMEOUT_OTHER; + + pt->pscsi_req->retries = PS_RETRY; + /* + * Queue the struct request into the struct scsi_device->request_queue. + * Also check for HEAD_OF_QUEUE SAM TASK attr from received se_cmd + * descriptor + */ + blk_execute_rq_nowait(pdv->pdv_sd->request_queue, NULL, pt->pscsi_req, + (task->task_se_cmd->sam_task_attr == TASK_ATTR_HOQ), + pscsi_req_done); + + return PYX_TRANSPORT_SENT_TO_TRANSPORT; +} + +static void pscsi_free_task(struct se_task *task) +{ + struct pscsi_plugin_task *pt = PSCSI_TASK(task); + struct se_cmd *cmd = task->task_se_cmd; + + /* + * Release the extended CDB allocation from pscsi_alloc_task() + * if one exists. + */ + if (T_TASK(cmd)->t_task_cdb != T_TASK(cmd)->__t_task_cdb) + kfree(pt->pscsi_cdb); + /* + * We do not release the bio(s) here associated with this task, as + * this is handled by bio_put() and pscsi_bi_endio(). + */ + kfree(pt); +} + +enum { + Opt_scsi_host_id, Opt_scsi_channel_id, Opt_scsi_target_id, + Opt_scsi_lun_id, Opt_err +}; + +static match_table_t tokens = { + {Opt_scsi_host_id, "scsi_host_id=%d"}, + {Opt_scsi_channel_id, "scsi_channel_id=%d"}, + {Opt_scsi_target_id, "scsi_target_id=%d"}, + {Opt_scsi_lun_id, "scsi_lun_id=%d"}, + {Opt_err, NULL} +}; + +static ssize_t pscsi_set_configfs_dev_params(struct se_hba *hba, + struct se_subsystem_dev *se_dev, + const char *page, + ssize_t count) +{ + struct pscsi_dev_virt *pdv = se_dev->se_dev_su_ptr; + struct pscsi_hba_virt *phv = hba->hba_ptr; + char *orig, *ptr, *opts; + substring_t args[MAX_OPT_ARGS]; + int ret = 0, arg, token; + + opts = kstrdup(page, GFP_KERNEL); + if (!opts) + return -ENOMEM; + + orig = opts; + + while ((ptr = strsep(&opts, ",")) != NULL) { + if (!*ptr) + continue; + + token = match_token(ptr, tokens, args); + switch (token) { + case Opt_scsi_host_id: + if (phv->phv_mode == PHV_LLD_SCSI_HOST_NO) { + printk(KERN_ERR "PSCSI[%d]: Unable to accept" + " scsi_host_id while phv_mode ==" + " PHV_LLD_SCSI_HOST_NO\n", + phv->phv_host_id); + ret = -EINVAL; + goto out; + } + match_int(args, &arg); + pdv->pdv_host_id = arg; + printk(KERN_INFO "PSCSI[%d]: Referencing SCSI Host ID:" + " %d\n", phv->phv_host_id, pdv->pdv_host_id); + pdv->pdv_flags |= PDF_HAS_VIRT_HOST_ID; + break; + case Opt_scsi_channel_id: + match_int(args, &arg); + pdv->pdv_channel_id = arg; + printk(KERN_INFO "PSCSI[%d]: Referencing SCSI Channel" + " ID: %d\n", phv->phv_host_id, + pdv->pdv_channel_id); + pdv->pdv_flags |= PDF_HAS_CHANNEL_ID; + break; + case Opt_scsi_target_id: + match_int(args, &arg); + pdv->pdv_target_id = arg; + printk(KERN_INFO "PSCSI[%d]: Referencing SCSI Target" + " ID: %d\n", phv->phv_host_id, + pdv->pdv_target_id); + pdv->pdv_flags |= PDF_HAS_TARGET_ID; + break; + case Opt_scsi_lun_id: + match_int(args, &arg); + pdv->pdv_lun_id = arg; + printk(KERN_INFO "PSCSI[%d]: Referencing SCSI LUN ID:" + " %d\n", phv->phv_host_id, pdv->pdv_lun_id); + pdv->pdv_flags |= PDF_HAS_LUN_ID; + break; + default: + break; + } + } + +out: + kfree(orig); + return (!ret) ? count : ret; +} + +static ssize_t pscsi_check_configfs_dev_params( + struct se_hba *hba, + struct se_subsystem_dev *se_dev) +{ + struct pscsi_dev_virt *pdv = se_dev->se_dev_su_ptr; + + if (!(pdv->pdv_flags & PDF_HAS_CHANNEL_ID) || + !(pdv->pdv_flags & PDF_HAS_TARGET_ID) || + !(pdv->pdv_flags & PDF_HAS_LUN_ID)) { + printk(KERN_ERR "Missing scsi_channel_id=, scsi_target_id= and" + " scsi_lun_id= parameters\n"); + return -1; + } + + return 0; +} + +static ssize_t pscsi_show_configfs_dev_params(struct se_hba *hba, + struct se_subsystem_dev *se_dev, + char *b) +{ + struct pscsi_hba_virt *phv = hba->hba_ptr; + struct pscsi_dev_virt *pdv = se_dev->se_dev_su_ptr; + struct scsi_device *sd = pdv->pdv_sd; + unsigned char host_id[16]; + ssize_t bl; + int i; + + if (phv->phv_mode == PHV_VIRUTAL_HOST_ID) + snprintf(host_id, 16, "%d", pdv->pdv_host_id); + else + snprintf(host_id, 16, "PHBA Mode"); + + bl = sprintf(b, "SCSI Device Bus Location:" + " Channel ID: %d Target ID: %d LUN: %d Host ID: %s\n", + pdv->pdv_channel_id, pdv->pdv_target_id, pdv->pdv_lun_id, + host_id); + + if (sd) { + bl += sprintf(b + bl, " "); + bl += sprintf(b + bl, "Vendor: "); + for (i = 0; i < 8; i++) { + if (ISPRINT(sd->vendor[i])) /* printable character? */ + bl += sprintf(b + bl, "%c", sd->vendor[i]); + else + bl += sprintf(b + bl, " "); + } + bl += sprintf(b + bl, " Model: "); + for (i = 0; i < 16; i++) { + if (ISPRINT(sd->model[i])) /* printable character ? */ + bl += sprintf(b + bl, "%c", sd->model[i]); + else + bl += sprintf(b + bl, " "); + } + bl += sprintf(b + bl, " Rev: "); + for (i = 0; i < 4; i++) { + if (ISPRINT(sd->rev[i])) /* printable character ? */ + bl += sprintf(b + bl, "%c", sd->rev[i]); + else + bl += sprintf(b + bl, " "); + } + bl += sprintf(b + bl, "\n"); + } + return bl; +} + +static void pscsi_bi_endio(struct bio *bio, int error) +{ + bio_put(bio); +} + +static inline struct bio *pscsi_get_bio(struct pscsi_dev_virt *pdv, int sg_num) +{ + struct bio *bio; + /* + * Use bio_malloc() following the comment in for bio -> struct request + * in block/blk-core.c:blk_make_request() + */ + bio = bio_kmalloc(GFP_KERNEL, sg_num); + if (!(bio)) { + printk(KERN_ERR "PSCSI: bio_kmalloc() failed\n"); + return NULL; + } + bio->bi_end_io = pscsi_bi_endio; + + return bio; +} + +#if 0 +#define DEBUG_PSCSI(x...) printk(x) +#else +#define DEBUG_PSCSI(x...) +#endif + +static int __pscsi_map_task_SG( + struct se_task *task, + struct scatterlist *task_sg, + u32 task_sg_num, + int bidi_read) +{ + struct pscsi_plugin_task *pt = PSCSI_TASK(task); + struct pscsi_dev_virt *pdv = task->se_dev->dev_ptr; + struct bio *bio = NULL, *hbio = NULL, *tbio = NULL; + struct page *page; + struct scatterlist *sg; + u32 data_len = task->task_size, i, len, bytes, off; + int nr_pages = (task->task_size + task_sg[0].offset + + PAGE_SIZE - 1) >> PAGE_SHIFT; + int nr_vecs = 0, rc, ret = PYX_TRANSPORT_OUT_OF_MEMORY_RESOURCES; + int rw = (task->task_data_direction == DMA_TO_DEVICE); + + if (!task->task_size) + return 0; + /* + * For SCF_SCSI_DATA_SG_IO_CDB, Use fs/bio.c:bio_add_page() to setup + * the bio_vec maplist from TC< struct se_mem -> task->task_sg -> + * struct scatterlist memory. The struct se_task->task_sg[] currently needs + * to be attached to struct bios for submission to Linux/SCSI using + * struct request to struct scsi_device->request_queue. + * + * Note that this will be changing post v2.6.28 as Target_Core_Mod/pSCSI + * is ported to upstream SCSI passthrough functionality that accepts + * struct scatterlist->page_link or struct page as a paraemeter. + */ + DEBUG_PSCSI("PSCSI: nr_pages: %d\n", nr_pages); + + for_each_sg(task_sg, sg, task_sg_num, i) { + page = sg_page(sg); + off = sg->offset; + len = sg->length; + + DEBUG_PSCSI("PSCSI: i: %d page: %p len: %d off: %d\n", i, + page, len, off); + + while (len > 0 && data_len > 0) { + bytes = min_t(unsigned int, len, PAGE_SIZE - off); + bytes = min(bytes, data_len); + + if (!(bio)) { + nr_vecs = min_t(int, BIO_MAX_PAGES, nr_pages); + nr_pages -= nr_vecs; + /* + * Calls bio_kmalloc() and sets bio->bi_end_io() + */ + bio = pscsi_get_bio(pdv, nr_vecs); + if (!(bio)) + goto fail; + + if (rw) + bio->bi_rw |= REQ_WRITE; + + DEBUG_PSCSI("PSCSI: Allocated bio: %p," + " dir: %s nr_vecs: %d\n", bio, + (rw) ? "rw" : "r", nr_vecs); + /* + * Set *hbio pointer to handle the case: + * nr_pages > BIO_MAX_PAGES, where additional + * bios need to be added to complete a given + * struct se_task + */ + if (!hbio) + hbio = tbio = bio; + else + tbio = tbio->bi_next = bio; + } + + DEBUG_PSCSI("PSCSI: Calling bio_add_pc_page() i: %d" + " bio: %p page: %p len: %d off: %d\n", i, bio, + page, len, off); + + rc = bio_add_pc_page(pdv->pdv_sd->request_queue, + bio, page, bytes, off); + if (rc != bytes) + goto fail; + + DEBUG_PSCSI("PSCSI: bio->bi_vcnt: %d nr_vecs: %d\n", + bio->bi_vcnt, nr_vecs); + + if (bio->bi_vcnt > nr_vecs) { + DEBUG_PSCSI("PSCSI: Reached bio->bi_vcnt max:" + " %d i: %d bio: %p, allocating another" + " bio\n", bio->bi_vcnt, i, bio); + /* + * Clear the pointer so that another bio will + * be allocated with pscsi_get_bio() above, the + * current bio has already been set *tbio and + * bio->bi_next. + */ + bio = NULL; + } + + page++; + len -= bytes; + data_len -= bytes; + off = 0; + } + } + /* + * Setup the primary pt->pscsi_req used for non BIDI and BIDI-COMMAND + * primary SCSI WRITE poayload mapped for struct se_task->task_sg[] + */ + if (!(bidi_read)) { + /* + * Starting with v2.6.31, call blk_make_request() passing in *hbio to + * allocate the pSCSI task a struct request. + */ + pt->pscsi_req = blk_make_request(pdv->pdv_sd->request_queue, + hbio, GFP_KERNEL); + if (!(pt->pscsi_req)) { + printk(KERN_ERR "pSCSI: blk_make_request() failed\n"); + goto fail; + } + /* + * Setup the newly allocated struct request for REQ_TYPE_BLOCK_PC, + * and setup rq callback, CDB and sense. + */ + pscsi_blk_init_request(task, pt, pt->pscsi_req, 0); + + return task->task_sg_num; + } + /* + * Setup the secondary pt->pscsi_req->next_rq used for the extra BIDI-COMMAND + * SCSI READ paylaod mapped for struct se_task->task_sg_bidi[] + */ + pt->pscsi_req->next_rq = blk_make_request(pdv->pdv_sd->request_queue, + hbio, GFP_KERNEL); + if (!(pt->pscsi_req->next_rq)) { + printk(KERN_ERR "pSCSI: blk_make_request() failed for BIDI\n"); + goto fail; + } + pscsi_blk_init_request(task, pt, pt->pscsi_req->next_rq, 1); + + return task->task_sg_num; +fail: + while (hbio) { + bio = hbio; + hbio = hbio->bi_next; + bio->bi_next = NULL; + bio_endio(bio, 0); + } + return ret; +} + +static int pscsi_map_task_SG(struct se_task *task) +{ + int ret; + + /* + * Setup the main struct request for the task->task_sg[] payload + */ + + ret = __pscsi_map_task_SG(task, task->task_sg, task->task_sg_num, 0); + if (ret >= 0 && task->task_sg_bidi) { + /* + * If present, set up the extra BIDI-COMMAND SCSI READ + * struct request and payload. + */ + ret = __pscsi_map_task_SG(task, task->task_sg_bidi, + task->task_sg_num, 1); + } + + if (ret < 0) + return PYX_TRANSPORT_LU_COMM_FAILURE; + return 0; +} + +/* pscsi_map_task_non_SG(): + * + * + */ +static int pscsi_map_task_non_SG(struct se_task *task) +{ + struct se_cmd *cmd = TASK_CMD(task); + struct pscsi_plugin_task *pt = PSCSI_TASK(task); + struct pscsi_dev_virt *pdv = task->se_dev->dev_ptr; + int ret = 0; + + if (pscsi_blk_get_request(task) < 0) + return PYX_TRANSPORT_LU_COMM_FAILURE; + + if (!task->task_size) + return 0; + + ret = blk_rq_map_kern(pdv->pdv_sd->request_queue, + pt->pscsi_req, T_TASK(cmd)->t_task_buf, + task->task_size, GFP_KERNEL); + if (ret < 0) { + printk(KERN_ERR "PSCSI: blk_rq_map_kern() failed: %d\n", ret); + return PYX_TRANSPORT_LU_COMM_FAILURE; + } + return 0; +} + +static int pscsi_CDB_none(struct se_task *task) +{ + return pscsi_blk_get_request(task); +} + +/* pscsi_get_cdb(): + * + * + */ +static unsigned char *pscsi_get_cdb(struct se_task *task) +{ + struct pscsi_plugin_task *pt = PSCSI_TASK(task); + + return pt->pscsi_cdb; +} + +/* pscsi_get_sense_buffer(): + * + * + */ +static unsigned char *pscsi_get_sense_buffer(struct se_task *task) +{ + struct pscsi_plugin_task *pt = PSCSI_TASK(task); + + return (unsigned char *)&pt->pscsi_sense[0]; +} + +/* pscsi_get_device_rev(): + * + * + */ +static u32 pscsi_get_device_rev(struct se_device *dev) +{ + struct pscsi_dev_virt *pdv = dev->dev_ptr; + struct scsi_device *sd = pdv->pdv_sd; + + return (sd->scsi_level - 1) ? sd->scsi_level - 1 : 1; +} + +/* pscsi_get_device_type(): + * + * + */ +static u32 pscsi_get_device_type(struct se_device *dev) +{ + struct pscsi_dev_virt *pdv = dev->dev_ptr; + struct scsi_device *sd = pdv->pdv_sd; + + return sd->type; +} + +static sector_t pscsi_get_blocks(struct se_device *dev) +{ + struct pscsi_dev_virt *pdv = dev->dev_ptr; + + if (pdv->pdv_bd && pdv->pdv_bd->bd_part) + return pdv->pdv_bd->bd_part->nr_sects; + + dump_stack(); + return 0; +} + +/* pscsi_handle_SAM_STATUS_failures(): + * + * + */ +static inline void pscsi_process_SAM_status( + struct se_task *task, + struct pscsi_plugin_task *pt) +{ + task->task_scsi_status = status_byte(pt->pscsi_result); + if ((task->task_scsi_status)) { + task->task_scsi_status <<= 1; + printk(KERN_INFO "PSCSI Status Byte exception at task: %p CDB:" + " 0x%02x Result: 0x%08x\n", task, pt->pscsi_cdb[0], + pt->pscsi_result); + } + + switch (host_byte(pt->pscsi_result)) { + case DID_OK: + transport_complete_task(task, (!task->task_scsi_status)); + break; + default: + printk(KERN_INFO "PSCSI Host Byte exception at task: %p CDB:" + " 0x%02x Result: 0x%08x\n", task, pt->pscsi_cdb[0], + pt->pscsi_result); + task->task_scsi_status = SAM_STAT_CHECK_CONDITION; + task->task_error_status = PYX_TRANSPORT_UNKNOWN_SAM_OPCODE; + TASK_CMD(task)->transport_error_status = + PYX_TRANSPORT_UNKNOWN_SAM_OPCODE; + transport_complete_task(task, 0); + break; + } + + return; +} + +static void pscsi_req_done(struct request *req, int uptodate) +{ + struct se_task *task = req->end_io_data; + struct pscsi_plugin_task *pt = PSCSI_TASK(task); + + pt->pscsi_result = req->errors; + pt->pscsi_resid = req->resid_len; + + pscsi_process_SAM_status(task, pt); + /* + * Release BIDI-READ if present + */ + if (req->next_rq != NULL) + __blk_put_request(req->q, req->next_rq); + + __blk_put_request(req->q, req); + pt->pscsi_req = NULL; +} + +static struct se_subsystem_api pscsi_template = { + .name = "pscsi", + .owner = THIS_MODULE, + .transport_type = TRANSPORT_PLUGIN_PHBA_PDEV, + .cdb_none = pscsi_CDB_none, + .map_task_non_SG = pscsi_map_task_non_SG, + .map_task_SG = pscsi_map_task_SG, + .attach_hba = pscsi_attach_hba, + .detach_hba = pscsi_detach_hba, + .pmode_enable_hba = pscsi_pmode_enable_hba, + .allocate_virtdevice = pscsi_allocate_virtdevice, + .create_virtdevice = pscsi_create_virtdevice, + .free_device = pscsi_free_device, + .transport_complete = pscsi_transport_complete, + .alloc_task = pscsi_alloc_task, + .do_task = pscsi_do_task, + .free_task = pscsi_free_task, + .check_configfs_dev_params = pscsi_check_configfs_dev_params, + .set_configfs_dev_params = pscsi_set_configfs_dev_params, + .show_configfs_dev_params = pscsi_show_configfs_dev_params, + .get_cdb = pscsi_get_cdb, + .get_sense_buffer = pscsi_get_sense_buffer, + .get_device_rev = pscsi_get_device_rev, + .get_device_type = pscsi_get_device_type, + .get_blocks = pscsi_get_blocks, +}; + +static int __init pscsi_module_init(void) +{ + return transport_subsystem_register(&pscsi_template); +} + +static void pscsi_module_exit(void) +{ + transport_subsystem_release(&pscsi_template); +} + +MODULE_DESCRIPTION("TCM PSCSI subsystem plugin"); +MODULE_AUTHOR("nab@Linux-iSCSI.org"); +MODULE_LICENSE("GPL"); + +module_init(pscsi_module_init); +module_exit(pscsi_module_exit); diff --git a/drivers/target/target_core_pscsi.h b/drivers/target/target_core_pscsi.h new file mode 100644 index 000000000000..a4cd5d352c3a --- /dev/null +++ b/drivers/target/target_core_pscsi.h @@ -0,0 +1,65 @@ +#ifndef TARGET_CORE_PSCSI_H +#define TARGET_CORE_PSCSI_H + +#define PSCSI_VERSION "v4.0" +#define PSCSI_VIRTUAL_HBA_DEPTH 2048 + +/* used in pscsi_find_alloc_len() */ +#ifndef INQUIRY_DATA_SIZE +#define INQUIRY_DATA_SIZE 0x24 +#endif + +/* used in pscsi_add_device_to_list() */ +#define PSCSI_DEFAULT_QUEUEDEPTH 1 + +#define PS_RETRY 5 +#define PS_TIMEOUT_DISK (15*HZ) +#define PS_TIMEOUT_OTHER (500*HZ) + +#include +#include +#include +#include +#include + +struct pscsi_plugin_task { + struct se_task pscsi_task; + unsigned char *pscsi_cdb; + unsigned char __pscsi_cdb[TCM_MAX_COMMAND_SIZE]; + unsigned char pscsi_sense[SCSI_SENSE_BUFFERSIZE]; + int pscsi_direction; + int pscsi_result; + u32 pscsi_resid; + struct request *pscsi_req; +} ____cacheline_aligned; + +#define PDF_HAS_CHANNEL_ID 0x01 +#define PDF_HAS_TARGET_ID 0x02 +#define PDF_HAS_LUN_ID 0x04 +#define PDF_HAS_VPD_UNIT_SERIAL 0x08 +#define PDF_HAS_VPD_DEV_IDENT 0x10 +#define PDF_HAS_VIRT_HOST_ID 0x20 + +struct pscsi_dev_virt { + int pdv_flags; + int pdv_host_id; + int pdv_channel_id; + int pdv_target_id; + int pdv_lun_id; + struct block_device *pdv_bd; + struct scsi_device *pdv_sd; + struct se_hba *pdv_se_hba; +} ____cacheline_aligned; + +typedef enum phv_modes { + PHV_VIRUTAL_HOST_ID, + PHV_LLD_SCSI_HOST_NO +} phv_modes_t; + +struct pscsi_hba_virt { + int phv_host_id; + phv_modes_t phv_mode; + struct Scsi_Host *phv_lld_host; +} ____cacheline_aligned; + +#endif /*** TARGET_CORE_PSCSI_H ***/ diff --git a/drivers/target/target_core_rd.c b/drivers/target/target_core_rd.c new file mode 100644 index 000000000000..979aebf20019 --- /dev/null +++ b/drivers/target/target_core_rd.c @@ -0,0 +1,1091 @@ +/******************************************************************************* + * Filename: target_core_rd.c + * + * This file contains the Storage Engine <-> Ramdisk transport + * specific functions. + * + * Copyright (c) 2003, 2004, 2005 PyX Technologies, Inc. + * Copyright (c) 2005, 2006, 2007 SBE, Inc. + * Copyright (c) 2007-2010 Rising Tide Systems + * Copyright (c) 2008-2010 Linux-iSCSI.org + * + * Nicholas A. Bellinger + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. + * + ******************************************************************************/ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include + +#include "target_core_rd.h" + +static struct se_subsystem_api rd_dr_template; +static struct se_subsystem_api rd_mcp_template; + +/* #define DEBUG_RAMDISK_MCP */ +/* #define DEBUG_RAMDISK_DR */ + +/* rd_attach_hba(): (Part of se_subsystem_api_t template) + * + * + */ +static int rd_attach_hba(struct se_hba *hba, u32 host_id) +{ + struct rd_host *rd_host; + + rd_host = kzalloc(sizeof(struct rd_host), GFP_KERNEL); + if (!(rd_host)) { + printk(KERN_ERR "Unable to allocate memory for struct rd_host\n"); + return -ENOMEM; + } + + rd_host->rd_host_id = host_id; + + atomic_set(&hba->left_queue_depth, RD_HBA_QUEUE_DEPTH); + atomic_set(&hba->max_queue_depth, RD_HBA_QUEUE_DEPTH); + hba->hba_ptr = (void *) rd_host; + + printk(KERN_INFO "CORE_HBA[%d] - TCM Ramdisk HBA Driver %s on" + " Generic Target Core Stack %s\n", hba->hba_id, + RD_HBA_VERSION, TARGET_CORE_MOD_VERSION); + printk(KERN_INFO "CORE_HBA[%d] - Attached Ramdisk HBA: %u to Generic" + " Target Core TCQ Depth: %d MaxSectors: %u\n", hba->hba_id, + rd_host->rd_host_id, atomic_read(&hba->max_queue_depth), + RD_MAX_SECTORS); + + return 0; +} + +static void rd_detach_hba(struct se_hba *hba) +{ + struct rd_host *rd_host = hba->hba_ptr; + + printk(KERN_INFO "CORE_HBA[%d] - Detached Ramdisk HBA: %u from" + " Generic Target Core\n", hba->hba_id, rd_host->rd_host_id); + + kfree(rd_host); + hba->hba_ptr = NULL; +} + +/* rd_release_device_space(): + * + * + */ +static void rd_release_device_space(struct rd_dev *rd_dev) +{ + u32 i, j, page_count = 0, sg_per_table; + struct rd_dev_sg_table *sg_table; + struct page *pg; + struct scatterlist *sg; + + if (!rd_dev->sg_table_array || !rd_dev->sg_table_count) + return; + + sg_table = rd_dev->sg_table_array; + + for (i = 0; i < rd_dev->sg_table_count; i++) { + sg = sg_table[i].sg_table; + sg_per_table = sg_table[i].rd_sg_count; + + for (j = 0; j < sg_per_table; j++) { + pg = sg_page(&sg[j]); + if ((pg)) { + __free_page(pg); + page_count++; + } + } + + kfree(sg); + } + + printk(KERN_INFO "CORE_RD[%u] - Released device space for Ramdisk" + " Device ID: %u, pages %u in %u tables total bytes %lu\n", + rd_dev->rd_host->rd_host_id, rd_dev->rd_dev_id, page_count, + rd_dev->sg_table_count, (unsigned long)page_count * PAGE_SIZE); + + kfree(sg_table); + rd_dev->sg_table_array = NULL; + rd_dev->sg_table_count = 0; +} + + +/* rd_build_device_space(): + * + * + */ +static int rd_build_device_space(struct rd_dev *rd_dev) +{ + u32 i = 0, j, page_offset = 0, sg_per_table, sg_tables, total_sg_needed; + u32 max_sg_per_table = (RD_MAX_ALLOCATION_SIZE / + sizeof(struct scatterlist)); + struct rd_dev_sg_table *sg_table; + struct page *pg; + struct scatterlist *sg; + + if (rd_dev->rd_page_count <= 0) { + printk(KERN_ERR "Illegal page count: %u for Ramdisk device\n", + rd_dev->rd_page_count); + return -1; + } + total_sg_needed = rd_dev->rd_page_count; + + sg_tables = (total_sg_needed / max_sg_per_table) + 1; + + sg_table = kzalloc(sg_tables * sizeof(struct rd_dev_sg_table), GFP_KERNEL); + if (!(sg_table)) { + printk(KERN_ERR "Unable to allocate memory for Ramdisk" + " scatterlist tables\n"); + return -1; + } + + rd_dev->sg_table_array = sg_table; + rd_dev->sg_table_count = sg_tables; + + while (total_sg_needed) { + sg_per_table = (total_sg_needed > max_sg_per_table) ? + max_sg_per_table : total_sg_needed; + + sg = kzalloc(sg_per_table * sizeof(struct scatterlist), + GFP_KERNEL); + if (!(sg)) { + printk(KERN_ERR "Unable to allocate scatterlist array" + " for struct rd_dev\n"); + return -1; + } + + sg_init_table((struct scatterlist *)&sg[0], sg_per_table); + + sg_table[i].sg_table = sg; + sg_table[i].rd_sg_count = sg_per_table; + sg_table[i].page_start_offset = page_offset; + sg_table[i++].page_end_offset = (page_offset + sg_per_table) + - 1; + + for (j = 0; j < sg_per_table; j++) { + pg = alloc_pages(GFP_KERNEL, 0); + if (!(pg)) { + printk(KERN_ERR "Unable to allocate scatterlist" + " pages for struct rd_dev_sg_table\n"); + return -1; + } + sg_assign_page(&sg[j], pg); + sg[j].length = PAGE_SIZE; + } + + page_offset += sg_per_table; + total_sg_needed -= sg_per_table; + } + + printk(KERN_INFO "CORE_RD[%u] - Built Ramdisk Device ID: %u space of" + " %u pages in %u tables\n", rd_dev->rd_host->rd_host_id, + rd_dev->rd_dev_id, rd_dev->rd_page_count, + rd_dev->sg_table_count); + + return 0; +} + +static void *rd_allocate_virtdevice( + struct se_hba *hba, + const char *name, + int rd_direct) +{ + struct rd_dev *rd_dev; + struct rd_host *rd_host = hba->hba_ptr; + + rd_dev = kzalloc(sizeof(struct rd_dev), GFP_KERNEL); + if (!(rd_dev)) { + printk(KERN_ERR "Unable to allocate memory for struct rd_dev\n"); + return NULL; + } + + rd_dev->rd_host = rd_host; + rd_dev->rd_direct = rd_direct; + + return rd_dev; +} + +static void *rd_DIRECT_allocate_virtdevice(struct se_hba *hba, const char *name) +{ + return rd_allocate_virtdevice(hba, name, 1); +} + +static void *rd_MEMCPY_allocate_virtdevice(struct se_hba *hba, const char *name) +{ + return rd_allocate_virtdevice(hba, name, 0); +} + +/* rd_create_virtdevice(): + * + * + */ +static struct se_device *rd_create_virtdevice( + struct se_hba *hba, + struct se_subsystem_dev *se_dev, + void *p, + int rd_direct) +{ + struct se_device *dev; + struct se_dev_limits dev_limits; + struct rd_dev *rd_dev = p; + struct rd_host *rd_host = hba->hba_ptr; + int dev_flags = 0; + char prod[16], rev[4]; + + memset(&dev_limits, 0, sizeof(struct se_dev_limits)); + + if (rd_build_device_space(rd_dev) < 0) + goto fail; + + snprintf(prod, 16, "RAMDISK-%s", (rd_dev->rd_direct) ? "DR" : "MCP"); + snprintf(rev, 4, "%s", (rd_dev->rd_direct) ? RD_DR_VERSION : + RD_MCP_VERSION); + + dev_limits.limits.logical_block_size = RD_BLOCKSIZE; + dev_limits.limits.max_hw_sectors = RD_MAX_SECTORS; + dev_limits.limits.max_sectors = RD_MAX_SECTORS; + dev_limits.hw_queue_depth = RD_MAX_DEVICE_QUEUE_DEPTH; + dev_limits.queue_depth = RD_DEVICE_QUEUE_DEPTH; + + dev = transport_add_device_to_core_hba(hba, + (rd_dev->rd_direct) ? &rd_dr_template : + &rd_mcp_template, se_dev, dev_flags, (void *)rd_dev, + &dev_limits, prod, rev); + if (!(dev)) + goto fail; + + rd_dev->rd_dev_id = rd_host->rd_host_dev_id_count++; + rd_dev->rd_queue_depth = dev->queue_depth; + + printk(KERN_INFO "CORE_RD[%u] - Added TCM %s Ramdisk Device ID: %u of" + " %u pages in %u tables, %lu total bytes\n", + rd_host->rd_host_id, (!rd_dev->rd_direct) ? "MEMCPY" : + "DIRECT", rd_dev->rd_dev_id, rd_dev->rd_page_count, + rd_dev->sg_table_count, + (unsigned long)(rd_dev->rd_page_count * PAGE_SIZE)); + + return dev; + +fail: + rd_release_device_space(rd_dev); + return NULL; +} + +static struct se_device *rd_DIRECT_create_virtdevice( + struct se_hba *hba, + struct se_subsystem_dev *se_dev, + void *p) +{ + return rd_create_virtdevice(hba, se_dev, p, 1); +} + +static struct se_device *rd_MEMCPY_create_virtdevice( + struct se_hba *hba, + struct se_subsystem_dev *se_dev, + void *p) +{ + return rd_create_virtdevice(hba, se_dev, p, 0); +} + +/* rd_free_device(): (Part of se_subsystem_api_t template) + * + * + */ +static void rd_free_device(void *p) +{ + struct rd_dev *rd_dev = p; + + rd_release_device_space(rd_dev); + kfree(rd_dev); +} + +static inline struct rd_request *RD_REQ(struct se_task *task) +{ + return container_of(task, struct rd_request, rd_task); +} + +static struct se_task * +rd_alloc_task(struct se_cmd *cmd) +{ + struct rd_request *rd_req; + + rd_req = kzalloc(sizeof(struct rd_request), GFP_KERNEL); + if (!rd_req) { + printk(KERN_ERR "Unable to allocate struct rd_request\n"); + return NULL; + } + rd_req->rd_dev = SE_DEV(cmd)->dev_ptr; + + return &rd_req->rd_task; +} + +/* rd_get_sg_table(): + * + * + */ +static struct rd_dev_sg_table *rd_get_sg_table(struct rd_dev *rd_dev, u32 page) +{ + u32 i; + struct rd_dev_sg_table *sg_table; + + for (i = 0; i < rd_dev->sg_table_count; i++) { + sg_table = &rd_dev->sg_table_array[i]; + if ((sg_table->page_start_offset <= page) && + (sg_table->page_end_offset >= page)) + return sg_table; + } + + printk(KERN_ERR "Unable to locate struct rd_dev_sg_table for page: %u\n", + page); + + return NULL; +} + +/* rd_MEMCPY_read(): + * + * + */ +static int rd_MEMCPY_read(struct rd_request *req) +{ + struct se_task *task = &req->rd_task; + struct rd_dev *dev = req->rd_dev; + struct rd_dev_sg_table *table; + struct scatterlist *sg_d, *sg_s; + void *dst, *src; + u32 i = 0, j = 0, dst_offset = 0, src_offset = 0; + u32 length, page_end = 0, table_sg_end; + u32 rd_offset = req->rd_offset; + + table = rd_get_sg_table(dev, req->rd_page); + if (!(table)) + return -1; + + table_sg_end = (table->page_end_offset - req->rd_page); + sg_d = task->task_sg; + sg_s = &table->sg_table[req->rd_page - table->page_start_offset]; +#ifdef DEBUG_RAMDISK_MCP + printk(KERN_INFO "RD[%u]: Read LBA: %llu, Size: %u Page: %u, Offset:" + " %u\n", dev->rd_dev_id, task->task_lba, req->rd_size, + req->rd_page, req->rd_offset); +#endif + src_offset = rd_offset; + + while (req->rd_size) { + if ((sg_d[i].length - dst_offset) < + (sg_s[j].length - src_offset)) { + length = (sg_d[i].length - dst_offset); +#ifdef DEBUG_RAMDISK_MCP + printk(KERN_INFO "Step 1 - sg_d[%d]: %p length: %d" + " offset: %u sg_s[%d].length: %u\n", i, + &sg_d[i], sg_d[i].length, sg_d[i].offset, j, + sg_s[j].length); + printk(KERN_INFO "Step 1 - length: %u dst_offset: %u" + " src_offset: %u\n", length, dst_offset, + src_offset); +#endif + if (length > req->rd_size) + length = req->rd_size; + + dst = sg_virt(&sg_d[i++]) + dst_offset; + if (!dst) + BUG(); + + src = sg_virt(&sg_s[j]) + src_offset; + if (!src) + BUG(); + + dst_offset = 0; + src_offset = length; + page_end = 0; + } else { + length = (sg_s[j].length - src_offset); +#ifdef DEBUG_RAMDISK_MCP + printk(KERN_INFO "Step 2 - sg_d[%d]: %p length: %d" + " offset: %u sg_s[%d].length: %u\n", i, + &sg_d[i], sg_d[i].length, sg_d[i].offset, + j, sg_s[j].length); + printk(KERN_INFO "Step 2 - length: %u dst_offset: %u" + " src_offset: %u\n", length, dst_offset, + src_offset); +#endif + if (length > req->rd_size) + length = req->rd_size; + + dst = sg_virt(&sg_d[i]) + dst_offset; + if (!dst) + BUG(); + + if (sg_d[i].length == length) { + i++; + dst_offset = 0; + } else + dst_offset = length; + + src = sg_virt(&sg_s[j++]) + src_offset; + if (!src) + BUG(); + + src_offset = 0; + page_end = 1; + } + + memcpy(dst, src, length); + +#ifdef DEBUG_RAMDISK_MCP + printk(KERN_INFO "page: %u, remaining size: %u, length: %u," + " i: %u, j: %u\n", req->rd_page, + (req->rd_size - length), length, i, j); +#endif + req->rd_size -= length; + if (!(req->rd_size)) + return 0; + + if (!page_end) + continue; + + if (++req->rd_page <= table->page_end_offset) { +#ifdef DEBUG_RAMDISK_MCP + printk(KERN_INFO "page: %u in same page table\n", + req->rd_page); +#endif + continue; + } +#ifdef DEBUG_RAMDISK_MCP + printk(KERN_INFO "getting new page table for page: %u\n", + req->rd_page); +#endif + table = rd_get_sg_table(dev, req->rd_page); + if (!(table)) + return -1; + + sg_s = &table->sg_table[j = 0]; + } + + return 0; +} + +/* rd_MEMCPY_write(): + * + * + */ +static int rd_MEMCPY_write(struct rd_request *req) +{ + struct se_task *task = &req->rd_task; + struct rd_dev *dev = req->rd_dev; + struct rd_dev_sg_table *table; + struct scatterlist *sg_d, *sg_s; + void *dst, *src; + u32 i = 0, j = 0, dst_offset = 0, src_offset = 0; + u32 length, page_end = 0, table_sg_end; + u32 rd_offset = req->rd_offset; + + table = rd_get_sg_table(dev, req->rd_page); + if (!(table)) + return -1; + + table_sg_end = (table->page_end_offset - req->rd_page); + sg_d = &table->sg_table[req->rd_page - table->page_start_offset]; + sg_s = task->task_sg; +#ifdef DEBUG_RAMDISK_MCP + printk(KERN_INFO "RD[%d] Write LBA: %llu, Size: %u, Page: %u," + " Offset: %u\n", dev->rd_dev_id, task->task_lba, req->rd_size, + req->rd_page, req->rd_offset); +#endif + dst_offset = rd_offset; + + while (req->rd_size) { + if ((sg_s[i].length - src_offset) < + (sg_d[j].length - dst_offset)) { + length = (sg_s[i].length - src_offset); +#ifdef DEBUG_RAMDISK_MCP + printk(KERN_INFO "Step 1 - sg_s[%d]: %p length: %d" + " offset: %d sg_d[%d].length: %u\n", i, + &sg_s[i], sg_s[i].length, sg_s[i].offset, + j, sg_d[j].length); + printk(KERN_INFO "Step 1 - length: %u src_offset: %u" + " dst_offset: %u\n", length, src_offset, + dst_offset); +#endif + if (length > req->rd_size) + length = req->rd_size; + + src = sg_virt(&sg_s[i++]) + src_offset; + if (!src) + BUG(); + + dst = sg_virt(&sg_d[j]) + dst_offset; + if (!dst) + BUG(); + + src_offset = 0; + dst_offset = length; + page_end = 0; + } else { + length = (sg_d[j].length - dst_offset); +#ifdef DEBUG_RAMDISK_MCP + printk(KERN_INFO "Step 2 - sg_s[%d]: %p length: %d" + " offset: %d sg_d[%d].length: %u\n", i, + &sg_s[i], sg_s[i].length, sg_s[i].offset, + j, sg_d[j].length); + printk(KERN_INFO "Step 2 - length: %u src_offset: %u" + " dst_offset: %u\n", length, src_offset, + dst_offset); +#endif + if (length > req->rd_size) + length = req->rd_size; + + src = sg_virt(&sg_s[i]) + src_offset; + if (!src) + BUG(); + + if (sg_s[i].length == length) { + i++; + src_offset = 0; + } else + src_offset = length; + + dst = sg_virt(&sg_d[j++]) + dst_offset; + if (!dst) + BUG(); + + dst_offset = 0; + page_end = 1; + } + + memcpy(dst, src, length); + +#ifdef DEBUG_RAMDISK_MCP + printk(KERN_INFO "page: %u, remaining size: %u, length: %u," + " i: %u, j: %u\n", req->rd_page, + (req->rd_size - length), length, i, j); +#endif + req->rd_size -= length; + if (!(req->rd_size)) + return 0; + + if (!page_end) + continue; + + if (++req->rd_page <= table->page_end_offset) { +#ifdef DEBUG_RAMDISK_MCP + printk(KERN_INFO "page: %u in same page table\n", + req->rd_page); +#endif + continue; + } +#ifdef DEBUG_RAMDISK_MCP + printk(KERN_INFO "getting new page table for page: %u\n", + req->rd_page); +#endif + table = rd_get_sg_table(dev, req->rd_page); + if (!(table)) + return -1; + + sg_d = &table->sg_table[j = 0]; + } + + return 0; +} + +/* rd_MEMCPY_do_task(): (Part of se_subsystem_api_t template) + * + * + */ +static int rd_MEMCPY_do_task(struct se_task *task) +{ + struct se_device *dev = task->se_dev; + struct rd_request *req = RD_REQ(task); + unsigned long long lba; + int ret; + + req->rd_page = (task->task_lba * DEV_ATTRIB(dev)->block_size) / PAGE_SIZE; + lba = task->task_lba; + req->rd_offset = (do_div(lba, + (PAGE_SIZE / DEV_ATTRIB(dev)->block_size))) * + DEV_ATTRIB(dev)->block_size; + req->rd_size = task->task_size; + + if (task->task_data_direction == DMA_FROM_DEVICE) + ret = rd_MEMCPY_read(req); + else + ret = rd_MEMCPY_write(req); + + if (ret != 0) + return ret; + + task->task_scsi_status = GOOD; + transport_complete_task(task, 1); + + return PYX_TRANSPORT_SENT_TO_TRANSPORT; +} + +/* rd_DIRECT_with_offset(): + * + * + */ +static int rd_DIRECT_with_offset( + struct se_task *task, + struct list_head *se_mem_list, + u32 *se_mem_cnt, + u32 *task_offset) +{ + struct rd_request *req = RD_REQ(task); + struct rd_dev *dev = req->rd_dev; + struct rd_dev_sg_table *table; + struct se_mem *se_mem; + struct scatterlist *sg_s; + u32 j = 0, set_offset = 1; + u32 get_next_table = 0, offset_length, table_sg_end; + + table = rd_get_sg_table(dev, req->rd_page); + if (!(table)) + return -1; + + table_sg_end = (table->page_end_offset - req->rd_page); + sg_s = &table->sg_table[req->rd_page - table->page_start_offset]; +#ifdef DEBUG_RAMDISK_DR + printk(KERN_INFO "%s DIRECT LBA: %llu, Size: %u Page: %u, Offset: %u\n", + (task->task_data_direction == DMA_TO_DEVICE) ? + "Write" : "Read", + task->task_lba, req->rd_size, req->rd_page, req->rd_offset); +#endif + while (req->rd_size) { + se_mem = kmem_cache_zalloc(se_mem_cache, GFP_KERNEL); + if (!(se_mem)) { + printk(KERN_ERR "Unable to allocate struct se_mem\n"); + return -1; + } + INIT_LIST_HEAD(&se_mem->se_list); + + if (set_offset) { + offset_length = sg_s[j].length - req->rd_offset; + if (offset_length > req->rd_size) + offset_length = req->rd_size; + + se_mem->se_page = sg_page(&sg_s[j++]); + se_mem->se_off = req->rd_offset; + se_mem->se_len = offset_length; + + set_offset = 0; + get_next_table = (j > table_sg_end); + goto check_eot; + } + + offset_length = (req->rd_size < req->rd_offset) ? + req->rd_size : req->rd_offset; + + se_mem->se_page = sg_page(&sg_s[j]); + se_mem->se_len = offset_length; + + set_offset = 1; + +check_eot: +#ifdef DEBUG_RAMDISK_DR + printk(KERN_INFO "page: %u, size: %u, offset_length: %u, j: %u" + " se_mem: %p, se_page: %p se_off: %u se_len: %u\n", + req->rd_page, req->rd_size, offset_length, j, se_mem, + se_mem->se_page, se_mem->se_off, se_mem->se_len); +#endif + list_add_tail(&se_mem->se_list, se_mem_list); + (*se_mem_cnt)++; + + req->rd_size -= offset_length; + if (!(req->rd_size)) + goto out; + + if (!set_offset && !get_next_table) + continue; + + if (++req->rd_page <= table->page_end_offset) { +#ifdef DEBUG_RAMDISK_DR + printk(KERN_INFO "page: %u in same page table\n", + req->rd_page); +#endif + continue; + } +#ifdef DEBUG_RAMDISK_DR + printk(KERN_INFO "getting new page table for page: %u\n", + req->rd_page); +#endif + table = rd_get_sg_table(dev, req->rd_page); + if (!(table)) + return -1; + + sg_s = &table->sg_table[j = 0]; + } + +out: + T_TASK(task->task_se_cmd)->t_tasks_se_num += *se_mem_cnt; +#ifdef DEBUG_RAMDISK_DR + printk(KERN_INFO "RD_DR - Allocated %u struct se_mem segments for task\n", + *se_mem_cnt); +#endif + return 0; +} + +/* rd_DIRECT_without_offset(): + * + * + */ +static int rd_DIRECT_without_offset( + struct se_task *task, + struct list_head *se_mem_list, + u32 *se_mem_cnt, + u32 *task_offset) +{ + struct rd_request *req = RD_REQ(task); + struct rd_dev *dev = req->rd_dev; + struct rd_dev_sg_table *table; + struct se_mem *se_mem; + struct scatterlist *sg_s; + u32 length, j = 0; + + table = rd_get_sg_table(dev, req->rd_page); + if (!(table)) + return -1; + + sg_s = &table->sg_table[req->rd_page - table->page_start_offset]; +#ifdef DEBUG_RAMDISK_DR + printk(KERN_INFO "%s DIRECT LBA: %llu, Size: %u, Page: %u\n", + (task->task_data_direction == DMA_TO_DEVICE) ? + "Write" : "Read", + task->task_lba, req->rd_size, req->rd_page); +#endif + while (req->rd_size) { + se_mem = kmem_cache_zalloc(se_mem_cache, GFP_KERNEL); + if (!(se_mem)) { + printk(KERN_ERR "Unable to allocate struct se_mem\n"); + return -1; + } + INIT_LIST_HEAD(&se_mem->se_list); + + length = (req->rd_size < sg_s[j].length) ? + req->rd_size : sg_s[j].length; + + se_mem->se_page = sg_page(&sg_s[j++]); + se_mem->se_len = length; + +#ifdef DEBUG_RAMDISK_DR + printk(KERN_INFO "page: %u, size: %u, j: %u se_mem: %p," + " se_page: %p se_off: %u se_len: %u\n", req->rd_page, + req->rd_size, j, se_mem, se_mem->se_page, + se_mem->se_off, se_mem->se_len); +#endif + list_add_tail(&se_mem->se_list, se_mem_list); + (*se_mem_cnt)++; + + req->rd_size -= length; + if (!(req->rd_size)) + goto out; + + if (++req->rd_page <= table->page_end_offset) { +#ifdef DEBUG_RAMDISK_DR + printk("page: %u in same page table\n", + req->rd_page); +#endif + continue; + } +#ifdef DEBUG_RAMDISK_DR + printk(KERN_INFO "getting new page table for page: %u\n", + req->rd_page); +#endif + table = rd_get_sg_table(dev, req->rd_page); + if (!(table)) + return -1; + + sg_s = &table->sg_table[j = 0]; + } + +out: + T_TASK(task->task_se_cmd)->t_tasks_se_num += *se_mem_cnt; +#ifdef DEBUG_RAMDISK_DR + printk(KERN_INFO "RD_DR - Allocated %u struct se_mem segments for task\n", + *se_mem_cnt); +#endif + return 0; +} + +/* rd_DIRECT_do_se_mem_map(): + * + * + */ +static int rd_DIRECT_do_se_mem_map( + struct se_task *task, + struct list_head *se_mem_list, + void *in_mem, + struct se_mem *in_se_mem, + struct se_mem **out_se_mem, + u32 *se_mem_cnt, + u32 *task_offset_in) +{ + struct se_cmd *cmd = task->task_se_cmd; + struct rd_request *req = RD_REQ(task); + u32 task_offset = *task_offset_in; + unsigned long long lba; + int ret; + + req->rd_page = ((task->task_lba * DEV_ATTRIB(task->se_dev)->block_size) / + PAGE_SIZE); + lba = task->task_lba; + req->rd_offset = (do_div(lba, + (PAGE_SIZE / DEV_ATTRIB(task->se_dev)->block_size))) * + DEV_ATTRIB(task->se_dev)->block_size; + req->rd_size = task->task_size; + + if (req->rd_offset) + ret = rd_DIRECT_with_offset(task, se_mem_list, se_mem_cnt, + task_offset_in); + else + ret = rd_DIRECT_without_offset(task, se_mem_list, se_mem_cnt, + task_offset_in); + + if (ret < 0) + return ret; + + if (CMD_TFO(cmd)->task_sg_chaining == 0) + return 0; + /* + * Currently prevent writers from multiple HW fabrics doing + * pci_map_sg() to RD_DR's internal scatterlist memory. + */ + if (cmd->data_direction == DMA_TO_DEVICE) { + printk(KERN_ERR "DMA_TO_DEVICE not supported for" + " RAMDISK_DR with task_sg_chaining=1\n"); + return -1; + } + /* + * Special case for if task_sg_chaining is enabled, then + * we setup struct se_task->task_sg[], as it will be used by + * transport_do_task_sg_chain() for creating chainged SGLs + * across multiple struct se_task->task_sg[]. + */ + if (!(transport_calc_sg_num(task, + list_entry(T_TASK(cmd)->t_mem_list->next, + struct se_mem, se_list), + task_offset))) + return -1; + + return transport_map_mem_to_sg(task, se_mem_list, task->task_sg, + list_entry(T_TASK(cmd)->t_mem_list->next, + struct se_mem, se_list), + out_se_mem, se_mem_cnt, task_offset_in); +} + +/* rd_DIRECT_do_task(): (Part of se_subsystem_api_t template) + * + * + */ +static int rd_DIRECT_do_task(struct se_task *task) +{ + /* + * At this point the locally allocated RD tables have been mapped + * to struct se_mem elements in rd_DIRECT_do_se_mem_map(). + */ + task->task_scsi_status = GOOD; + transport_complete_task(task, 1); + + return PYX_TRANSPORT_SENT_TO_TRANSPORT; +} + +/* rd_free_task(): (Part of se_subsystem_api_t template) + * + * + */ +static void rd_free_task(struct se_task *task) +{ + kfree(RD_REQ(task)); +} + +enum { + Opt_rd_pages, Opt_err +}; + +static match_table_t tokens = { + {Opt_rd_pages, "rd_pages=%d"}, + {Opt_err, NULL} +}; + +static ssize_t rd_set_configfs_dev_params( + struct se_hba *hba, + struct se_subsystem_dev *se_dev, + const char *page, + ssize_t count) +{ + struct rd_dev *rd_dev = se_dev->se_dev_su_ptr; + char *orig, *ptr, *opts; + substring_t args[MAX_OPT_ARGS]; + int ret = 0, arg, token; + + opts = kstrdup(page, GFP_KERNEL); + if (!opts) + return -ENOMEM; + + orig = opts; + + while ((ptr = strsep(&opts, ",")) != NULL) { + if (!*ptr) + continue; + + token = match_token(ptr, tokens, args); + switch (token) { + case Opt_rd_pages: + match_int(args, &arg); + rd_dev->rd_page_count = arg; + printk(KERN_INFO "RAMDISK: Referencing Page" + " Count: %u\n", rd_dev->rd_page_count); + rd_dev->rd_flags |= RDF_HAS_PAGE_COUNT; + break; + default: + break; + } + } + + kfree(orig); + return (!ret) ? count : ret; +} + +static ssize_t rd_check_configfs_dev_params(struct se_hba *hba, struct se_subsystem_dev *se_dev) +{ + struct rd_dev *rd_dev = se_dev->se_dev_su_ptr; + + if (!(rd_dev->rd_flags & RDF_HAS_PAGE_COUNT)) { + printk(KERN_INFO "Missing rd_pages= parameter\n"); + return -1; + } + + return 0; +} + +static ssize_t rd_show_configfs_dev_params( + struct se_hba *hba, + struct se_subsystem_dev *se_dev, + char *b) +{ + struct rd_dev *rd_dev = se_dev->se_dev_su_ptr; + ssize_t bl = sprintf(b, "TCM RamDisk ID: %u RamDisk Makeup: %s\n", + rd_dev->rd_dev_id, (rd_dev->rd_direct) ? + "rd_direct" : "rd_mcp"); + bl += sprintf(b + bl, " PAGES/PAGE_SIZE: %u*%lu" + " SG_table_count: %u\n", rd_dev->rd_page_count, + PAGE_SIZE, rd_dev->sg_table_count); + return bl; +} + +/* rd_get_cdb(): (Part of se_subsystem_api_t template) + * + * + */ +static unsigned char *rd_get_cdb(struct se_task *task) +{ + struct rd_request *req = RD_REQ(task); + + return req->rd_scsi_cdb; +} + +static u32 rd_get_device_rev(struct se_device *dev) +{ + return SCSI_SPC_2; /* Returns SPC-3 in Initiator Data */ +} + +static u32 rd_get_device_type(struct se_device *dev) +{ + return TYPE_DISK; +} + +static sector_t rd_get_blocks(struct se_device *dev) +{ + struct rd_dev *rd_dev = dev->dev_ptr; + unsigned long long blocks_long = ((rd_dev->rd_page_count * PAGE_SIZE) / + DEV_ATTRIB(dev)->block_size) - 1; + + return blocks_long; +} + +static struct se_subsystem_api rd_dr_template = { + .name = "rd_dr", + .transport_type = TRANSPORT_PLUGIN_VHBA_VDEV, + .attach_hba = rd_attach_hba, + .detach_hba = rd_detach_hba, + .allocate_virtdevice = rd_DIRECT_allocate_virtdevice, + .create_virtdevice = rd_DIRECT_create_virtdevice, + .free_device = rd_free_device, + .alloc_task = rd_alloc_task, + .do_task = rd_DIRECT_do_task, + .free_task = rd_free_task, + .check_configfs_dev_params = rd_check_configfs_dev_params, + .set_configfs_dev_params = rd_set_configfs_dev_params, + .show_configfs_dev_params = rd_show_configfs_dev_params, + .get_cdb = rd_get_cdb, + .get_device_rev = rd_get_device_rev, + .get_device_type = rd_get_device_type, + .get_blocks = rd_get_blocks, + .do_se_mem_map = rd_DIRECT_do_se_mem_map, +}; + +static struct se_subsystem_api rd_mcp_template = { + .name = "rd_mcp", + .transport_type = TRANSPORT_PLUGIN_VHBA_VDEV, + .attach_hba = rd_attach_hba, + .detach_hba = rd_detach_hba, + .allocate_virtdevice = rd_MEMCPY_allocate_virtdevice, + .create_virtdevice = rd_MEMCPY_create_virtdevice, + .free_device = rd_free_device, + .alloc_task = rd_alloc_task, + .do_task = rd_MEMCPY_do_task, + .free_task = rd_free_task, + .check_configfs_dev_params = rd_check_configfs_dev_params, + .set_configfs_dev_params = rd_set_configfs_dev_params, + .show_configfs_dev_params = rd_show_configfs_dev_params, + .get_cdb = rd_get_cdb, + .get_device_rev = rd_get_device_rev, + .get_device_type = rd_get_device_type, + .get_blocks = rd_get_blocks, +}; + +int __init rd_module_init(void) +{ + int ret; + + ret = transport_subsystem_register(&rd_dr_template); + if (ret < 0) + return ret; + + ret = transport_subsystem_register(&rd_mcp_template); + if (ret < 0) { + transport_subsystem_release(&rd_dr_template); + return ret; + } + + return 0; +} + +void rd_module_exit(void) +{ + transport_subsystem_release(&rd_dr_template); + transport_subsystem_release(&rd_mcp_template); +} diff --git a/drivers/target/target_core_rd.h b/drivers/target/target_core_rd.h new file mode 100644 index 000000000000..13badfbaf9c0 --- /dev/null +++ b/drivers/target/target_core_rd.h @@ -0,0 +1,73 @@ +#ifndef TARGET_CORE_RD_H +#define TARGET_CORE_RD_H + +#define RD_HBA_VERSION "v4.0" +#define RD_DR_VERSION "4.0" +#define RD_MCP_VERSION "4.0" + +/* Largest piece of memory kmalloc can allocate */ +#define RD_MAX_ALLOCATION_SIZE 65536 +/* Maximum queuedepth for the Ramdisk HBA */ +#define RD_HBA_QUEUE_DEPTH 256 +#define RD_DEVICE_QUEUE_DEPTH 32 +#define RD_MAX_DEVICE_QUEUE_DEPTH 128 +#define RD_BLOCKSIZE 512 +#define RD_MAX_SECTORS 1024 + +extern struct kmem_cache *se_mem_cache; + +/* Used in target_core_init_configfs() for virtual LUN 0 access */ +int __init rd_module_init(void); +void rd_module_exit(void); + +#define RRF_EMULATE_CDB 0x01 +#define RRF_GOT_LBA 0x02 + +struct rd_request { + struct se_task rd_task; + + /* SCSI CDB from iSCSI Command PDU */ + unsigned char rd_scsi_cdb[TCM_MAX_COMMAND_SIZE]; + /* Offset from start of page */ + u32 rd_offset; + /* Starting page in Ramdisk for request */ + u32 rd_page; + /* Total number of pages needed for request */ + u32 rd_page_count; + /* Scatterlist count */ + u32 rd_size; + /* Ramdisk device */ + struct rd_dev *rd_dev; +} ____cacheline_aligned; + +struct rd_dev_sg_table { + u32 page_start_offset; + u32 page_end_offset; + u32 rd_sg_count; + struct scatterlist *sg_table; +} ____cacheline_aligned; + +#define RDF_HAS_PAGE_COUNT 0x01 + +struct rd_dev { + int rd_direct; + u32 rd_flags; + /* Unique Ramdisk Device ID in Ramdisk HBA */ + u32 rd_dev_id; + /* Total page count for ramdisk device */ + u32 rd_page_count; + /* Number of SG tables in sg_table_array */ + u32 sg_table_count; + u32 rd_queue_depth; + /* Array of rd_dev_sg_table_t containing scatterlists */ + struct rd_dev_sg_table *sg_table_array; + /* Ramdisk HBA device is connected to */ + struct rd_host *rd_host; +} ____cacheline_aligned; + +struct rd_host { + u32 rd_host_dev_id_count; + u32 rd_host_id; /* Unique Ramdisk Host ID */ +} ____cacheline_aligned; + +#endif /* TARGET_CORE_RD_H */ diff --git a/drivers/target/target_core_scdb.c b/drivers/target/target_core_scdb.c new file mode 100644 index 000000000000..dc6fed037ab3 --- /dev/null +++ b/drivers/target/target_core_scdb.c @@ -0,0 +1,105 @@ +/******************************************************************************* + * Filename: target_core_scdb.c + * + * This file contains the generic target engine Split CDB related functions. + * + * Copyright (c) 2004-2005 PyX Technologies, Inc. + * Copyright (c) 2005, 2006, 2007 SBE, Inc. + * Copyright (c) 2007-2010 Rising Tide Systems + * Copyright (c) 2008-2010 Linux-iSCSI.org + * + * Nicholas A. Bellinger + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. + * + ******************************************************************************/ + +#include +#include +#include +#include + +#include +#include + +#include "target_core_scdb.h" + +/* split_cdb_XX_6(): + * + * 21-bit LBA w/ 8-bit SECTORS + */ +void split_cdb_XX_6( + unsigned long long lba, + u32 *sectors, + unsigned char *cdb) +{ + cdb[1] = (lba >> 16) & 0x1f; + cdb[2] = (lba >> 8) & 0xff; + cdb[3] = lba & 0xff; + cdb[4] = *sectors & 0xff; +} + +/* split_cdb_XX_10(): + * + * 32-bit LBA w/ 16-bit SECTORS + */ +void split_cdb_XX_10( + unsigned long long lba, + u32 *sectors, + unsigned char *cdb) +{ + put_unaligned_be32(lba, &cdb[2]); + put_unaligned_be16(*sectors, &cdb[7]); +} + +/* split_cdb_XX_12(): + * + * 32-bit LBA w/ 32-bit SECTORS + */ +void split_cdb_XX_12( + unsigned long long lba, + u32 *sectors, + unsigned char *cdb) +{ + put_unaligned_be32(lba, &cdb[2]); + put_unaligned_be32(*sectors, &cdb[6]); +} + +/* split_cdb_XX_16(): + * + * 64-bit LBA w/ 32-bit SECTORS + */ +void split_cdb_XX_16( + unsigned long long lba, + u32 *sectors, + unsigned char *cdb) +{ + put_unaligned_be64(lba, &cdb[2]); + put_unaligned_be32(*sectors, &cdb[10]); +} + +/* + * split_cdb_XX_32(): + * + * 64-bit LBA w/ 32-bit SECTORS such as READ_32, WRITE_32 and emulated XDWRITEREAD_32 + */ +void split_cdb_XX_32( + unsigned long long lba, + u32 *sectors, + unsigned char *cdb) +{ + put_unaligned_be64(lba, &cdb[12]); + put_unaligned_be32(*sectors, &cdb[28]); +} diff --git a/drivers/target/target_core_scdb.h b/drivers/target/target_core_scdb.h new file mode 100644 index 000000000000..98cd1c01ed83 --- /dev/null +++ b/drivers/target/target_core_scdb.h @@ -0,0 +1,10 @@ +#ifndef TARGET_CORE_SCDB_H +#define TARGET_CORE_SCDB_H + +extern void split_cdb_XX_6(unsigned long long, u32 *, unsigned char *); +extern void split_cdb_XX_10(unsigned long long, u32 *, unsigned char *); +extern void split_cdb_XX_12(unsigned long long, u32 *, unsigned char *); +extern void split_cdb_XX_16(unsigned long long, u32 *, unsigned char *); +extern void split_cdb_XX_32(unsigned long long, u32 *, unsigned char *); + +#endif /* TARGET_CORE_SCDB_H */ diff --git a/drivers/target/target_core_tmr.c b/drivers/target/target_core_tmr.c new file mode 100644 index 000000000000..158cecbec718 --- /dev/null +++ b/drivers/target/target_core_tmr.c @@ -0,0 +1,404 @@ +/******************************************************************************* + * Filename: target_core_tmr.c + * + * This file contains SPC-3 task management infrastructure + * + * Copyright (c) 2009,2010 Rising Tide Systems + * Copyright (c) 2009,2010 Linux-iSCSI.org + * + * Nicholas A. Bellinger + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. + * + ******************************************************************************/ + +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include + +#include "target_core_alua.h" +#include "target_core_pr.h" + +#define DEBUG_LUN_RESET +#ifdef DEBUG_LUN_RESET +#define DEBUG_LR(x...) printk(KERN_INFO x) +#else +#define DEBUG_LR(x...) +#endif + +struct se_tmr_req *core_tmr_alloc_req( + struct se_cmd *se_cmd, + void *fabric_tmr_ptr, + u8 function) +{ + struct se_tmr_req *tmr; + + tmr = kmem_cache_zalloc(se_tmr_req_cache, GFP_KERNEL); + if (!(tmr)) { + printk(KERN_ERR "Unable to allocate struct se_tmr_req\n"); + return ERR_PTR(-ENOMEM); + } + tmr->task_cmd = se_cmd; + tmr->fabric_tmr_ptr = fabric_tmr_ptr; + tmr->function = function; + INIT_LIST_HEAD(&tmr->tmr_list); + + return tmr; +} +EXPORT_SYMBOL(core_tmr_alloc_req); + +void core_tmr_release_req( + struct se_tmr_req *tmr) +{ + struct se_device *dev = tmr->tmr_dev; + + spin_lock(&dev->se_tmr_lock); + list_del(&tmr->tmr_list); + kmem_cache_free(se_tmr_req_cache, tmr); + spin_unlock(&dev->se_tmr_lock); +} + +static void core_tmr_handle_tas_abort( + struct se_node_acl *tmr_nacl, + struct se_cmd *cmd, + int tas, + int fe_count) +{ + if (!(fe_count)) { + transport_cmd_finish_abort(cmd, 1); + return; + } + /* + * TASK ABORTED status (TAS) bit support + */ + if (((tmr_nacl != NULL) && + (tmr_nacl == cmd->se_sess->se_node_acl)) || tas) + transport_send_task_abort(cmd); + + transport_cmd_finish_abort(cmd, 0); +} + +int core_tmr_lun_reset( + struct se_device *dev, + struct se_tmr_req *tmr, + struct list_head *preempt_and_abort_list, + struct se_cmd *prout_cmd) +{ + struct se_cmd *cmd; + struct se_queue_req *qr, *qr_tmp; + struct se_node_acl *tmr_nacl = NULL; + struct se_portal_group *tmr_tpg = NULL; + struct se_queue_obj *qobj = dev->dev_queue_obj; + struct se_tmr_req *tmr_p, *tmr_pp; + struct se_task *task, *task_tmp; + unsigned long flags; + int fe_count, state, tas; + /* + * TASK_ABORTED status bit, this is configurable via ConfigFS + * struct se_device attributes. spc4r17 section 7.4.6 Control mode page + * + * A task aborted status (TAS) bit set to zero specifies that aborted + * tasks shall be terminated by the device server without any response + * to the application client. A TAS bit set to one specifies that tasks + * aborted by the actions of an I_T nexus other than the I_T nexus on + * which the command was received shall be completed with TASK ABORTED + * status (see SAM-4). + */ + tas = DEV_ATTRIB(dev)->emulate_tas; + /* + * Determine if this se_tmr is coming from a $FABRIC_MOD + * or struct se_device passthrough.. + */ + if (tmr && tmr->task_cmd && tmr->task_cmd->se_sess) { + tmr_nacl = tmr->task_cmd->se_sess->se_node_acl; + tmr_tpg = tmr->task_cmd->se_sess->se_tpg; + if (tmr_nacl && tmr_tpg) { + DEBUG_LR("LUN_RESET: TMR caller fabric: %s" + " initiator port %s\n", + TPG_TFO(tmr_tpg)->get_fabric_name(), + tmr_nacl->initiatorname); + } + } + DEBUG_LR("LUN_RESET: %s starting for [%s], tas: %d\n", + (preempt_and_abort_list) ? "Preempt" : "TMR", + TRANSPORT(dev)->name, tas); + /* + * Release all pending and outgoing TMRs aside from the received + * LUN_RESET tmr.. + */ + spin_lock(&dev->se_tmr_lock); + list_for_each_entry_safe(tmr_p, tmr_pp, &dev->dev_tmr_list, tmr_list) { + /* + * Allow the received TMR to return with FUNCTION_COMPLETE. + */ + if (tmr && (tmr_p == tmr)) + continue; + + cmd = tmr_p->task_cmd; + if (!(cmd)) { + printk(KERN_ERR "Unable to locate struct se_cmd for TMR\n"); + continue; + } + /* + * If this function was called with a valid pr_res_key + * parameter (eg: for PROUT PREEMPT_AND_ABORT service action + * skip non regisration key matching TMRs. + */ + if ((preempt_and_abort_list != NULL) && + (core_scsi3_check_cdb_abort_and_preempt( + preempt_and_abort_list, cmd) != 0)) + continue; + spin_unlock(&dev->se_tmr_lock); + + spin_lock_irqsave(&T_TASK(cmd)->t_state_lock, flags); + if (!(atomic_read(&T_TASK(cmd)->t_transport_active))) { + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, flags); + spin_lock(&dev->se_tmr_lock); + continue; + } + if (cmd->t_state == TRANSPORT_ISTATE_PROCESSING) { + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, flags); + spin_lock(&dev->se_tmr_lock); + continue; + } + DEBUG_LR("LUN_RESET: %s releasing TMR %p Function: 0x%02x," + " Response: 0x%02x, t_state: %d\n", + (preempt_and_abort_list) ? "Preempt" : "", tmr_p, + tmr_p->function, tmr_p->response, cmd->t_state); + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, flags); + + transport_cmd_finish_abort_tmr(cmd); + spin_lock(&dev->se_tmr_lock); + } + spin_unlock(&dev->se_tmr_lock); + /* + * Complete outstanding struct se_task CDBs with TASK_ABORTED SAM status. + * This is following sam4r17, section 5.6 Aborting commands, Table 38 + * for TMR LUN_RESET: + * + * a) "Yes" indicates that each command that is aborted on an I_T nexus + * other than the one that caused the SCSI device condition is + * completed with TASK ABORTED status, if the TAS bit is set to one in + * the Control mode page (see SPC-4). "No" indicates that no status is + * returned for aborted commands. + * + * d) If the logical unit reset is caused by a particular I_T nexus + * (e.g., by a LOGICAL UNIT RESET task management function), then "yes" + * (TASK_ABORTED status) applies. + * + * Otherwise (e.g., if triggered by a hard reset), "no" + * (no TASK_ABORTED SAM status) applies. + * + * Note that this seems to be independent of TAS (Task Aborted Status) + * in the Control Mode Page. + */ + spin_lock_irqsave(&dev->execute_task_lock, flags); + list_for_each_entry_safe(task, task_tmp, &dev->state_task_list, + t_state_list) { + if (!(TASK_CMD(task))) { + printk(KERN_ERR "TASK_CMD(task) is NULL!\n"); + continue; + } + cmd = TASK_CMD(task); + + if (!T_TASK(cmd)) { + printk(KERN_ERR "T_TASK(cmd) is NULL for task: %p cmd:" + " %p ITT: 0x%08x\n", task, cmd, + CMD_TFO(cmd)->get_task_tag(cmd)); + continue; + } + /* + * For PREEMPT_AND_ABORT usage, only process commands + * with a matching reservation key. + */ + if ((preempt_and_abort_list != NULL) && + (core_scsi3_check_cdb_abort_and_preempt( + preempt_and_abort_list, cmd) != 0)) + continue; + /* + * Not aborting PROUT PREEMPT_AND_ABORT CDB.. + */ + if (prout_cmd == cmd) + continue; + + list_del(&task->t_state_list); + atomic_set(&task->task_state_active, 0); + spin_unlock_irqrestore(&dev->execute_task_lock, flags); + + spin_lock_irqsave(&T_TASK(cmd)->t_state_lock, flags); + DEBUG_LR("LUN_RESET: %s cmd: %p task: %p" + " ITT/CmdSN: 0x%08x/0x%08x, i_state: %d, t_state/" + "def_t_state: %d/%d cdb: 0x%02x\n", + (preempt_and_abort_list) ? "Preempt" : "", cmd, task, + CMD_TFO(cmd)->get_task_tag(cmd), 0, + CMD_TFO(cmd)->get_cmd_state(cmd), cmd->t_state, + cmd->deferred_t_state, T_TASK(cmd)->t_task_cdb[0]); + DEBUG_LR("LUN_RESET: ITT[0x%08x] - pr_res_key: 0x%016Lx" + " t_task_cdbs: %d t_task_cdbs_left: %d" + " t_task_cdbs_sent: %d -- t_transport_active: %d" + " t_transport_stop: %d t_transport_sent: %d\n", + CMD_TFO(cmd)->get_task_tag(cmd), cmd->pr_res_key, + T_TASK(cmd)->t_task_cdbs, + atomic_read(&T_TASK(cmd)->t_task_cdbs_left), + atomic_read(&T_TASK(cmd)->t_task_cdbs_sent), + atomic_read(&T_TASK(cmd)->t_transport_active), + atomic_read(&T_TASK(cmd)->t_transport_stop), + atomic_read(&T_TASK(cmd)->t_transport_sent)); + + if (atomic_read(&task->task_active)) { + atomic_set(&task->task_stop, 1); + spin_unlock_irqrestore( + &T_TASK(cmd)->t_state_lock, flags); + + DEBUG_LR("LUN_RESET: Waiting for task: %p to shutdown" + " for dev: %p\n", task, dev); + wait_for_completion(&task->task_stop_comp); + DEBUG_LR("LUN_RESET Completed task: %p shutdown for" + " dev: %p\n", task, dev); + spin_lock_irqsave(&T_TASK(cmd)->t_state_lock, flags); + atomic_dec(&T_TASK(cmd)->t_task_cdbs_left); + + atomic_set(&task->task_active, 0); + atomic_set(&task->task_stop, 0); + } + __transport_stop_task_timer(task, &flags); + + if (!(atomic_dec_and_test(&T_TASK(cmd)->t_task_cdbs_ex_left))) { + spin_unlock_irqrestore( + &T_TASK(cmd)->t_state_lock, flags); + DEBUG_LR("LUN_RESET: Skipping task: %p, dev: %p for" + " t_task_cdbs_ex_left: %d\n", task, dev, + atomic_read(&T_TASK(cmd)->t_task_cdbs_ex_left)); + + spin_lock_irqsave(&dev->execute_task_lock, flags); + continue; + } + fe_count = atomic_read(&T_TASK(cmd)->t_fe_count); + + if (atomic_read(&T_TASK(cmd)->t_transport_active)) { + DEBUG_LR("LUN_RESET: got t_transport_active = 1 for" + " task: %p, t_fe_count: %d dev: %p\n", task, + fe_count, dev); + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, + flags); + core_tmr_handle_tas_abort(tmr_nacl, cmd, tas, fe_count); + + spin_lock_irqsave(&dev->execute_task_lock, flags); + continue; + } + DEBUG_LR("LUN_RESET: Got t_transport_active = 0 for task: %p," + " t_fe_count: %d dev: %p\n", task, fe_count, dev); + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, flags); + core_tmr_handle_tas_abort(tmr_nacl, cmd, tas, fe_count); + + spin_lock_irqsave(&dev->execute_task_lock, flags); + } + spin_unlock_irqrestore(&dev->execute_task_lock, flags); + /* + * Release all commands remaining in the struct se_device cmd queue. + * + * This follows the same logic as above for the struct se_device + * struct se_task state list, where commands are returned with + * TASK_ABORTED status, if there is an outstanding $FABRIC_MOD + * reference, otherwise the struct se_cmd is released. + */ + spin_lock_irqsave(&qobj->cmd_queue_lock, flags); + list_for_each_entry_safe(qr, qr_tmp, &qobj->qobj_list, qr_list) { + cmd = (struct se_cmd *)qr->cmd; + if (!(cmd)) { + /* + * Skip these for non PREEMPT_AND_ABORT usage.. + */ + if (preempt_and_abort_list != NULL) + continue; + + atomic_dec(&qobj->queue_cnt); + list_del(&qr->qr_list); + kfree(qr); + continue; + } + /* + * For PREEMPT_AND_ABORT usage, only process commands + * with a matching reservation key. + */ + if ((preempt_and_abort_list != NULL) && + (core_scsi3_check_cdb_abort_and_preempt( + preempt_and_abort_list, cmd) != 0)) + continue; + /* + * Not aborting PROUT PREEMPT_AND_ABORT CDB.. + */ + if (prout_cmd == cmd) + continue; + + atomic_dec(&T_TASK(cmd)->t_transport_queue_active); + atomic_dec(&qobj->queue_cnt); + list_del(&qr->qr_list); + spin_unlock_irqrestore(&qobj->cmd_queue_lock, flags); + + state = qr->state; + kfree(qr); + + DEBUG_LR("LUN_RESET: %s from Device Queue: cmd: %p t_state:" + " %d t_fe_count: %d\n", (preempt_and_abort_list) ? + "Preempt" : "", cmd, state, + atomic_read(&T_TASK(cmd)->t_fe_count)); + /* + * Signal that the command has failed via cmd->se_cmd_flags, + * and call TFO->new_cmd_failure() to wakeup any fabric + * dependent code used to wait for unsolicited data out + * allocation to complete. The fabric module is expected + * to dump any remaining unsolicited data out for the aborted + * command at this point. + */ + transport_new_cmd_failure(cmd); + + core_tmr_handle_tas_abort(tmr_nacl, cmd, tas, + atomic_read(&T_TASK(cmd)->t_fe_count)); + spin_lock_irqsave(&qobj->cmd_queue_lock, flags); + } + spin_unlock_irqrestore(&qobj->cmd_queue_lock, flags); + /* + * Clear any legacy SPC-2 reservation when called during + * LOGICAL UNIT RESET + */ + if (!(preempt_and_abort_list) && + (dev->dev_flags & DF_SPC2_RESERVATIONS)) { + spin_lock(&dev->dev_reservation_lock); + dev->dev_reserved_node_acl = NULL; + dev->dev_flags &= ~DF_SPC2_RESERVATIONS; + spin_unlock(&dev->dev_reservation_lock); + printk(KERN_INFO "LUN_RESET: SCSI-2 Released reservation\n"); + } + + spin_lock(&dev->stats_lock); + dev->num_resets++; + spin_unlock(&dev->stats_lock); + + DEBUG_LR("LUN_RESET: %s for [%s] Complete\n", + (preempt_and_abort_list) ? "Preempt" : "TMR", + TRANSPORT(dev)->name); + return 0; +} diff --git a/drivers/target/target_core_tpg.c b/drivers/target/target_core_tpg.c new file mode 100644 index 000000000000..abfa81a57115 --- /dev/null +++ b/drivers/target/target_core_tpg.c @@ -0,0 +1,826 @@ +/******************************************************************************* + * Filename: target_core_tpg.c + * + * This file contains generic Target Portal Group related functions. + * + * Copyright (c) 2002, 2003, 2004, 2005 PyX Technologies, Inc. + * Copyright (c) 2005, 2006, 2007 SBE, Inc. + * Copyright (c) 2007-2010 Rising Tide Systems + * Copyright (c) 2008-2010 Linux-iSCSI.org + * + * Nicholas A. Bellinger + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. + * + ******************************************************************************/ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include + +#include "target_core_hba.h" + +/* core_clear_initiator_node_from_tpg(): + * + * + */ +static void core_clear_initiator_node_from_tpg( + struct se_node_acl *nacl, + struct se_portal_group *tpg) +{ + int i; + struct se_dev_entry *deve; + struct se_lun *lun; + struct se_lun_acl *acl, *acl_tmp; + + spin_lock_irq(&nacl->device_list_lock); + for (i = 0; i < TRANSPORT_MAX_LUNS_PER_TPG; i++) { + deve = &nacl->device_list[i]; + + if (!(deve->lun_flags & TRANSPORT_LUNFLAGS_INITIATOR_ACCESS)) + continue; + + if (!deve->se_lun) { + printk(KERN_ERR "%s device entries device pointer is" + " NULL, but Initiator has access.\n", + TPG_TFO(tpg)->get_fabric_name()); + continue; + } + + lun = deve->se_lun; + spin_unlock_irq(&nacl->device_list_lock); + core_update_device_list_for_node(lun, NULL, deve->mapped_lun, + TRANSPORT_LUNFLAGS_NO_ACCESS, nacl, tpg, 0); + + spin_lock(&lun->lun_acl_lock); + list_for_each_entry_safe(acl, acl_tmp, + &lun->lun_acl_list, lacl_list) { + if (!(strcmp(acl->initiatorname, + nacl->initiatorname)) && + (acl->mapped_lun == deve->mapped_lun)) + break; + } + + if (!acl) { + printk(KERN_ERR "Unable to locate struct se_lun_acl for %s," + " mapped_lun: %u\n", nacl->initiatorname, + deve->mapped_lun); + spin_unlock(&lun->lun_acl_lock); + spin_lock_irq(&nacl->device_list_lock); + continue; + } + + list_del(&acl->lacl_list); + spin_unlock(&lun->lun_acl_lock); + + spin_lock_irq(&nacl->device_list_lock); + kfree(acl); + } + spin_unlock_irq(&nacl->device_list_lock); +} + +/* __core_tpg_get_initiator_node_acl(): + * + * spin_lock_bh(&tpg->acl_node_lock); must be held when calling + */ +struct se_node_acl *__core_tpg_get_initiator_node_acl( + struct se_portal_group *tpg, + const char *initiatorname) +{ + struct se_node_acl *acl; + + list_for_each_entry(acl, &tpg->acl_node_list, acl_list) { + if (!(strcmp(acl->initiatorname, initiatorname))) + return acl; + } + + return NULL; +} + +/* core_tpg_get_initiator_node_acl(): + * + * + */ +struct se_node_acl *core_tpg_get_initiator_node_acl( + struct se_portal_group *tpg, + unsigned char *initiatorname) +{ + struct se_node_acl *acl; + + spin_lock_bh(&tpg->acl_node_lock); + list_for_each_entry(acl, &tpg->acl_node_list, acl_list) { + if (!(strcmp(acl->initiatorname, initiatorname)) && + (!(acl->dynamic_node_acl))) { + spin_unlock_bh(&tpg->acl_node_lock); + return acl; + } + } + spin_unlock_bh(&tpg->acl_node_lock); + + return NULL; +} + +/* core_tpg_add_node_to_devs(): + * + * + */ +void core_tpg_add_node_to_devs( + struct se_node_acl *acl, + struct se_portal_group *tpg) +{ + int i = 0; + u32 lun_access = 0; + struct se_lun *lun; + struct se_device *dev; + + spin_lock(&tpg->tpg_lun_lock); + for (i = 0; i < TRANSPORT_MAX_LUNS_PER_TPG; i++) { + lun = &tpg->tpg_lun_list[i]; + if (lun->lun_status != TRANSPORT_LUN_STATUS_ACTIVE) + continue; + + spin_unlock(&tpg->tpg_lun_lock); + + dev = lun->lun_se_dev; + /* + * By default in LIO-Target $FABRIC_MOD, + * demo_mode_write_protect is ON, or READ_ONLY; + */ + if (!(TPG_TFO(tpg)->tpg_check_demo_mode_write_protect(tpg))) { + if (dev->dev_flags & DF_READ_ONLY) + lun_access = TRANSPORT_LUNFLAGS_READ_ONLY; + else + lun_access = TRANSPORT_LUNFLAGS_READ_WRITE; + } else { + /* + * Allow only optical drives to issue R/W in default RO + * demo mode. + */ + if (TRANSPORT(dev)->get_device_type(dev) == TYPE_DISK) + lun_access = TRANSPORT_LUNFLAGS_READ_ONLY; + else + lun_access = TRANSPORT_LUNFLAGS_READ_WRITE; + } + + printk(KERN_INFO "TARGET_CORE[%s]->TPG[%u]_LUN[%u] - Adding %s" + " access for LUN in Demo Mode\n", + TPG_TFO(tpg)->get_fabric_name(), + TPG_TFO(tpg)->tpg_get_tag(tpg), lun->unpacked_lun, + (lun_access == TRANSPORT_LUNFLAGS_READ_WRITE) ? + "READ-WRITE" : "READ-ONLY"); + + core_update_device_list_for_node(lun, NULL, lun->unpacked_lun, + lun_access, acl, tpg, 1); + spin_lock(&tpg->tpg_lun_lock); + } + spin_unlock(&tpg->tpg_lun_lock); +} + +/* core_set_queue_depth_for_node(): + * + * + */ +static int core_set_queue_depth_for_node( + struct se_portal_group *tpg, + struct se_node_acl *acl) +{ + if (!acl->queue_depth) { + printk(KERN_ERR "Queue depth for %s Initiator Node: %s is 0," + "defaulting to 1.\n", TPG_TFO(tpg)->get_fabric_name(), + acl->initiatorname); + acl->queue_depth = 1; + } + + return 0; +} + +/* core_create_device_list_for_node(): + * + * + */ +static int core_create_device_list_for_node(struct se_node_acl *nacl) +{ + struct se_dev_entry *deve; + int i; + + nacl->device_list = kzalloc(sizeof(struct se_dev_entry) * + TRANSPORT_MAX_LUNS_PER_TPG, GFP_KERNEL); + if (!(nacl->device_list)) { + printk(KERN_ERR "Unable to allocate memory for" + " struct se_node_acl->device_list\n"); + return -1; + } + for (i = 0; i < TRANSPORT_MAX_LUNS_PER_TPG; i++) { + deve = &nacl->device_list[i]; + + atomic_set(&deve->ua_count, 0); + atomic_set(&deve->pr_ref_count, 0); + spin_lock_init(&deve->ua_lock); + INIT_LIST_HEAD(&deve->alua_port_list); + INIT_LIST_HEAD(&deve->ua_list); + } + + return 0; +} + +/* core_tpg_check_initiator_node_acl() + * + * + */ +struct se_node_acl *core_tpg_check_initiator_node_acl( + struct se_portal_group *tpg, + unsigned char *initiatorname) +{ + struct se_node_acl *acl; + + acl = core_tpg_get_initiator_node_acl(tpg, initiatorname); + if ((acl)) + return acl; + + if (!(TPG_TFO(tpg)->tpg_check_demo_mode(tpg))) + return NULL; + + acl = TPG_TFO(tpg)->tpg_alloc_fabric_acl(tpg); + if (!(acl)) + return NULL; + + INIT_LIST_HEAD(&acl->acl_list); + INIT_LIST_HEAD(&acl->acl_sess_list); + spin_lock_init(&acl->device_list_lock); + spin_lock_init(&acl->nacl_sess_lock); + atomic_set(&acl->acl_pr_ref_count, 0); + atomic_set(&acl->mib_ref_count, 0); + acl->queue_depth = TPG_TFO(tpg)->tpg_get_default_depth(tpg); + snprintf(acl->initiatorname, TRANSPORT_IQN_LEN, "%s", initiatorname); + acl->se_tpg = tpg; + acl->acl_index = scsi_get_new_index(SCSI_AUTH_INTR_INDEX); + spin_lock_init(&acl->stats_lock); + acl->dynamic_node_acl = 1; + + TPG_TFO(tpg)->set_default_node_attributes(acl); + + if (core_create_device_list_for_node(acl) < 0) { + TPG_TFO(tpg)->tpg_release_fabric_acl(tpg, acl); + return NULL; + } + + if (core_set_queue_depth_for_node(tpg, acl) < 0) { + core_free_device_list_for_node(acl, tpg); + TPG_TFO(tpg)->tpg_release_fabric_acl(tpg, acl); + return NULL; + } + + core_tpg_add_node_to_devs(acl, tpg); + + spin_lock_bh(&tpg->acl_node_lock); + list_add_tail(&acl->acl_list, &tpg->acl_node_list); + tpg->num_node_acls++; + spin_unlock_bh(&tpg->acl_node_lock); + + printk("%s_TPG[%u] - Added DYNAMIC ACL with TCQ Depth: %d for %s" + " Initiator Node: %s\n", TPG_TFO(tpg)->get_fabric_name(), + TPG_TFO(tpg)->tpg_get_tag(tpg), acl->queue_depth, + TPG_TFO(tpg)->get_fabric_name(), initiatorname); + + return acl; +} +EXPORT_SYMBOL(core_tpg_check_initiator_node_acl); + +void core_tpg_wait_for_nacl_pr_ref(struct se_node_acl *nacl) +{ + while (atomic_read(&nacl->acl_pr_ref_count) != 0) + cpu_relax(); +} + +void core_tpg_wait_for_mib_ref(struct se_node_acl *nacl) +{ + while (atomic_read(&nacl->mib_ref_count) != 0) + cpu_relax(); +} + +void core_tpg_clear_object_luns(struct se_portal_group *tpg) +{ + int i, ret; + struct se_lun *lun; + + spin_lock(&tpg->tpg_lun_lock); + for (i = 0; i < TRANSPORT_MAX_LUNS_PER_TPG; i++) { + lun = &tpg->tpg_lun_list[i]; + + if ((lun->lun_status != TRANSPORT_LUN_STATUS_ACTIVE) || + (lun->lun_se_dev == NULL)) + continue; + + spin_unlock(&tpg->tpg_lun_lock); + ret = core_dev_del_lun(tpg, lun->unpacked_lun); + spin_lock(&tpg->tpg_lun_lock); + } + spin_unlock(&tpg->tpg_lun_lock); +} +EXPORT_SYMBOL(core_tpg_clear_object_luns); + +/* core_tpg_add_initiator_node_acl(): + * + * + */ +struct se_node_acl *core_tpg_add_initiator_node_acl( + struct se_portal_group *tpg, + struct se_node_acl *se_nacl, + const char *initiatorname, + u32 queue_depth) +{ + struct se_node_acl *acl = NULL; + + spin_lock_bh(&tpg->acl_node_lock); + acl = __core_tpg_get_initiator_node_acl(tpg, initiatorname); + if ((acl)) { + if (acl->dynamic_node_acl) { + acl->dynamic_node_acl = 0; + printk(KERN_INFO "%s_TPG[%u] - Replacing dynamic ACL" + " for %s\n", TPG_TFO(tpg)->get_fabric_name(), + TPG_TFO(tpg)->tpg_get_tag(tpg), initiatorname); + spin_unlock_bh(&tpg->acl_node_lock); + /* + * Release the locally allocated struct se_node_acl + * because * core_tpg_add_initiator_node_acl() returned + * a pointer to an existing demo mode node ACL. + */ + if (se_nacl) + TPG_TFO(tpg)->tpg_release_fabric_acl(tpg, + se_nacl); + goto done; + } + + printk(KERN_ERR "ACL entry for %s Initiator" + " Node %s already exists for TPG %u, ignoring" + " request.\n", TPG_TFO(tpg)->get_fabric_name(), + initiatorname, TPG_TFO(tpg)->tpg_get_tag(tpg)); + spin_unlock_bh(&tpg->acl_node_lock); + return ERR_PTR(-EEXIST); + } + spin_unlock_bh(&tpg->acl_node_lock); + + if (!(se_nacl)) { + printk("struct se_node_acl pointer is NULL\n"); + return ERR_PTR(-EINVAL); + } + /* + * For v4.x logic the se_node_acl_s is hanging off a fabric + * dependent structure allocated via + * struct target_core_fabric_ops->fabric_make_nodeacl() + */ + acl = se_nacl; + + INIT_LIST_HEAD(&acl->acl_list); + INIT_LIST_HEAD(&acl->acl_sess_list); + spin_lock_init(&acl->device_list_lock); + spin_lock_init(&acl->nacl_sess_lock); + atomic_set(&acl->acl_pr_ref_count, 0); + acl->queue_depth = queue_depth; + snprintf(acl->initiatorname, TRANSPORT_IQN_LEN, "%s", initiatorname); + acl->se_tpg = tpg; + acl->acl_index = scsi_get_new_index(SCSI_AUTH_INTR_INDEX); + spin_lock_init(&acl->stats_lock); + + TPG_TFO(tpg)->set_default_node_attributes(acl); + + if (core_create_device_list_for_node(acl) < 0) { + TPG_TFO(tpg)->tpg_release_fabric_acl(tpg, acl); + return ERR_PTR(-ENOMEM); + } + + if (core_set_queue_depth_for_node(tpg, acl) < 0) { + core_free_device_list_for_node(acl, tpg); + TPG_TFO(tpg)->tpg_release_fabric_acl(tpg, acl); + return ERR_PTR(-EINVAL); + } + + spin_lock_bh(&tpg->acl_node_lock); + list_add_tail(&acl->acl_list, &tpg->acl_node_list); + tpg->num_node_acls++; + spin_unlock_bh(&tpg->acl_node_lock); + +done: + printk(KERN_INFO "%s_TPG[%hu] - Added ACL with TCQ Depth: %d for %s" + " Initiator Node: %s\n", TPG_TFO(tpg)->get_fabric_name(), + TPG_TFO(tpg)->tpg_get_tag(tpg), acl->queue_depth, + TPG_TFO(tpg)->get_fabric_name(), initiatorname); + + return acl; +} +EXPORT_SYMBOL(core_tpg_add_initiator_node_acl); + +/* core_tpg_del_initiator_node_acl(): + * + * + */ +int core_tpg_del_initiator_node_acl( + struct se_portal_group *tpg, + struct se_node_acl *acl, + int force) +{ + struct se_session *sess, *sess_tmp; + int dynamic_acl = 0; + + spin_lock_bh(&tpg->acl_node_lock); + if (acl->dynamic_node_acl) { + acl->dynamic_node_acl = 0; + dynamic_acl = 1; + } + list_del(&acl->acl_list); + tpg->num_node_acls--; + spin_unlock_bh(&tpg->acl_node_lock); + + spin_lock_bh(&tpg->session_lock); + list_for_each_entry_safe(sess, sess_tmp, + &tpg->tpg_sess_list, sess_list) { + if (sess->se_node_acl != acl) + continue; + /* + * Determine if the session needs to be closed by our context. + */ + if (!(TPG_TFO(tpg)->shutdown_session(sess))) + continue; + + spin_unlock_bh(&tpg->session_lock); + /* + * If the $FABRIC_MOD session for the Initiator Node ACL exists, + * forcefully shutdown the $FABRIC_MOD session/nexus. + */ + TPG_TFO(tpg)->close_session(sess); + + spin_lock_bh(&tpg->session_lock); + } + spin_unlock_bh(&tpg->session_lock); + + core_tpg_wait_for_nacl_pr_ref(acl); + core_tpg_wait_for_mib_ref(acl); + core_clear_initiator_node_from_tpg(acl, tpg); + core_free_device_list_for_node(acl, tpg); + + printk(KERN_INFO "%s_TPG[%hu] - Deleted ACL with TCQ Depth: %d for %s" + " Initiator Node: %s\n", TPG_TFO(tpg)->get_fabric_name(), + TPG_TFO(tpg)->tpg_get_tag(tpg), acl->queue_depth, + TPG_TFO(tpg)->get_fabric_name(), acl->initiatorname); + + return 0; +} +EXPORT_SYMBOL(core_tpg_del_initiator_node_acl); + +/* core_tpg_set_initiator_node_queue_depth(): + * + * + */ +int core_tpg_set_initiator_node_queue_depth( + struct se_portal_group *tpg, + unsigned char *initiatorname, + u32 queue_depth, + int force) +{ + struct se_session *sess, *init_sess = NULL; + struct se_node_acl *acl; + int dynamic_acl = 0; + + spin_lock_bh(&tpg->acl_node_lock); + acl = __core_tpg_get_initiator_node_acl(tpg, initiatorname); + if (!(acl)) { + printk(KERN_ERR "Access Control List entry for %s Initiator" + " Node %s does not exists for TPG %hu, ignoring" + " request.\n", TPG_TFO(tpg)->get_fabric_name(), + initiatorname, TPG_TFO(tpg)->tpg_get_tag(tpg)); + spin_unlock_bh(&tpg->acl_node_lock); + return -ENODEV; + } + if (acl->dynamic_node_acl) { + acl->dynamic_node_acl = 0; + dynamic_acl = 1; + } + spin_unlock_bh(&tpg->acl_node_lock); + + spin_lock_bh(&tpg->session_lock); + list_for_each_entry(sess, &tpg->tpg_sess_list, sess_list) { + if (sess->se_node_acl != acl) + continue; + + if (!force) { + printk(KERN_ERR "Unable to change queue depth for %s" + " Initiator Node: %s while session is" + " operational. To forcefully change the queue" + " depth and force session reinstatement" + " use the \"force=1\" parameter.\n", + TPG_TFO(tpg)->get_fabric_name(), initiatorname); + spin_unlock_bh(&tpg->session_lock); + + spin_lock_bh(&tpg->acl_node_lock); + if (dynamic_acl) + acl->dynamic_node_acl = 1; + spin_unlock_bh(&tpg->acl_node_lock); + return -EEXIST; + } + /* + * Determine if the session needs to be closed by our context. + */ + if (!(TPG_TFO(tpg)->shutdown_session(sess))) + continue; + + init_sess = sess; + break; + } + + /* + * User has requested to change the queue depth for a Initiator Node. + * Change the value in the Node's struct se_node_acl, and call + * core_set_queue_depth_for_node() to add the requested queue depth. + * + * Finally call TPG_TFO(tpg)->close_session() to force session + * reinstatement to occur if there is an active session for the + * $FABRIC_MOD Initiator Node in question. + */ + acl->queue_depth = queue_depth; + + if (core_set_queue_depth_for_node(tpg, acl) < 0) { + spin_unlock_bh(&tpg->session_lock); + /* + * Force session reinstatement if + * core_set_queue_depth_for_node() failed, because we assume + * the $FABRIC_MOD has already the set session reinstatement + * bit from TPG_TFO(tpg)->shutdown_session() called above. + */ + if (init_sess) + TPG_TFO(tpg)->close_session(init_sess); + + spin_lock_bh(&tpg->acl_node_lock); + if (dynamic_acl) + acl->dynamic_node_acl = 1; + spin_unlock_bh(&tpg->acl_node_lock); + return -EINVAL; + } + spin_unlock_bh(&tpg->session_lock); + /* + * If the $FABRIC_MOD session for the Initiator Node ACL exists, + * forcefully shutdown the $FABRIC_MOD session/nexus. + */ + if (init_sess) + TPG_TFO(tpg)->close_session(init_sess); + + printk(KERN_INFO "Successfuly changed queue depth to: %d for Initiator" + " Node: %s on %s Target Portal Group: %u\n", queue_depth, + initiatorname, TPG_TFO(tpg)->get_fabric_name(), + TPG_TFO(tpg)->tpg_get_tag(tpg)); + + spin_lock_bh(&tpg->acl_node_lock); + if (dynamic_acl) + acl->dynamic_node_acl = 1; + spin_unlock_bh(&tpg->acl_node_lock); + + return 0; +} +EXPORT_SYMBOL(core_tpg_set_initiator_node_queue_depth); + +static int core_tpg_setup_virtual_lun0(struct se_portal_group *se_tpg) +{ + /* Set in core_dev_setup_virtual_lun0() */ + struct se_device *dev = se_global->g_lun0_dev; + struct se_lun *lun = &se_tpg->tpg_virt_lun0; + u32 lun_access = TRANSPORT_LUNFLAGS_READ_ONLY; + int ret; + + lun->unpacked_lun = 0; + lun->lun_status = TRANSPORT_LUN_STATUS_FREE; + atomic_set(&lun->lun_acl_count, 0); + init_completion(&lun->lun_shutdown_comp); + INIT_LIST_HEAD(&lun->lun_acl_list); + INIT_LIST_HEAD(&lun->lun_cmd_list); + spin_lock_init(&lun->lun_acl_lock); + spin_lock_init(&lun->lun_cmd_lock); + spin_lock_init(&lun->lun_sep_lock); + + ret = core_tpg_post_addlun(se_tpg, lun, lun_access, dev); + if (ret < 0) + return -1; + + return 0; +} + +static void core_tpg_release_virtual_lun0(struct se_portal_group *se_tpg) +{ + struct se_lun *lun = &se_tpg->tpg_virt_lun0; + + core_tpg_post_dellun(se_tpg, lun); +} + +int core_tpg_register( + struct target_core_fabric_ops *tfo, + struct se_wwn *se_wwn, + struct se_portal_group *se_tpg, + void *tpg_fabric_ptr, + int se_tpg_type) +{ + struct se_lun *lun; + u32 i; + + se_tpg->tpg_lun_list = kzalloc((sizeof(struct se_lun) * + TRANSPORT_MAX_LUNS_PER_TPG), GFP_KERNEL); + if (!(se_tpg->tpg_lun_list)) { + printk(KERN_ERR "Unable to allocate struct se_portal_group->" + "tpg_lun_list\n"); + return -ENOMEM; + } + + for (i = 0; i < TRANSPORT_MAX_LUNS_PER_TPG; i++) { + lun = &se_tpg->tpg_lun_list[i]; + lun->unpacked_lun = i; + lun->lun_status = TRANSPORT_LUN_STATUS_FREE; + atomic_set(&lun->lun_acl_count, 0); + init_completion(&lun->lun_shutdown_comp); + INIT_LIST_HEAD(&lun->lun_acl_list); + INIT_LIST_HEAD(&lun->lun_cmd_list); + spin_lock_init(&lun->lun_acl_lock); + spin_lock_init(&lun->lun_cmd_lock); + spin_lock_init(&lun->lun_sep_lock); + } + + se_tpg->se_tpg_type = se_tpg_type; + se_tpg->se_tpg_fabric_ptr = tpg_fabric_ptr; + se_tpg->se_tpg_tfo = tfo; + se_tpg->se_tpg_wwn = se_wwn; + atomic_set(&se_tpg->tpg_pr_ref_count, 0); + INIT_LIST_HEAD(&se_tpg->acl_node_list); + INIT_LIST_HEAD(&se_tpg->se_tpg_list); + INIT_LIST_HEAD(&se_tpg->tpg_sess_list); + spin_lock_init(&se_tpg->acl_node_lock); + spin_lock_init(&se_tpg->session_lock); + spin_lock_init(&se_tpg->tpg_lun_lock); + + if (se_tpg->se_tpg_type == TRANSPORT_TPG_TYPE_NORMAL) { + if (core_tpg_setup_virtual_lun0(se_tpg) < 0) { + kfree(se_tpg); + return -ENOMEM; + } + } + + spin_lock_bh(&se_global->se_tpg_lock); + list_add_tail(&se_tpg->se_tpg_list, &se_global->g_se_tpg_list); + spin_unlock_bh(&se_global->se_tpg_lock); + + printk(KERN_INFO "TARGET_CORE[%s]: Allocated %s struct se_portal_group for" + " endpoint: %s, Portal Tag: %u\n", tfo->get_fabric_name(), + (se_tpg->se_tpg_type == TRANSPORT_TPG_TYPE_NORMAL) ? + "Normal" : "Discovery", (tfo->tpg_get_wwn(se_tpg) == NULL) ? + "None" : tfo->tpg_get_wwn(se_tpg), tfo->tpg_get_tag(se_tpg)); + + return 0; +} +EXPORT_SYMBOL(core_tpg_register); + +int core_tpg_deregister(struct se_portal_group *se_tpg) +{ + printk(KERN_INFO "TARGET_CORE[%s]: Deallocating %s struct se_portal_group" + " for endpoint: %s Portal Tag %u\n", + (se_tpg->se_tpg_type == TRANSPORT_TPG_TYPE_NORMAL) ? + "Normal" : "Discovery", TPG_TFO(se_tpg)->get_fabric_name(), + TPG_TFO(se_tpg)->tpg_get_wwn(se_tpg), + TPG_TFO(se_tpg)->tpg_get_tag(se_tpg)); + + spin_lock_bh(&se_global->se_tpg_lock); + list_del(&se_tpg->se_tpg_list); + spin_unlock_bh(&se_global->se_tpg_lock); + + while (atomic_read(&se_tpg->tpg_pr_ref_count) != 0) + cpu_relax(); + + if (se_tpg->se_tpg_type == TRANSPORT_TPG_TYPE_NORMAL) + core_tpg_release_virtual_lun0(se_tpg); + + se_tpg->se_tpg_fabric_ptr = NULL; + kfree(se_tpg->tpg_lun_list); + return 0; +} +EXPORT_SYMBOL(core_tpg_deregister); + +struct se_lun *core_tpg_pre_addlun( + struct se_portal_group *tpg, + u32 unpacked_lun) +{ + struct se_lun *lun; + + if (unpacked_lun > (TRANSPORT_MAX_LUNS_PER_TPG-1)) { + printk(KERN_ERR "%s LUN: %u exceeds TRANSPORT_MAX_LUNS_PER_TPG" + "-1: %u for Target Portal Group: %u\n", + TPG_TFO(tpg)->get_fabric_name(), + unpacked_lun, TRANSPORT_MAX_LUNS_PER_TPG-1, + TPG_TFO(tpg)->tpg_get_tag(tpg)); + return ERR_PTR(-EOVERFLOW); + } + + spin_lock(&tpg->tpg_lun_lock); + lun = &tpg->tpg_lun_list[unpacked_lun]; + if (lun->lun_status == TRANSPORT_LUN_STATUS_ACTIVE) { + printk(KERN_ERR "TPG Logical Unit Number: %u is already active" + " on %s Target Portal Group: %u, ignoring request.\n", + unpacked_lun, TPG_TFO(tpg)->get_fabric_name(), + TPG_TFO(tpg)->tpg_get_tag(tpg)); + spin_unlock(&tpg->tpg_lun_lock); + return ERR_PTR(-EINVAL); + } + spin_unlock(&tpg->tpg_lun_lock); + + return lun; +} + +int core_tpg_post_addlun( + struct se_portal_group *tpg, + struct se_lun *lun, + u32 lun_access, + void *lun_ptr) +{ + if (core_dev_export(lun_ptr, tpg, lun) < 0) + return -1; + + spin_lock(&tpg->tpg_lun_lock); + lun->lun_access = lun_access; + lun->lun_status = TRANSPORT_LUN_STATUS_ACTIVE; + spin_unlock(&tpg->tpg_lun_lock); + + return 0; +} + +static void core_tpg_shutdown_lun( + struct se_portal_group *tpg, + struct se_lun *lun) +{ + core_clear_lun_from_tpg(lun, tpg); + transport_clear_lun_from_sessions(lun); +} + +struct se_lun *core_tpg_pre_dellun( + struct se_portal_group *tpg, + u32 unpacked_lun, + int *ret) +{ + struct se_lun *lun; + + if (unpacked_lun > (TRANSPORT_MAX_LUNS_PER_TPG-1)) { + printk(KERN_ERR "%s LUN: %u exceeds TRANSPORT_MAX_LUNS_PER_TPG" + "-1: %u for Target Portal Group: %u\n", + TPG_TFO(tpg)->get_fabric_name(), unpacked_lun, + TRANSPORT_MAX_LUNS_PER_TPG-1, + TPG_TFO(tpg)->tpg_get_tag(tpg)); + return ERR_PTR(-EOVERFLOW); + } + + spin_lock(&tpg->tpg_lun_lock); + lun = &tpg->tpg_lun_list[unpacked_lun]; + if (lun->lun_status != TRANSPORT_LUN_STATUS_ACTIVE) { + printk(KERN_ERR "%s Logical Unit Number: %u is not active on" + " Target Portal Group: %u, ignoring request.\n", + TPG_TFO(tpg)->get_fabric_name(), unpacked_lun, + TPG_TFO(tpg)->tpg_get_tag(tpg)); + spin_unlock(&tpg->tpg_lun_lock); + return ERR_PTR(-ENODEV); + } + spin_unlock(&tpg->tpg_lun_lock); + + return lun; +} + +int core_tpg_post_dellun( + struct se_portal_group *tpg, + struct se_lun *lun) +{ + core_tpg_shutdown_lun(tpg, lun); + + core_dev_unexport(lun->lun_se_dev, tpg, lun); + + spin_lock(&tpg->tpg_lun_lock); + lun->lun_status = TRANSPORT_LUN_STATUS_FREE; + spin_unlock(&tpg->tpg_lun_lock); + + return 0; +} diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c new file mode 100644 index 000000000000..28b6292ff298 --- /dev/null +++ b/drivers/target/target_core_transport.c @@ -0,0 +1,6134 @@ +/******************************************************************************* + * Filename: target_core_transport.c + * + * This file contains the Generic Target Engine Core. + * + * Copyright (c) 2002, 2003, 2004, 2005 PyX Technologies, Inc. + * Copyright (c) 2005, 2006, 2007 SBE, Inc. + * Copyright (c) 2007-2010 Rising Tide Systems + * Copyright (c) 2008-2010 Linux-iSCSI.org + * + * Nicholas A. Bellinger + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. + * + ******************************************************************************/ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include /* For TASK_ATTR_* */ + +#include +#include +#include +#include +#include +#include +#include + +#include "target_core_alua.h" +#include "target_core_hba.h" +#include "target_core_pr.h" +#include "target_core_scdb.h" +#include "target_core_ua.h" + +/* #define DEBUG_CDB_HANDLER */ +#ifdef DEBUG_CDB_HANDLER +#define DEBUG_CDB_H(x...) printk(KERN_INFO x) +#else +#define DEBUG_CDB_H(x...) +#endif + +/* #define DEBUG_CMD_MAP */ +#ifdef DEBUG_CMD_MAP +#define DEBUG_CMD_M(x...) printk(KERN_INFO x) +#else +#define DEBUG_CMD_M(x...) +#endif + +/* #define DEBUG_MEM_ALLOC */ +#ifdef DEBUG_MEM_ALLOC +#define DEBUG_MEM(x...) printk(KERN_INFO x) +#else +#define DEBUG_MEM(x...) +#endif + +/* #define DEBUG_MEM2_ALLOC */ +#ifdef DEBUG_MEM2_ALLOC +#define DEBUG_MEM2(x...) printk(KERN_INFO x) +#else +#define DEBUG_MEM2(x...) +#endif + +/* #define DEBUG_SG_CALC */ +#ifdef DEBUG_SG_CALC +#define DEBUG_SC(x...) printk(KERN_INFO x) +#else +#define DEBUG_SC(x...) +#endif + +/* #define DEBUG_SE_OBJ */ +#ifdef DEBUG_SE_OBJ +#define DEBUG_SO(x...) printk(KERN_INFO x) +#else +#define DEBUG_SO(x...) +#endif + +/* #define DEBUG_CMD_VOL */ +#ifdef DEBUG_CMD_VOL +#define DEBUG_VOL(x...) printk(KERN_INFO x) +#else +#define DEBUG_VOL(x...) +#endif + +/* #define DEBUG_CMD_STOP */ +#ifdef DEBUG_CMD_STOP +#define DEBUG_CS(x...) printk(KERN_INFO x) +#else +#define DEBUG_CS(x...) +#endif + +/* #define DEBUG_PASSTHROUGH */ +#ifdef DEBUG_PASSTHROUGH +#define DEBUG_PT(x...) printk(KERN_INFO x) +#else +#define DEBUG_PT(x...) +#endif + +/* #define DEBUG_TASK_STOP */ +#ifdef DEBUG_TASK_STOP +#define DEBUG_TS(x...) printk(KERN_INFO x) +#else +#define DEBUG_TS(x...) +#endif + +/* #define DEBUG_TRANSPORT_STOP */ +#ifdef DEBUG_TRANSPORT_STOP +#define DEBUG_TRANSPORT_S(x...) printk(KERN_INFO x) +#else +#define DEBUG_TRANSPORT_S(x...) +#endif + +/* #define DEBUG_TASK_FAILURE */ +#ifdef DEBUG_TASK_FAILURE +#define DEBUG_TF(x...) printk(KERN_INFO x) +#else +#define DEBUG_TF(x...) +#endif + +/* #define DEBUG_DEV_OFFLINE */ +#ifdef DEBUG_DEV_OFFLINE +#define DEBUG_DO(x...) printk(KERN_INFO x) +#else +#define DEBUG_DO(x...) +#endif + +/* #define DEBUG_TASK_STATE */ +#ifdef DEBUG_TASK_STATE +#define DEBUG_TSTATE(x...) printk(KERN_INFO x) +#else +#define DEBUG_TSTATE(x...) +#endif + +/* #define DEBUG_STATUS_THR */ +#ifdef DEBUG_STATUS_THR +#define DEBUG_ST(x...) printk(KERN_INFO x) +#else +#define DEBUG_ST(x...) +#endif + +/* #define DEBUG_TASK_TIMEOUT */ +#ifdef DEBUG_TASK_TIMEOUT +#define DEBUG_TT(x...) printk(KERN_INFO x) +#else +#define DEBUG_TT(x...) +#endif + +/* #define DEBUG_GENERIC_REQUEST_FAILURE */ +#ifdef DEBUG_GENERIC_REQUEST_FAILURE +#define DEBUG_GRF(x...) printk(KERN_INFO x) +#else +#define DEBUG_GRF(x...) +#endif + +/* #define DEBUG_SAM_TASK_ATTRS */ +#ifdef DEBUG_SAM_TASK_ATTRS +#define DEBUG_STA(x...) printk(KERN_INFO x) +#else +#define DEBUG_STA(x...) +#endif + +struct se_global *se_global; + +static struct kmem_cache *se_cmd_cache; +static struct kmem_cache *se_sess_cache; +struct kmem_cache *se_tmr_req_cache; +struct kmem_cache *se_ua_cache; +struct kmem_cache *se_mem_cache; +struct kmem_cache *t10_pr_reg_cache; +struct kmem_cache *t10_alua_lu_gp_cache; +struct kmem_cache *t10_alua_lu_gp_mem_cache; +struct kmem_cache *t10_alua_tg_pt_gp_cache; +struct kmem_cache *t10_alua_tg_pt_gp_mem_cache; + +/* Used for transport_dev_get_map_*() */ +typedef int (*map_func_t)(struct se_task *, u32); + +static int transport_generic_write_pending(struct se_cmd *); +static int transport_processing_thread(void *); +static int __transport_execute_tasks(struct se_device *dev); +static void transport_complete_task_attr(struct se_cmd *cmd); +static void transport_direct_request_timeout(struct se_cmd *cmd); +static void transport_free_dev_tasks(struct se_cmd *cmd); +static u32 transport_generic_get_cdb_count(struct se_cmd *cmd, + unsigned long long starting_lba, u32 sectors, + enum dma_data_direction data_direction, + struct list_head *mem_list, int set_counts); +static int transport_generic_get_mem(struct se_cmd *cmd, u32 length, + u32 dma_size); +static int transport_generic_remove(struct se_cmd *cmd, + int release_to_pool, int session_reinstatement); +static int transport_get_sectors(struct se_cmd *cmd); +static struct list_head *transport_init_se_mem_list(void); +static int transport_map_sg_to_mem(struct se_cmd *cmd, + struct list_head *se_mem_list, void *in_mem, + u32 *se_mem_cnt); +static void transport_memcpy_se_mem_read_contig(struct se_cmd *cmd, + unsigned char *dst, struct list_head *se_mem_list); +static void transport_release_fe_cmd(struct se_cmd *cmd); +static void transport_remove_cmd_from_queue(struct se_cmd *cmd, + struct se_queue_obj *qobj); +static int transport_set_sense_codes(struct se_cmd *cmd, u8 asc, u8 ascq); +static void transport_stop_all_task_timers(struct se_cmd *cmd); + +int transport_emulate_control_cdb(struct se_task *task); + +int init_se_global(void) +{ + struct se_global *global; + + global = kzalloc(sizeof(struct se_global), GFP_KERNEL); + if (!(global)) { + printk(KERN_ERR "Unable to allocate memory for struct se_global\n"); + return -1; + } + + INIT_LIST_HEAD(&global->g_lu_gps_list); + INIT_LIST_HEAD(&global->g_se_tpg_list); + INIT_LIST_HEAD(&global->g_hba_list); + INIT_LIST_HEAD(&global->g_se_dev_list); + spin_lock_init(&global->g_device_lock); + spin_lock_init(&global->hba_lock); + spin_lock_init(&global->se_tpg_lock); + spin_lock_init(&global->lu_gps_lock); + spin_lock_init(&global->plugin_class_lock); + + se_cmd_cache = kmem_cache_create("se_cmd_cache", + sizeof(struct se_cmd), __alignof__(struct se_cmd), 0, NULL); + if (!(se_cmd_cache)) { + printk(KERN_ERR "kmem_cache_create for struct se_cmd failed\n"); + goto out; + } + se_tmr_req_cache = kmem_cache_create("se_tmr_cache", + sizeof(struct se_tmr_req), __alignof__(struct se_tmr_req), + 0, NULL); + if (!(se_tmr_req_cache)) { + printk(KERN_ERR "kmem_cache_create() for struct se_tmr_req" + " failed\n"); + goto out; + } + se_sess_cache = kmem_cache_create("se_sess_cache", + sizeof(struct se_session), __alignof__(struct se_session), + 0, NULL); + if (!(se_sess_cache)) { + printk(KERN_ERR "kmem_cache_create() for struct se_session" + " failed\n"); + goto out; + } + se_ua_cache = kmem_cache_create("se_ua_cache", + sizeof(struct se_ua), __alignof__(struct se_ua), + 0, NULL); + if (!(se_ua_cache)) { + printk(KERN_ERR "kmem_cache_create() for struct se_ua failed\n"); + goto out; + } + se_mem_cache = kmem_cache_create("se_mem_cache", + sizeof(struct se_mem), __alignof__(struct se_mem), 0, NULL); + if (!(se_mem_cache)) { + printk(KERN_ERR "kmem_cache_create() for struct se_mem failed\n"); + goto out; + } + t10_pr_reg_cache = kmem_cache_create("t10_pr_reg_cache", + sizeof(struct t10_pr_registration), + __alignof__(struct t10_pr_registration), 0, NULL); + if (!(t10_pr_reg_cache)) { + printk(KERN_ERR "kmem_cache_create() for struct t10_pr_registration" + " failed\n"); + goto out; + } + t10_alua_lu_gp_cache = kmem_cache_create("t10_alua_lu_gp_cache", + sizeof(struct t10_alua_lu_gp), __alignof__(struct t10_alua_lu_gp), + 0, NULL); + if (!(t10_alua_lu_gp_cache)) { + printk(KERN_ERR "kmem_cache_create() for t10_alua_lu_gp_cache" + " failed\n"); + goto out; + } + t10_alua_lu_gp_mem_cache = kmem_cache_create("t10_alua_lu_gp_mem_cache", + sizeof(struct t10_alua_lu_gp_member), + __alignof__(struct t10_alua_lu_gp_member), 0, NULL); + if (!(t10_alua_lu_gp_mem_cache)) { + printk(KERN_ERR "kmem_cache_create() for t10_alua_lu_gp_mem_" + "cache failed\n"); + goto out; + } + t10_alua_tg_pt_gp_cache = kmem_cache_create("t10_alua_tg_pt_gp_cache", + sizeof(struct t10_alua_tg_pt_gp), + __alignof__(struct t10_alua_tg_pt_gp), 0, NULL); + if (!(t10_alua_tg_pt_gp_cache)) { + printk(KERN_ERR "kmem_cache_create() for t10_alua_tg_pt_gp_" + "cache failed\n"); + goto out; + } + t10_alua_tg_pt_gp_mem_cache = kmem_cache_create( + "t10_alua_tg_pt_gp_mem_cache", + sizeof(struct t10_alua_tg_pt_gp_member), + __alignof__(struct t10_alua_tg_pt_gp_member), + 0, NULL); + if (!(t10_alua_tg_pt_gp_mem_cache)) { + printk(KERN_ERR "kmem_cache_create() for t10_alua_tg_pt_gp_" + "mem_t failed\n"); + goto out; + } + + se_global = global; + + return 0; +out: + if (se_cmd_cache) + kmem_cache_destroy(se_cmd_cache); + if (se_tmr_req_cache) + kmem_cache_destroy(se_tmr_req_cache); + if (se_sess_cache) + kmem_cache_destroy(se_sess_cache); + if (se_ua_cache) + kmem_cache_destroy(se_ua_cache); + if (se_mem_cache) + kmem_cache_destroy(se_mem_cache); + if (t10_pr_reg_cache) + kmem_cache_destroy(t10_pr_reg_cache); + if (t10_alua_lu_gp_cache) + kmem_cache_destroy(t10_alua_lu_gp_cache); + if (t10_alua_lu_gp_mem_cache) + kmem_cache_destroy(t10_alua_lu_gp_mem_cache); + if (t10_alua_tg_pt_gp_cache) + kmem_cache_destroy(t10_alua_tg_pt_gp_cache); + if (t10_alua_tg_pt_gp_mem_cache) + kmem_cache_destroy(t10_alua_tg_pt_gp_mem_cache); + kfree(global); + return -1; +} + +void release_se_global(void) +{ + struct se_global *global; + + global = se_global; + if (!(global)) + return; + + kmem_cache_destroy(se_cmd_cache); + kmem_cache_destroy(se_tmr_req_cache); + kmem_cache_destroy(se_sess_cache); + kmem_cache_destroy(se_ua_cache); + kmem_cache_destroy(se_mem_cache); + kmem_cache_destroy(t10_pr_reg_cache); + kmem_cache_destroy(t10_alua_lu_gp_cache); + kmem_cache_destroy(t10_alua_lu_gp_mem_cache); + kmem_cache_destroy(t10_alua_tg_pt_gp_cache); + kmem_cache_destroy(t10_alua_tg_pt_gp_mem_cache); + kfree(global); + + se_global = NULL; +} + +void transport_init_queue_obj(struct se_queue_obj *qobj) +{ + atomic_set(&qobj->queue_cnt, 0); + INIT_LIST_HEAD(&qobj->qobj_list); + init_waitqueue_head(&qobj->thread_wq); + spin_lock_init(&qobj->cmd_queue_lock); +} +EXPORT_SYMBOL(transport_init_queue_obj); + +static int transport_subsystem_reqmods(void) +{ + int ret; + + ret = request_module("target_core_iblock"); + if (ret != 0) + printk(KERN_ERR "Unable to load target_core_iblock\n"); + + ret = request_module("target_core_file"); + if (ret != 0) + printk(KERN_ERR "Unable to load target_core_file\n"); + + ret = request_module("target_core_pscsi"); + if (ret != 0) + printk(KERN_ERR "Unable to load target_core_pscsi\n"); + + ret = request_module("target_core_stgt"); + if (ret != 0) + printk(KERN_ERR "Unable to load target_core_stgt\n"); + + return 0; +} + +int transport_subsystem_check_init(void) +{ + if (se_global->g_sub_api_initialized) + return 0; + /* + * Request the loading of known TCM subsystem plugins.. + */ + if (transport_subsystem_reqmods() < 0) + return -1; + + se_global->g_sub_api_initialized = 1; + return 0; +} + +struct se_session *transport_init_session(void) +{ + struct se_session *se_sess; + + se_sess = kmem_cache_zalloc(se_sess_cache, GFP_KERNEL); + if (!(se_sess)) { + printk(KERN_ERR "Unable to allocate struct se_session from" + " se_sess_cache\n"); + return ERR_PTR(-ENOMEM); + } + INIT_LIST_HEAD(&se_sess->sess_list); + INIT_LIST_HEAD(&se_sess->sess_acl_list); + atomic_set(&se_sess->mib_ref_count, 0); + + return se_sess; +} +EXPORT_SYMBOL(transport_init_session); + +/* + * Called with spin_lock_bh(&struct se_portal_group->session_lock called. + */ +void __transport_register_session( + struct se_portal_group *se_tpg, + struct se_node_acl *se_nacl, + struct se_session *se_sess, + void *fabric_sess_ptr) +{ + unsigned char buf[PR_REG_ISID_LEN]; + + se_sess->se_tpg = se_tpg; + se_sess->fabric_sess_ptr = fabric_sess_ptr; + /* + * Used by struct se_node_acl's under ConfigFS to locate active se_session-t + * + * Only set for struct se_session's that will actually be moving I/O. + * eg: *NOT* discovery sessions. + */ + if (se_nacl) { + /* + * If the fabric module supports an ISID based TransportID, + * save this value in binary from the fabric I_T Nexus now. + */ + if (TPG_TFO(se_tpg)->sess_get_initiator_sid != NULL) { + memset(&buf[0], 0, PR_REG_ISID_LEN); + TPG_TFO(se_tpg)->sess_get_initiator_sid(se_sess, + &buf[0], PR_REG_ISID_LEN); + se_sess->sess_bin_isid = get_unaligned_be64(&buf[0]); + } + spin_lock_irq(&se_nacl->nacl_sess_lock); + /* + * The se_nacl->nacl_sess pointer will be set to the + * last active I_T Nexus for each struct se_node_acl. + */ + se_nacl->nacl_sess = se_sess; + + list_add_tail(&se_sess->sess_acl_list, + &se_nacl->acl_sess_list); + spin_unlock_irq(&se_nacl->nacl_sess_lock); + } + list_add_tail(&se_sess->sess_list, &se_tpg->tpg_sess_list); + + printk(KERN_INFO "TARGET_CORE[%s]: Registered fabric_sess_ptr: %p\n", + TPG_TFO(se_tpg)->get_fabric_name(), se_sess->fabric_sess_ptr); +} +EXPORT_SYMBOL(__transport_register_session); + +void transport_register_session( + struct se_portal_group *se_tpg, + struct se_node_acl *se_nacl, + struct se_session *se_sess, + void *fabric_sess_ptr) +{ + spin_lock_bh(&se_tpg->session_lock); + __transport_register_session(se_tpg, se_nacl, se_sess, fabric_sess_ptr); + spin_unlock_bh(&se_tpg->session_lock); +} +EXPORT_SYMBOL(transport_register_session); + +void transport_deregister_session_configfs(struct se_session *se_sess) +{ + struct se_node_acl *se_nacl; + + /* + * Used by struct se_node_acl's under ConfigFS to locate active struct se_session + */ + se_nacl = se_sess->se_node_acl; + if ((se_nacl)) { + spin_lock_irq(&se_nacl->nacl_sess_lock); + list_del(&se_sess->sess_acl_list); + /* + * If the session list is empty, then clear the pointer. + * Otherwise, set the struct se_session pointer from the tail + * element of the per struct se_node_acl active session list. + */ + if (list_empty(&se_nacl->acl_sess_list)) + se_nacl->nacl_sess = NULL; + else { + se_nacl->nacl_sess = container_of( + se_nacl->acl_sess_list.prev, + struct se_session, sess_acl_list); + } + spin_unlock_irq(&se_nacl->nacl_sess_lock); + } +} +EXPORT_SYMBOL(transport_deregister_session_configfs); + +void transport_free_session(struct se_session *se_sess) +{ + kmem_cache_free(se_sess_cache, se_sess); +} +EXPORT_SYMBOL(transport_free_session); + +void transport_deregister_session(struct se_session *se_sess) +{ + struct se_portal_group *se_tpg = se_sess->se_tpg; + struct se_node_acl *se_nacl; + + if (!(se_tpg)) { + transport_free_session(se_sess); + return; + } + /* + * Wait for possible reference in drivers/target/target_core_mib.c: + * scsi_att_intr_port_seq_show() + */ + while (atomic_read(&se_sess->mib_ref_count) != 0) + cpu_relax(); + + spin_lock_bh(&se_tpg->session_lock); + list_del(&se_sess->sess_list); + se_sess->se_tpg = NULL; + se_sess->fabric_sess_ptr = NULL; + spin_unlock_bh(&se_tpg->session_lock); + + /* + * Determine if we need to do extra work for this initiator node's + * struct se_node_acl if it had been previously dynamically generated. + */ + se_nacl = se_sess->se_node_acl; + if ((se_nacl)) { + spin_lock_bh(&se_tpg->acl_node_lock); + if (se_nacl->dynamic_node_acl) { + if (!(TPG_TFO(se_tpg)->tpg_check_demo_mode_cache( + se_tpg))) { + list_del(&se_nacl->acl_list); + se_tpg->num_node_acls--; + spin_unlock_bh(&se_tpg->acl_node_lock); + + core_tpg_wait_for_nacl_pr_ref(se_nacl); + core_tpg_wait_for_mib_ref(se_nacl); + core_free_device_list_for_node(se_nacl, se_tpg); + TPG_TFO(se_tpg)->tpg_release_fabric_acl(se_tpg, + se_nacl); + spin_lock_bh(&se_tpg->acl_node_lock); + } + } + spin_unlock_bh(&se_tpg->acl_node_lock); + } + + transport_free_session(se_sess); + + printk(KERN_INFO "TARGET_CORE[%s]: Deregistered fabric_sess\n", + TPG_TFO(se_tpg)->get_fabric_name()); +} +EXPORT_SYMBOL(transport_deregister_session); + +/* + * Called with T_TASK(cmd)->t_state_lock held. + */ +static void transport_all_task_dev_remove_state(struct se_cmd *cmd) +{ + struct se_device *dev; + struct se_task *task; + unsigned long flags; + + if (!T_TASK(cmd)) + return; + + list_for_each_entry(task, &T_TASK(cmd)->t_task_list, t_list) { + dev = task->se_dev; + if (!(dev)) + continue; + + if (atomic_read(&task->task_active)) + continue; + + if (!(atomic_read(&task->task_state_active))) + continue; + + spin_lock_irqsave(&dev->execute_task_lock, flags); + list_del(&task->t_state_list); + DEBUG_TSTATE("Removed ITT: 0x%08x dev: %p task[%p]\n", + CMD_TFO(cmd)->tfo_get_task_tag(cmd), dev, task); + spin_unlock_irqrestore(&dev->execute_task_lock, flags); + + atomic_set(&task->task_state_active, 0); + atomic_dec(&T_TASK(cmd)->t_task_cdbs_ex_left); + } +} + +/* transport_cmd_check_stop(): + * + * 'transport_off = 1' determines if t_transport_active should be cleared. + * 'transport_off = 2' determines if task_dev_state should be removed. + * + * A non-zero u8 t_state sets cmd->t_state. + * Returns 1 when command is stopped, else 0. + */ +static int transport_cmd_check_stop( + struct se_cmd *cmd, + int transport_off, + u8 t_state) +{ + unsigned long flags; + + spin_lock_irqsave(&T_TASK(cmd)->t_state_lock, flags); + /* + * Determine if IOCTL context caller in requesting the stopping of this + * command for LUN shutdown purposes. + */ + if (atomic_read(&T_TASK(cmd)->transport_lun_stop)) { + DEBUG_CS("%s:%d atomic_read(&T_TASK(cmd)->transport_lun_stop)" + " == TRUE for ITT: 0x%08x\n", __func__, __LINE__, + CMD_TFO(cmd)->get_task_tag(cmd)); + + cmd->deferred_t_state = cmd->t_state; + cmd->t_state = TRANSPORT_DEFERRED_CMD; + atomic_set(&T_TASK(cmd)->t_transport_active, 0); + if (transport_off == 2) + transport_all_task_dev_remove_state(cmd); + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, flags); + + complete(&T_TASK(cmd)->transport_lun_stop_comp); + return 1; + } + /* + * Determine if frontend context caller is requesting the stopping of + * this command for frontend excpections. + */ + if (atomic_read(&T_TASK(cmd)->t_transport_stop)) { + DEBUG_CS("%s:%d atomic_read(&T_TASK(cmd)->t_transport_stop) ==" + " TRUE for ITT: 0x%08x\n", __func__, __LINE__, + CMD_TFO(cmd)->get_task_tag(cmd)); + + cmd->deferred_t_state = cmd->t_state; + cmd->t_state = TRANSPORT_DEFERRED_CMD; + if (transport_off == 2) + transport_all_task_dev_remove_state(cmd); + + /* + * Clear struct se_cmd->se_lun before the transport_off == 2 handoff + * to FE. + */ + if (transport_off == 2) + cmd->se_lun = NULL; + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, flags); + + complete(&T_TASK(cmd)->t_transport_stop_comp); + return 1; + } + if (transport_off) { + atomic_set(&T_TASK(cmd)->t_transport_active, 0); + if (transport_off == 2) { + transport_all_task_dev_remove_state(cmd); + /* + * Clear struct se_cmd->se_lun before the transport_off == 2 + * handoff to fabric module. + */ + cmd->se_lun = NULL; + /* + * Some fabric modules like tcm_loop can release + * their internally allocated I/O refrence now and + * struct se_cmd now. + */ + if (CMD_TFO(cmd)->check_stop_free != NULL) { + spin_unlock_irqrestore( + &T_TASK(cmd)->t_state_lock, flags); + + CMD_TFO(cmd)->check_stop_free(cmd); + return 1; + } + } + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, flags); + + return 0; + } else if (t_state) + cmd->t_state = t_state; + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, flags); + + return 0; +} + +static int transport_cmd_check_stop_to_fabric(struct se_cmd *cmd) +{ + return transport_cmd_check_stop(cmd, 2, 0); +} + +static void transport_lun_remove_cmd(struct se_cmd *cmd) +{ + struct se_lun *lun = SE_LUN(cmd); + unsigned long flags; + + if (!lun) + return; + + spin_lock_irqsave(&T_TASK(cmd)->t_state_lock, flags); + if (!(atomic_read(&T_TASK(cmd)->transport_dev_active))) { + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, flags); + goto check_lun; + } + atomic_set(&T_TASK(cmd)->transport_dev_active, 0); + transport_all_task_dev_remove_state(cmd); + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, flags); + + transport_free_dev_tasks(cmd); + +check_lun: + spin_lock_irqsave(&lun->lun_cmd_lock, flags); + if (atomic_read(&T_TASK(cmd)->transport_lun_active)) { + list_del(&cmd->se_lun_list); + atomic_set(&T_TASK(cmd)->transport_lun_active, 0); +#if 0 + printk(KERN_INFO "Removed ITT: 0x%08x from LUN LIST[%d]\n" + CMD_TFO(cmd)->get_task_tag(cmd), lun->unpacked_lun); +#endif + } + spin_unlock_irqrestore(&lun->lun_cmd_lock, flags); +} + +void transport_cmd_finish_abort(struct se_cmd *cmd, int remove) +{ + transport_remove_cmd_from_queue(cmd, SE_DEV(cmd)->dev_queue_obj); + transport_lun_remove_cmd(cmd); + + if (transport_cmd_check_stop_to_fabric(cmd)) + return; + if (remove) + transport_generic_remove(cmd, 0, 0); +} + +void transport_cmd_finish_abort_tmr(struct se_cmd *cmd) +{ + transport_remove_cmd_from_queue(cmd, SE_DEV(cmd)->dev_queue_obj); + + if (transport_cmd_check_stop_to_fabric(cmd)) + return; + + transport_generic_remove(cmd, 0, 0); +} + +static int transport_add_cmd_to_queue( + struct se_cmd *cmd, + int t_state) +{ + struct se_device *dev = cmd->se_dev; + struct se_queue_obj *qobj = dev->dev_queue_obj; + struct se_queue_req *qr; + unsigned long flags; + + qr = kzalloc(sizeof(struct se_queue_req), GFP_ATOMIC); + if (!(qr)) { + printk(KERN_ERR "Unable to allocate memory for" + " struct se_queue_req\n"); + return -1; + } + INIT_LIST_HEAD(&qr->qr_list); + + qr->cmd = (void *)cmd; + qr->state = t_state; + + if (t_state) { + spin_lock_irqsave(&T_TASK(cmd)->t_state_lock, flags); + cmd->t_state = t_state; + atomic_set(&T_TASK(cmd)->t_transport_active, 1); + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, flags); + } + + spin_lock_irqsave(&qobj->cmd_queue_lock, flags); + list_add_tail(&qr->qr_list, &qobj->qobj_list); + atomic_inc(&T_TASK(cmd)->t_transport_queue_active); + spin_unlock_irqrestore(&qobj->cmd_queue_lock, flags); + + atomic_inc(&qobj->queue_cnt); + wake_up_interruptible(&qobj->thread_wq); + return 0; +} + +/* + * Called with struct se_queue_obj->cmd_queue_lock held. + */ +static struct se_queue_req * +__transport_get_qr_from_queue(struct se_queue_obj *qobj) +{ + struct se_cmd *cmd; + struct se_queue_req *qr = NULL; + + if (list_empty(&qobj->qobj_list)) + return NULL; + + list_for_each_entry(qr, &qobj->qobj_list, qr_list) + break; + + if (qr->cmd) { + cmd = (struct se_cmd *)qr->cmd; + atomic_dec(&T_TASK(cmd)->t_transport_queue_active); + } + list_del(&qr->qr_list); + atomic_dec(&qobj->queue_cnt); + + return qr; +} + +static struct se_queue_req * +transport_get_qr_from_queue(struct se_queue_obj *qobj) +{ + struct se_cmd *cmd; + struct se_queue_req *qr; + unsigned long flags; + + spin_lock_irqsave(&qobj->cmd_queue_lock, flags); + if (list_empty(&qobj->qobj_list)) { + spin_unlock_irqrestore(&qobj->cmd_queue_lock, flags); + return NULL; + } + + list_for_each_entry(qr, &qobj->qobj_list, qr_list) + break; + + if (qr->cmd) { + cmd = (struct se_cmd *)qr->cmd; + atomic_dec(&T_TASK(cmd)->t_transport_queue_active); + } + list_del(&qr->qr_list); + atomic_dec(&qobj->queue_cnt); + spin_unlock_irqrestore(&qobj->cmd_queue_lock, flags); + + return qr; +} + +static void transport_remove_cmd_from_queue(struct se_cmd *cmd, + struct se_queue_obj *qobj) +{ + struct se_cmd *q_cmd; + struct se_queue_req *qr = NULL, *qr_p = NULL; + unsigned long flags; + + spin_lock_irqsave(&qobj->cmd_queue_lock, flags); + if (!(atomic_read(&T_TASK(cmd)->t_transport_queue_active))) { + spin_unlock_irqrestore(&qobj->cmd_queue_lock, flags); + return; + } + + list_for_each_entry_safe(qr, qr_p, &qobj->qobj_list, qr_list) { + q_cmd = (struct se_cmd *)qr->cmd; + if (q_cmd != cmd) + continue; + + atomic_dec(&T_TASK(q_cmd)->t_transport_queue_active); + atomic_dec(&qobj->queue_cnt); + list_del(&qr->qr_list); + kfree(qr); + } + spin_unlock_irqrestore(&qobj->cmd_queue_lock, flags); + + if (atomic_read(&T_TASK(cmd)->t_transport_queue_active)) { + printk(KERN_ERR "ITT: 0x%08x t_transport_queue_active: %d\n", + CMD_TFO(cmd)->get_task_tag(cmd), + atomic_read(&T_TASK(cmd)->t_transport_queue_active)); + } +} + +/* + * Completion function used by TCM subsystem plugins (such as FILEIO) + * for queueing up response from struct se_subsystem_api->do_task() + */ +void transport_complete_sync_cache(struct se_cmd *cmd, int good) +{ + struct se_task *task = list_entry(T_TASK(cmd)->t_task_list.next, + struct se_task, t_list); + + if (good) { + cmd->scsi_status = SAM_STAT_GOOD; + task->task_scsi_status = GOOD; + } else { + task->task_scsi_status = SAM_STAT_CHECK_CONDITION; + task->task_error_status = PYX_TRANSPORT_ILLEGAL_REQUEST; + TASK_CMD(task)->transport_error_status = + PYX_TRANSPORT_ILLEGAL_REQUEST; + } + + transport_complete_task(task, good); +} +EXPORT_SYMBOL(transport_complete_sync_cache); + +/* transport_complete_task(): + * + * Called from interrupt and non interrupt context depending + * on the transport plugin. + */ +void transport_complete_task(struct se_task *task, int success) +{ + struct se_cmd *cmd = TASK_CMD(task); + struct se_device *dev = task->se_dev; + int t_state; + unsigned long flags; +#if 0 + printk(KERN_INFO "task: %p CDB: 0x%02x obj_ptr: %p\n", task, + T_TASK(cmd)->t_task_cdb[0], dev); +#endif + if (dev) { + spin_lock_irqsave(&SE_HBA(dev)->hba_queue_lock, flags); + atomic_inc(&dev->depth_left); + atomic_inc(&SE_HBA(dev)->left_queue_depth); + spin_unlock_irqrestore(&SE_HBA(dev)->hba_queue_lock, flags); + } + + spin_lock_irqsave(&T_TASK(cmd)->t_state_lock, flags); + atomic_set(&task->task_active, 0); + + /* + * See if any sense data exists, if so set the TASK_SENSE flag. + * Also check for any other post completion work that needs to be + * done by the plugins. + */ + if (dev && dev->transport->transport_complete) { + if (dev->transport->transport_complete(task) != 0) { + cmd->se_cmd_flags |= SCF_TRANSPORT_TASK_SENSE; + task->task_sense = 1; + success = 1; + } + } + + /* + * See if we are waiting for outstanding struct se_task + * to complete for an exception condition + */ + if (atomic_read(&task->task_stop)) { + /* + * Decrement T_TASK(cmd)->t_se_count if this task had + * previously thrown its timeout exception handler. + */ + if (atomic_read(&task->task_timeout)) { + atomic_dec(&T_TASK(cmd)->t_se_count); + atomic_set(&task->task_timeout, 0); + } + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, flags); + + complete(&task->task_stop_comp); + return; + } + /* + * If the task's timeout handler has fired, use the t_task_cdbs_timeout + * left counter to determine when the struct se_cmd is ready to be queued to + * the processing thread. + */ + if (atomic_read(&task->task_timeout)) { + if (!(atomic_dec_and_test( + &T_TASK(cmd)->t_task_cdbs_timeout_left))) { + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, + flags); + return; + } + t_state = TRANSPORT_COMPLETE_TIMEOUT; + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, flags); + + transport_add_cmd_to_queue(cmd, t_state); + return; + } + atomic_dec(&T_TASK(cmd)->t_task_cdbs_timeout_left); + + /* + * Decrement the outstanding t_task_cdbs_left count. The last + * struct se_task from struct se_cmd will complete itself into the + * device queue depending upon int success. + */ + if (!(atomic_dec_and_test(&T_TASK(cmd)->t_task_cdbs_left))) { + if (!success) + T_TASK(cmd)->t_tasks_failed = 1; + + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, flags); + return; + } + + if (!success || T_TASK(cmd)->t_tasks_failed) { + t_state = TRANSPORT_COMPLETE_FAILURE; + if (!task->task_error_status) { + task->task_error_status = + PYX_TRANSPORT_UNKNOWN_SAM_OPCODE; + cmd->transport_error_status = + PYX_TRANSPORT_UNKNOWN_SAM_OPCODE; + } + } else { + atomic_set(&T_TASK(cmd)->t_transport_complete, 1); + t_state = TRANSPORT_COMPLETE_OK; + } + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, flags); + + transport_add_cmd_to_queue(cmd, t_state); +} +EXPORT_SYMBOL(transport_complete_task); + +/* + * Called by transport_add_tasks_from_cmd() once a struct se_cmd's + * struct se_task list are ready to be added to the active execution list + * struct se_device + + * Called with se_dev_t->execute_task_lock called. + */ +static inline int transport_add_task_check_sam_attr( + struct se_task *task, + struct se_task *task_prev, + struct se_device *dev) +{ + /* + * No SAM Task attribute emulation enabled, add to tail of + * execution queue + */ + if (dev->dev_task_attr_type != SAM_TASK_ATTR_EMULATED) { + list_add_tail(&task->t_execute_list, &dev->execute_task_list); + return 0; + } + /* + * HEAD_OF_QUEUE attribute for received CDB, which means + * the first task that is associated with a struct se_cmd goes to + * head of the struct se_device->execute_task_list, and task_prev + * after that for each subsequent task + */ + if (task->task_se_cmd->sam_task_attr == TASK_ATTR_HOQ) { + list_add(&task->t_execute_list, + (task_prev != NULL) ? + &task_prev->t_execute_list : + &dev->execute_task_list); + + DEBUG_STA("Set HEAD_OF_QUEUE for task CDB: 0x%02x" + " in execution queue\n", + T_TASK(task->task_se_cmd)->t_task_cdb[0]); + return 1; + } + /* + * For ORDERED, SIMPLE or UNTAGGED attribute tasks once they have been + * transitioned from Dermant -> Active state, and are added to the end + * of the struct se_device->execute_task_list + */ + list_add_tail(&task->t_execute_list, &dev->execute_task_list); + return 0; +} + +/* __transport_add_task_to_execute_queue(): + * + * Called with se_dev_t->execute_task_lock called. + */ +static void __transport_add_task_to_execute_queue( + struct se_task *task, + struct se_task *task_prev, + struct se_device *dev) +{ + int head_of_queue; + + head_of_queue = transport_add_task_check_sam_attr(task, task_prev, dev); + atomic_inc(&dev->execute_tasks); + + if (atomic_read(&task->task_state_active)) + return; + /* + * Determine if this task needs to go to HEAD_OF_QUEUE for the + * state list as well. Running with SAM Task Attribute emulation + * will always return head_of_queue == 0 here + */ + if (head_of_queue) + list_add(&task->t_state_list, (task_prev) ? + &task_prev->t_state_list : + &dev->state_task_list); + else + list_add_tail(&task->t_state_list, &dev->state_task_list); + + atomic_set(&task->task_state_active, 1); + + DEBUG_TSTATE("Added ITT: 0x%08x task[%p] to dev: %p\n", + CMD_TFO(task->task_se_cmd)->get_task_tag(task->task_se_cmd), + task, dev); +} + +static void transport_add_tasks_to_state_queue(struct se_cmd *cmd) +{ + struct se_device *dev; + struct se_task *task; + unsigned long flags; + + spin_lock_irqsave(&T_TASK(cmd)->t_state_lock, flags); + list_for_each_entry(task, &T_TASK(cmd)->t_task_list, t_list) { + dev = task->se_dev; + + if (atomic_read(&task->task_state_active)) + continue; + + spin_lock(&dev->execute_task_lock); + list_add_tail(&task->t_state_list, &dev->state_task_list); + atomic_set(&task->task_state_active, 1); + + DEBUG_TSTATE("Added ITT: 0x%08x task[%p] to dev: %p\n", + CMD_TFO(task->task_se_cmd)->get_task_tag( + task->task_se_cmd), task, dev); + + spin_unlock(&dev->execute_task_lock); + } + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, flags); +} + +static void transport_add_tasks_from_cmd(struct se_cmd *cmd) +{ + struct se_device *dev = SE_DEV(cmd); + struct se_task *task, *task_prev = NULL; + unsigned long flags; + + spin_lock_irqsave(&dev->execute_task_lock, flags); + list_for_each_entry(task, &T_TASK(cmd)->t_task_list, t_list) { + if (atomic_read(&task->task_execute_queue)) + continue; + /* + * __transport_add_task_to_execute_queue() handles the + * SAM Task Attribute emulation if enabled + */ + __transport_add_task_to_execute_queue(task, task_prev, dev); + atomic_set(&task->task_execute_queue, 1); + task_prev = task; + } + spin_unlock_irqrestore(&dev->execute_task_lock, flags); + + return; +} + +/* transport_get_task_from_execute_queue(): + * + * Called with dev->execute_task_lock held. + */ +static struct se_task * +transport_get_task_from_execute_queue(struct se_device *dev) +{ + struct se_task *task; + + if (list_empty(&dev->execute_task_list)) + return NULL; + + list_for_each_entry(task, &dev->execute_task_list, t_execute_list) + break; + + list_del(&task->t_execute_list); + atomic_dec(&dev->execute_tasks); + + return task; +} + +/* transport_remove_task_from_execute_queue(): + * + * + */ +static void transport_remove_task_from_execute_queue( + struct se_task *task, + struct se_device *dev) +{ + unsigned long flags; + + spin_lock_irqsave(&dev->execute_task_lock, flags); + list_del(&task->t_execute_list); + atomic_dec(&dev->execute_tasks); + spin_unlock_irqrestore(&dev->execute_task_lock, flags); +} + +unsigned char *transport_dump_cmd_direction(struct se_cmd *cmd) +{ + switch (cmd->data_direction) { + case DMA_NONE: + return "NONE"; + case DMA_FROM_DEVICE: + return "READ"; + case DMA_TO_DEVICE: + return "WRITE"; + case DMA_BIDIRECTIONAL: + return "BIDI"; + default: + break; + } + + return "UNKNOWN"; +} + +void transport_dump_dev_state( + struct se_device *dev, + char *b, + int *bl) +{ + *bl += sprintf(b + *bl, "Status: "); + switch (dev->dev_status) { + case TRANSPORT_DEVICE_ACTIVATED: + *bl += sprintf(b + *bl, "ACTIVATED"); + break; + case TRANSPORT_DEVICE_DEACTIVATED: + *bl += sprintf(b + *bl, "DEACTIVATED"); + break; + case TRANSPORT_DEVICE_SHUTDOWN: + *bl += sprintf(b + *bl, "SHUTDOWN"); + break; + case TRANSPORT_DEVICE_OFFLINE_ACTIVATED: + case TRANSPORT_DEVICE_OFFLINE_DEACTIVATED: + *bl += sprintf(b + *bl, "OFFLINE"); + break; + default: + *bl += sprintf(b + *bl, "UNKNOWN=%d", dev->dev_status); + break; + } + + *bl += sprintf(b + *bl, " Execute/Left/Max Queue Depth: %d/%d/%d", + atomic_read(&dev->execute_tasks), atomic_read(&dev->depth_left), + dev->queue_depth); + *bl += sprintf(b + *bl, " SectorSize: %u MaxSectors: %u\n", + DEV_ATTRIB(dev)->block_size, DEV_ATTRIB(dev)->max_sectors); + *bl += sprintf(b + *bl, " "); +} + +/* transport_release_all_cmds(): + * + * + */ +static void transport_release_all_cmds(struct se_device *dev) +{ + struct se_cmd *cmd = NULL; + struct se_queue_req *qr = NULL, *qr_p = NULL; + int bug_out = 0, t_state; + unsigned long flags; + + spin_lock_irqsave(&dev->dev_queue_obj->cmd_queue_lock, flags); + list_for_each_entry_safe(qr, qr_p, &dev->dev_queue_obj->qobj_list, + qr_list) { + + cmd = (struct se_cmd *)qr->cmd; + t_state = qr->state; + list_del(&qr->qr_list); + kfree(qr); + spin_unlock_irqrestore(&dev->dev_queue_obj->cmd_queue_lock, + flags); + + printk(KERN_ERR "Releasing ITT: 0x%08x, i_state: %u," + " t_state: %u directly\n", + CMD_TFO(cmd)->get_task_tag(cmd), + CMD_TFO(cmd)->get_cmd_state(cmd), t_state); + + transport_release_fe_cmd(cmd); + bug_out = 1; + + spin_lock_irqsave(&dev->dev_queue_obj->cmd_queue_lock, flags); + } + spin_unlock_irqrestore(&dev->dev_queue_obj->cmd_queue_lock, flags); +#if 0 + if (bug_out) + BUG(); +#endif +} + +void transport_dump_vpd_proto_id( + struct t10_vpd *vpd, + unsigned char *p_buf, + int p_buf_len) +{ + unsigned char buf[VPD_TMP_BUF_SIZE]; + int len; + + memset(buf, 0, VPD_TMP_BUF_SIZE); + len = sprintf(buf, "T10 VPD Protocol Identifier: "); + + switch (vpd->protocol_identifier) { + case 0x00: + sprintf(buf+len, "Fibre Channel\n"); + break; + case 0x10: + sprintf(buf+len, "Parallel SCSI\n"); + break; + case 0x20: + sprintf(buf+len, "SSA\n"); + break; + case 0x30: + sprintf(buf+len, "IEEE 1394\n"); + break; + case 0x40: + sprintf(buf+len, "SCSI Remote Direct Memory Access" + " Protocol\n"); + break; + case 0x50: + sprintf(buf+len, "Internet SCSI (iSCSI)\n"); + break; + case 0x60: + sprintf(buf+len, "SAS Serial SCSI Protocol\n"); + break; + case 0x70: + sprintf(buf+len, "Automation/Drive Interface Transport" + " Protocol\n"); + break; + case 0x80: + sprintf(buf+len, "AT Attachment Interface ATA/ATAPI\n"); + break; + default: + sprintf(buf+len, "Unknown 0x%02x\n", + vpd->protocol_identifier); + break; + } + + if (p_buf) + strncpy(p_buf, buf, p_buf_len); + else + printk(KERN_INFO "%s", buf); +} + +void +transport_set_vpd_proto_id(struct t10_vpd *vpd, unsigned char *page_83) +{ + /* + * Check if the Protocol Identifier Valid (PIV) bit is set.. + * + * from spc3r23.pdf section 7.5.1 + */ + if (page_83[1] & 0x80) { + vpd->protocol_identifier = (page_83[0] & 0xf0); + vpd->protocol_identifier_set = 1; + transport_dump_vpd_proto_id(vpd, NULL, 0); + } +} +EXPORT_SYMBOL(transport_set_vpd_proto_id); + +int transport_dump_vpd_assoc( + struct t10_vpd *vpd, + unsigned char *p_buf, + int p_buf_len) +{ + unsigned char buf[VPD_TMP_BUF_SIZE]; + int ret = 0, len; + + memset(buf, 0, VPD_TMP_BUF_SIZE); + len = sprintf(buf, "T10 VPD Identifier Association: "); + + switch (vpd->association) { + case 0x00: + sprintf(buf+len, "addressed logical unit\n"); + break; + case 0x10: + sprintf(buf+len, "target port\n"); + break; + case 0x20: + sprintf(buf+len, "SCSI target device\n"); + break; + default: + sprintf(buf+len, "Unknown 0x%02x\n", vpd->association); + ret = -1; + break; + } + + if (p_buf) + strncpy(p_buf, buf, p_buf_len); + else + printk("%s", buf); + + return ret; +} + +int transport_set_vpd_assoc(struct t10_vpd *vpd, unsigned char *page_83) +{ + /* + * The VPD identification association.. + * + * from spc3r23.pdf Section 7.6.3.1 Table 297 + */ + vpd->association = (page_83[1] & 0x30); + return transport_dump_vpd_assoc(vpd, NULL, 0); +} +EXPORT_SYMBOL(transport_set_vpd_assoc); + +int transport_dump_vpd_ident_type( + struct t10_vpd *vpd, + unsigned char *p_buf, + int p_buf_len) +{ + unsigned char buf[VPD_TMP_BUF_SIZE]; + int ret = 0, len; + + memset(buf, 0, VPD_TMP_BUF_SIZE); + len = sprintf(buf, "T10 VPD Identifier Type: "); + + switch (vpd->device_identifier_type) { + case 0x00: + sprintf(buf+len, "Vendor specific\n"); + break; + case 0x01: + sprintf(buf+len, "T10 Vendor ID based\n"); + break; + case 0x02: + sprintf(buf+len, "EUI-64 based\n"); + break; + case 0x03: + sprintf(buf+len, "NAA\n"); + break; + case 0x04: + sprintf(buf+len, "Relative target port identifier\n"); + break; + case 0x08: + sprintf(buf+len, "SCSI name string\n"); + break; + default: + sprintf(buf+len, "Unsupported: 0x%02x\n", + vpd->device_identifier_type); + ret = -1; + break; + } + + if (p_buf) + strncpy(p_buf, buf, p_buf_len); + else + printk("%s", buf); + + return ret; +} + +int transport_set_vpd_ident_type(struct t10_vpd *vpd, unsigned char *page_83) +{ + /* + * The VPD identifier type.. + * + * from spc3r23.pdf Section 7.6.3.1 Table 298 + */ + vpd->device_identifier_type = (page_83[1] & 0x0f); + return transport_dump_vpd_ident_type(vpd, NULL, 0); +} +EXPORT_SYMBOL(transport_set_vpd_ident_type); + +int transport_dump_vpd_ident( + struct t10_vpd *vpd, + unsigned char *p_buf, + int p_buf_len) +{ + unsigned char buf[VPD_TMP_BUF_SIZE]; + int ret = 0; + + memset(buf, 0, VPD_TMP_BUF_SIZE); + + switch (vpd->device_identifier_code_set) { + case 0x01: /* Binary */ + sprintf(buf, "T10 VPD Binary Device Identifier: %s\n", + &vpd->device_identifier[0]); + break; + case 0x02: /* ASCII */ + sprintf(buf, "T10 VPD ASCII Device Identifier: %s\n", + &vpd->device_identifier[0]); + break; + case 0x03: /* UTF-8 */ + sprintf(buf, "T10 VPD UTF-8 Device Identifier: %s\n", + &vpd->device_identifier[0]); + break; + default: + sprintf(buf, "T10 VPD Device Identifier encoding unsupported:" + " 0x%02x", vpd->device_identifier_code_set); + ret = -1; + break; + } + + if (p_buf) + strncpy(p_buf, buf, p_buf_len); + else + printk("%s", buf); + + return ret; +} + +int +transport_set_vpd_ident(struct t10_vpd *vpd, unsigned char *page_83) +{ + static const char hex_str[] = "0123456789abcdef"; + int j = 0, i = 4; /* offset to start of the identifer */ + + /* + * The VPD Code Set (encoding) + * + * from spc3r23.pdf Section 7.6.3.1 Table 296 + */ + vpd->device_identifier_code_set = (page_83[0] & 0x0f); + switch (vpd->device_identifier_code_set) { + case 0x01: /* Binary */ + vpd->device_identifier[j++] = + hex_str[vpd->device_identifier_type]; + while (i < (4 + page_83[3])) { + vpd->device_identifier[j++] = + hex_str[(page_83[i] & 0xf0) >> 4]; + vpd->device_identifier[j++] = + hex_str[page_83[i] & 0x0f]; + i++; + } + break; + case 0x02: /* ASCII */ + case 0x03: /* UTF-8 */ + while (i < (4 + page_83[3])) + vpd->device_identifier[j++] = page_83[i++]; + break; + default: + break; + } + + return transport_dump_vpd_ident(vpd, NULL, 0); +} +EXPORT_SYMBOL(transport_set_vpd_ident); + +static void core_setup_task_attr_emulation(struct se_device *dev) +{ + /* + * If this device is from Target_Core_Mod/pSCSI, disable the + * SAM Task Attribute emulation. + * + * This is currently not available in upsream Linux/SCSI Target + * mode code, and is assumed to be disabled while using TCM/pSCSI. + */ + if (TRANSPORT(dev)->transport_type == TRANSPORT_PLUGIN_PHBA_PDEV) { + dev->dev_task_attr_type = SAM_TASK_ATTR_PASSTHROUGH; + return; + } + + dev->dev_task_attr_type = SAM_TASK_ATTR_EMULATED; + DEBUG_STA("%s: Using SAM_TASK_ATTR_EMULATED for SPC: 0x%02x" + " device\n", TRANSPORT(dev)->name, + TRANSPORT(dev)->get_device_rev(dev)); +} + +static void scsi_dump_inquiry(struct se_device *dev) +{ + struct t10_wwn *wwn = DEV_T10_WWN(dev); + int i, device_type; + /* + * Print Linux/SCSI style INQUIRY formatting to the kernel ring buffer + */ + printk(" Vendor: "); + for (i = 0; i < 8; i++) + if (wwn->vendor[i] >= 0x20) + printk("%c", wwn->vendor[i]); + else + printk(" "); + + printk(" Model: "); + for (i = 0; i < 16; i++) + if (wwn->model[i] >= 0x20) + printk("%c", wwn->model[i]); + else + printk(" "); + + printk(" Revision: "); + for (i = 0; i < 4; i++) + if (wwn->revision[i] >= 0x20) + printk("%c", wwn->revision[i]); + else + printk(" "); + + printk("\n"); + + device_type = TRANSPORT(dev)->get_device_type(dev); + printk(" Type: %s ", scsi_device_type(device_type)); + printk(" ANSI SCSI revision: %02x\n", + TRANSPORT(dev)->get_device_rev(dev)); +} + +struct se_device *transport_add_device_to_core_hba( + struct se_hba *hba, + struct se_subsystem_api *transport, + struct se_subsystem_dev *se_dev, + u32 device_flags, + void *transport_dev, + struct se_dev_limits *dev_limits, + const char *inquiry_prod, + const char *inquiry_rev) +{ + int ret = 0, force_pt; + struct se_device *dev; + + dev = kzalloc(sizeof(struct se_device), GFP_KERNEL); + if (!(dev)) { + printk(KERN_ERR "Unable to allocate memory for se_dev_t\n"); + return NULL; + } + dev->dev_queue_obj = kzalloc(sizeof(struct se_queue_obj), GFP_KERNEL); + if (!(dev->dev_queue_obj)) { + printk(KERN_ERR "Unable to allocate memory for" + " dev->dev_queue_obj\n"); + kfree(dev); + return NULL; + } + transport_init_queue_obj(dev->dev_queue_obj); + + dev->dev_status_queue_obj = kzalloc(sizeof(struct se_queue_obj), + GFP_KERNEL); + if (!(dev->dev_status_queue_obj)) { + printk(KERN_ERR "Unable to allocate memory for" + " dev->dev_status_queue_obj\n"); + kfree(dev->dev_queue_obj); + kfree(dev); + return NULL; + } + transport_init_queue_obj(dev->dev_status_queue_obj); + + dev->dev_flags = device_flags; + dev->dev_status |= TRANSPORT_DEVICE_DEACTIVATED; + dev->dev_ptr = (void *) transport_dev; + dev->se_hba = hba; + dev->se_sub_dev = se_dev; + dev->transport = transport; + atomic_set(&dev->active_cmds, 0); + INIT_LIST_HEAD(&dev->dev_list); + INIT_LIST_HEAD(&dev->dev_sep_list); + INIT_LIST_HEAD(&dev->dev_tmr_list); + INIT_LIST_HEAD(&dev->execute_task_list); + INIT_LIST_HEAD(&dev->delayed_cmd_list); + INIT_LIST_HEAD(&dev->ordered_cmd_list); + INIT_LIST_HEAD(&dev->state_task_list); + spin_lock_init(&dev->execute_task_lock); + spin_lock_init(&dev->delayed_cmd_lock); + spin_lock_init(&dev->ordered_cmd_lock); + spin_lock_init(&dev->state_task_lock); + spin_lock_init(&dev->dev_alua_lock); + spin_lock_init(&dev->dev_reservation_lock); + spin_lock_init(&dev->dev_status_lock); + spin_lock_init(&dev->dev_status_thr_lock); + spin_lock_init(&dev->se_port_lock); + spin_lock_init(&dev->se_tmr_lock); + + dev->queue_depth = dev_limits->queue_depth; + atomic_set(&dev->depth_left, dev->queue_depth); + atomic_set(&dev->dev_ordered_id, 0); + + se_dev_set_default_attribs(dev, dev_limits); + + dev->dev_index = scsi_get_new_index(SCSI_DEVICE_INDEX); + dev->creation_time = get_jiffies_64(); + spin_lock_init(&dev->stats_lock); + + spin_lock(&hba->device_lock); + list_add_tail(&dev->dev_list, &hba->hba_dev_list); + hba->dev_count++; + spin_unlock(&hba->device_lock); + /* + * Setup the SAM Task Attribute emulation for struct se_device + */ + core_setup_task_attr_emulation(dev); + /* + * Force PR and ALUA passthrough emulation with internal object use. + */ + force_pt = (hba->hba_flags & HBA_FLAGS_INTERNAL_USE); + /* + * Setup the Reservations infrastructure for struct se_device + */ + core_setup_reservations(dev, force_pt); + /* + * Setup the Asymmetric Logical Unit Assignment for struct se_device + */ + if (core_setup_alua(dev, force_pt) < 0) + goto out; + + /* + * Startup the struct se_device processing thread + */ + dev->process_thread = kthread_run(transport_processing_thread, dev, + "LIO_%s", TRANSPORT(dev)->name); + if (IS_ERR(dev->process_thread)) { + printk(KERN_ERR "Unable to create kthread: LIO_%s\n", + TRANSPORT(dev)->name); + goto out; + } + + /* + * Preload the initial INQUIRY const values if we are doing + * anything virtual (IBLOCK, FILEIO, RAMDISK), but not for TCM/pSCSI + * passthrough because this is being provided by the backend LLD. + * This is required so that transport_get_inquiry() copies these + * originals once back into DEV_T10_WWN(dev) for the virtual device + * setup. + */ + if (TRANSPORT(dev)->transport_type != TRANSPORT_PLUGIN_PHBA_PDEV) { + if (!(inquiry_prod) || !(inquiry_prod)) { + printk(KERN_ERR "All non TCM/pSCSI plugins require" + " INQUIRY consts\n"); + goto out; + } + + strncpy(&DEV_T10_WWN(dev)->vendor[0], "LIO-ORG", 8); + strncpy(&DEV_T10_WWN(dev)->model[0], inquiry_prod, 16); + strncpy(&DEV_T10_WWN(dev)->revision[0], inquiry_rev, 4); + } + scsi_dump_inquiry(dev); + +out: + if (!ret) + return dev; + kthread_stop(dev->process_thread); + + spin_lock(&hba->device_lock); + list_del(&dev->dev_list); + hba->dev_count--; + spin_unlock(&hba->device_lock); + + se_release_vpd_for_dev(dev); + + kfree(dev->dev_status_queue_obj); + kfree(dev->dev_queue_obj); + kfree(dev); + + return NULL; +} +EXPORT_SYMBOL(transport_add_device_to_core_hba); + +/* transport_generic_prepare_cdb(): + * + * Since the Initiator sees iSCSI devices as LUNs, the SCSI CDB will + * contain the iSCSI LUN in bits 7-5 of byte 1 as per SAM-2. + * The point of this is since we are mapping iSCSI LUNs to + * SCSI Target IDs having a non-zero LUN in the CDB will throw the + * devices and HBAs for a loop. + */ +static inline void transport_generic_prepare_cdb( + unsigned char *cdb) +{ + switch (cdb[0]) { + case READ_10: /* SBC - RDProtect */ + case READ_12: /* SBC - RDProtect */ + case READ_16: /* SBC - RDProtect */ + case SEND_DIAGNOSTIC: /* SPC - SELF-TEST Code */ + case VERIFY: /* SBC - VRProtect */ + case VERIFY_16: /* SBC - VRProtect */ + case WRITE_VERIFY: /* SBC - VRProtect */ + case WRITE_VERIFY_12: /* SBC - VRProtect */ + break; + default: + cdb[1] &= 0x1f; /* clear logical unit number */ + break; + } +} + +static struct se_task * +transport_generic_get_task(struct se_cmd *cmd, + enum dma_data_direction data_direction) +{ + struct se_task *task; + struct se_device *dev = SE_DEV(cmd); + unsigned long flags; + + task = dev->transport->alloc_task(cmd); + if (!task) { + printk(KERN_ERR "Unable to allocate struct se_task\n"); + return NULL; + } + + INIT_LIST_HEAD(&task->t_list); + INIT_LIST_HEAD(&task->t_execute_list); + INIT_LIST_HEAD(&task->t_state_list); + init_completion(&task->task_stop_comp); + task->task_no = T_TASK(cmd)->t_tasks_no++; + task->task_se_cmd = cmd; + task->se_dev = dev; + task->task_data_direction = data_direction; + + spin_lock_irqsave(&T_TASK(cmd)->t_state_lock, flags); + list_add_tail(&task->t_list, &T_TASK(cmd)->t_task_list); + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, flags); + + return task; +} + +static int transport_generic_cmd_sequencer(struct se_cmd *, unsigned char *); + +void transport_device_setup_cmd(struct se_cmd *cmd) +{ + cmd->se_dev = SE_LUN(cmd)->lun_se_dev; +} +EXPORT_SYMBOL(transport_device_setup_cmd); + +/* + * Used by fabric modules containing a local struct se_cmd within their + * fabric dependent per I/O descriptor. + */ +void transport_init_se_cmd( + struct se_cmd *cmd, + struct target_core_fabric_ops *tfo, + struct se_session *se_sess, + u32 data_length, + int data_direction, + int task_attr, + unsigned char *sense_buffer) +{ + INIT_LIST_HEAD(&cmd->se_lun_list); + INIT_LIST_HEAD(&cmd->se_delayed_list); + INIT_LIST_HEAD(&cmd->se_ordered_list); + /* + * Setup t_task pointer to t_task_backstore + */ + cmd->t_task = &cmd->t_task_backstore; + + INIT_LIST_HEAD(&T_TASK(cmd)->t_task_list); + init_completion(&T_TASK(cmd)->transport_lun_fe_stop_comp); + init_completion(&T_TASK(cmd)->transport_lun_stop_comp); + init_completion(&T_TASK(cmd)->t_transport_stop_comp); + spin_lock_init(&T_TASK(cmd)->t_state_lock); + atomic_set(&T_TASK(cmd)->transport_dev_active, 1); + + cmd->se_tfo = tfo; + cmd->se_sess = se_sess; + cmd->data_length = data_length; + cmd->data_direction = data_direction; + cmd->sam_task_attr = task_attr; + cmd->sense_buffer = sense_buffer; +} +EXPORT_SYMBOL(transport_init_se_cmd); + +static int transport_check_alloc_task_attr(struct se_cmd *cmd) +{ + /* + * Check if SAM Task Attribute emulation is enabled for this + * struct se_device storage object + */ + if (SE_DEV(cmd)->dev_task_attr_type != SAM_TASK_ATTR_EMULATED) + return 0; + + if (cmd->sam_task_attr == TASK_ATTR_ACA) { + DEBUG_STA("SAM Task Attribute ACA" + " emulation is not supported\n"); + return -1; + } + /* + * Used to determine when ORDERED commands should go from + * Dormant to Active status. + */ + cmd->se_ordered_id = atomic_inc_return(&SE_DEV(cmd)->dev_ordered_id); + smp_mb__after_atomic_inc(); + DEBUG_STA("Allocated se_ordered_id: %u for Task Attr: 0x%02x on %s\n", + cmd->se_ordered_id, cmd->sam_task_attr, + TRANSPORT(cmd->se_dev)->name); + return 0; +} + +void transport_free_se_cmd( + struct se_cmd *se_cmd) +{ + if (se_cmd->se_tmr_req) + core_tmr_release_req(se_cmd->se_tmr_req); + /* + * Check and free any extended CDB buffer that was allocated + */ + if (T_TASK(se_cmd)->t_task_cdb != T_TASK(se_cmd)->__t_task_cdb) + kfree(T_TASK(se_cmd)->t_task_cdb); +} +EXPORT_SYMBOL(transport_free_se_cmd); + +static void transport_generic_wait_for_tasks(struct se_cmd *, int, int); + +/* transport_generic_allocate_tasks(): + * + * Called from fabric RX Thread. + */ +int transport_generic_allocate_tasks( + struct se_cmd *cmd, + unsigned char *cdb) +{ + int ret; + + transport_generic_prepare_cdb(cdb); + + /* + * This is needed for early exceptions. + */ + cmd->transport_wait_for_tasks = &transport_generic_wait_for_tasks; + + transport_device_setup_cmd(cmd); + /* + * Ensure that the received CDB is less than the max (252 + 8) bytes + * for VARIABLE_LENGTH_CMD + */ + if (scsi_command_size(cdb) > SCSI_MAX_VARLEN_CDB_SIZE) { + printk(KERN_ERR "Received SCSI CDB with command_size: %d that" + " exceeds SCSI_MAX_VARLEN_CDB_SIZE: %d\n", + scsi_command_size(cdb), SCSI_MAX_VARLEN_CDB_SIZE); + return -1; + } + /* + * If the received CDB is larger than TCM_MAX_COMMAND_SIZE, + * allocate the additional extended CDB buffer now.. Otherwise + * setup the pointer from __t_task_cdb to t_task_cdb. + */ + if (scsi_command_size(cdb) > sizeof(T_TASK(cmd)->__t_task_cdb)) { + T_TASK(cmd)->t_task_cdb = kzalloc(scsi_command_size(cdb), + GFP_KERNEL); + if (!(T_TASK(cmd)->t_task_cdb)) { + printk(KERN_ERR "Unable to allocate T_TASK(cmd)->t_task_cdb" + " %u > sizeof(T_TASK(cmd)->__t_task_cdb): %lu ops\n", + scsi_command_size(cdb), + (unsigned long)sizeof(T_TASK(cmd)->__t_task_cdb)); + return -1; + } + } else + T_TASK(cmd)->t_task_cdb = &T_TASK(cmd)->__t_task_cdb[0]; + /* + * Copy the original CDB into T_TASK(cmd). + */ + memcpy(T_TASK(cmd)->t_task_cdb, cdb, scsi_command_size(cdb)); + /* + * Setup the received CDB based on SCSI defined opcodes and + * perform unit attention, persistent reservations and ALUA + * checks for virtual device backends. The T_TASK(cmd)->t_task_cdb + * pointer is expected to be setup before we reach this point. + */ + ret = transport_generic_cmd_sequencer(cmd, cdb); + if (ret < 0) + return ret; + /* + * Check for SAM Task Attribute Emulation + */ + if (transport_check_alloc_task_attr(cmd) < 0) { + cmd->se_cmd_flags |= SCF_SCSI_CDB_EXCEPTION; + cmd->scsi_sense_reason = TCM_INVALID_CDB_FIELD; + return -2; + } + spin_lock(&cmd->se_lun->lun_sep_lock); + if (cmd->se_lun->lun_sep) + cmd->se_lun->lun_sep->sep_stats.cmd_pdus++; + spin_unlock(&cmd->se_lun->lun_sep_lock); + return 0; +} +EXPORT_SYMBOL(transport_generic_allocate_tasks); + +/* + * Used by fabric module frontends not defining a TFO->new_cmd_map() + * to queue up a newly setup se_cmd w/ TRANSPORT_NEW_CMD statis + */ +int transport_generic_handle_cdb( + struct se_cmd *cmd) +{ + if (!SE_LUN(cmd)) { + dump_stack(); + printk(KERN_ERR "SE_LUN(cmd) is NULL\n"); + return -1; + } + + transport_add_cmd_to_queue(cmd, TRANSPORT_NEW_CMD); + return 0; +} +EXPORT_SYMBOL(transport_generic_handle_cdb); + +/* + * Used by fabric module frontends defining a TFO->new_cmd_map() caller + * to queue up a newly setup se_cmd w/ TRANSPORT_NEW_CMD_MAP in order to + * complete setup in TCM process context w/ TFO->new_cmd_map(). + */ +int transport_generic_handle_cdb_map( + struct se_cmd *cmd) +{ + if (!SE_LUN(cmd)) { + dump_stack(); + printk(KERN_ERR "SE_LUN(cmd) is NULL\n"); + return -1; + } + + transport_add_cmd_to_queue(cmd, TRANSPORT_NEW_CMD_MAP); + return 0; +} +EXPORT_SYMBOL(transport_generic_handle_cdb_map); + +/* transport_generic_handle_data(): + * + * + */ +int transport_generic_handle_data( + struct se_cmd *cmd) +{ + /* + * For the software fabric case, then we assume the nexus is being + * failed/shutdown when signals are pending from the kthread context + * caller, so we return a failure. For the HW target mode case running + * in interrupt code, the signal_pending() check is skipped. + */ + if (!in_interrupt() && signal_pending(current)) + return -1; + /* + * If the received CDB has aleady been ABORTED by the generic + * target engine, we now call transport_check_aborted_status() + * to queue any delated TASK_ABORTED status for the received CDB to the + * fabric module as we are expecting no futher incoming DATA OUT + * sequences at this point. + */ + if (transport_check_aborted_status(cmd, 1) != 0) + return 0; + + transport_add_cmd_to_queue(cmd, TRANSPORT_PROCESS_WRITE); + return 0; +} +EXPORT_SYMBOL(transport_generic_handle_data); + +/* transport_generic_handle_tmr(): + * + * + */ +int transport_generic_handle_tmr( + struct se_cmd *cmd) +{ + /* + * This is needed for early exceptions. + */ + cmd->transport_wait_for_tasks = &transport_generic_wait_for_tasks; + transport_device_setup_cmd(cmd); + + transport_add_cmd_to_queue(cmd, TRANSPORT_PROCESS_TMR); + return 0; +} +EXPORT_SYMBOL(transport_generic_handle_tmr); + +static int transport_stop_tasks_for_cmd(struct se_cmd *cmd) +{ + struct se_task *task, *task_tmp; + unsigned long flags; + int ret = 0; + + DEBUG_TS("ITT[0x%08x] - Stopping tasks\n", + CMD_TFO(cmd)->get_task_tag(cmd)); + + /* + * No tasks remain in the execution queue + */ + spin_lock_irqsave(&T_TASK(cmd)->t_state_lock, flags); + list_for_each_entry_safe(task, task_tmp, + &T_TASK(cmd)->t_task_list, t_list) { + DEBUG_TS("task_no[%d] - Processing task %p\n", + task->task_no, task); + /* + * If the struct se_task has not been sent and is not active, + * remove the struct se_task from the execution queue. + */ + if (!atomic_read(&task->task_sent) && + !atomic_read(&task->task_active)) { + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, + flags); + transport_remove_task_from_execute_queue(task, + task->se_dev); + + DEBUG_TS("task_no[%d] - Removed from execute queue\n", + task->task_no); + spin_lock_irqsave(&T_TASK(cmd)->t_state_lock, flags); + continue; + } + + /* + * If the struct se_task is active, sleep until it is returned + * from the plugin. + */ + if (atomic_read(&task->task_active)) { + atomic_set(&task->task_stop, 1); + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, + flags); + + DEBUG_TS("task_no[%d] - Waiting to complete\n", + task->task_no); + wait_for_completion(&task->task_stop_comp); + DEBUG_TS("task_no[%d] - Stopped successfully\n", + task->task_no); + + spin_lock_irqsave(&T_TASK(cmd)->t_state_lock, flags); + atomic_dec(&T_TASK(cmd)->t_task_cdbs_left); + + atomic_set(&task->task_active, 0); + atomic_set(&task->task_stop, 0); + } else { + DEBUG_TS("task_no[%d] - Did nothing\n", task->task_no); + ret++; + } + + __transport_stop_task_timer(task, &flags); + } + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, flags); + + return ret; +} + +static void transport_failure_reset_queue_depth(struct se_device *dev) +{ + unsigned long flags; + + spin_lock_irqsave(&SE_HBA(dev)->hba_queue_lock, flags);; + atomic_inc(&dev->depth_left); + atomic_inc(&SE_HBA(dev)->left_queue_depth); + spin_unlock_irqrestore(&SE_HBA(dev)->hba_queue_lock, flags); +} + +/* + * Handle SAM-esque emulation for generic transport request failures. + */ +static void transport_generic_request_failure( + struct se_cmd *cmd, + struct se_device *dev, + int complete, + int sc) +{ + DEBUG_GRF("-----[ Storage Engine Exception for cmd: %p ITT: 0x%08x" + " CDB: 0x%02x\n", cmd, CMD_TFO(cmd)->get_task_tag(cmd), + T_TASK(cmd)->t_task_cdb[0]); + DEBUG_GRF("-----[ i_state: %d t_state/def_t_state:" + " %d/%d transport_error_status: %d\n", + CMD_TFO(cmd)->get_cmd_state(cmd), + cmd->t_state, cmd->deferred_t_state, + cmd->transport_error_status); + DEBUG_GRF("-----[ t_task_cdbs: %d t_task_cdbs_left: %d" + " t_task_cdbs_sent: %d t_task_cdbs_ex_left: %d --" + " t_transport_active: %d t_transport_stop: %d" + " t_transport_sent: %d\n", T_TASK(cmd)->t_task_cdbs, + atomic_read(&T_TASK(cmd)->t_task_cdbs_left), + atomic_read(&T_TASK(cmd)->t_task_cdbs_sent), + atomic_read(&T_TASK(cmd)->t_task_cdbs_ex_left), + atomic_read(&T_TASK(cmd)->t_transport_active), + atomic_read(&T_TASK(cmd)->t_transport_stop), + atomic_read(&T_TASK(cmd)->t_transport_sent)); + + transport_stop_all_task_timers(cmd); + + if (dev) + transport_failure_reset_queue_depth(dev); + /* + * For SAM Task Attribute emulation for failed struct se_cmd + */ + if (cmd->se_dev->dev_task_attr_type == SAM_TASK_ATTR_EMULATED) + transport_complete_task_attr(cmd); + + if (complete) { + transport_direct_request_timeout(cmd); + cmd->transport_error_status = PYX_TRANSPORT_LU_COMM_FAILURE; + } + + switch (cmd->transport_error_status) { + case PYX_TRANSPORT_UNKNOWN_SAM_OPCODE: + cmd->scsi_sense_reason = TCM_UNSUPPORTED_SCSI_OPCODE; + break; + case PYX_TRANSPORT_REQ_TOO_MANY_SECTORS: + cmd->scsi_sense_reason = TCM_SECTOR_COUNT_TOO_MANY; + break; + case PYX_TRANSPORT_INVALID_CDB_FIELD: + cmd->scsi_sense_reason = TCM_INVALID_CDB_FIELD; + break; + case PYX_TRANSPORT_INVALID_PARAMETER_LIST: + cmd->scsi_sense_reason = TCM_INVALID_PARAMETER_LIST; + break; + case PYX_TRANSPORT_OUT_OF_MEMORY_RESOURCES: + if (!sc) + transport_new_cmd_failure(cmd); + /* + * Currently for PYX_TRANSPORT_OUT_OF_MEMORY_RESOURCES, + * we force this session to fall back to session + * recovery. + */ + CMD_TFO(cmd)->fall_back_to_erl0(cmd->se_sess); + CMD_TFO(cmd)->stop_session(cmd->se_sess, 0, 0); + + goto check_stop; + case PYX_TRANSPORT_LU_COMM_FAILURE: + case PYX_TRANSPORT_ILLEGAL_REQUEST: + cmd->scsi_sense_reason = TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE; + break; + case PYX_TRANSPORT_UNKNOWN_MODE_PAGE: + cmd->scsi_sense_reason = TCM_UNKNOWN_MODE_PAGE; + break; + case PYX_TRANSPORT_WRITE_PROTECTED: + cmd->scsi_sense_reason = TCM_WRITE_PROTECTED; + break; + case PYX_TRANSPORT_RESERVATION_CONFLICT: + /* + * No SENSE Data payload for this case, set SCSI Status + * and queue the response to $FABRIC_MOD. + * + * Uses linux/include/scsi/scsi.h SAM status codes defs + */ + cmd->scsi_status = SAM_STAT_RESERVATION_CONFLICT; + /* + * For UA Interlock Code 11b, a RESERVATION CONFLICT will + * establish a UNIT ATTENTION with PREVIOUS RESERVATION + * CONFLICT STATUS. + * + * See spc4r17, section 7.4.6 Control Mode Page, Table 349 + */ + if (SE_SESS(cmd) && + DEV_ATTRIB(cmd->se_dev)->emulate_ua_intlck_ctrl == 2) + core_scsi3_ua_allocate(SE_SESS(cmd)->se_node_acl, + cmd->orig_fe_lun, 0x2C, + ASCQ_2CH_PREVIOUS_RESERVATION_CONFLICT_STATUS); + + CMD_TFO(cmd)->queue_status(cmd); + goto check_stop; + case PYX_TRANSPORT_USE_SENSE_REASON: + /* + * struct se_cmd->scsi_sense_reason already set + */ + break; + default: + printk(KERN_ERR "Unknown transport error for CDB 0x%02x: %d\n", + T_TASK(cmd)->t_task_cdb[0], + cmd->transport_error_status); + cmd->scsi_sense_reason = TCM_UNSUPPORTED_SCSI_OPCODE; + break; + } + + if (!sc) + transport_new_cmd_failure(cmd); + else + transport_send_check_condition_and_sense(cmd, + cmd->scsi_sense_reason, 0); +check_stop: + transport_lun_remove_cmd(cmd); + if (!(transport_cmd_check_stop_to_fabric(cmd))) + ; +} + +static void transport_direct_request_timeout(struct se_cmd *cmd) +{ + unsigned long flags; + + spin_lock_irqsave(&T_TASK(cmd)->t_state_lock, flags); + if (!(atomic_read(&T_TASK(cmd)->t_transport_timeout))) { + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, flags); + return; + } + if (atomic_read(&T_TASK(cmd)->t_task_cdbs_timeout_left)) { + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, flags); + return; + } + + atomic_sub(atomic_read(&T_TASK(cmd)->t_transport_timeout), + &T_TASK(cmd)->t_se_count); + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, flags); +} + +static void transport_generic_request_timeout(struct se_cmd *cmd) +{ + unsigned long flags; + + /* + * Reset T_TASK(cmd)->t_se_count to allow transport_generic_remove() + * to allow last call to free memory resources. + */ + spin_lock_irqsave(&T_TASK(cmd)->t_state_lock, flags); + if (atomic_read(&T_TASK(cmd)->t_transport_timeout) > 1) { + int tmp = (atomic_read(&T_TASK(cmd)->t_transport_timeout) - 1); + + atomic_sub(tmp, &T_TASK(cmd)->t_se_count); + } + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, flags); + + transport_generic_remove(cmd, 0, 0); +} + +static int +transport_generic_allocate_buf(struct se_cmd *cmd, u32 data_length) +{ + unsigned char *buf; + + buf = kzalloc(data_length, GFP_KERNEL); + if (!(buf)) { + printk(KERN_ERR "Unable to allocate memory for buffer\n"); + return -1; + } + + T_TASK(cmd)->t_tasks_se_num = 0; + T_TASK(cmd)->t_task_buf = buf; + + return 0; +} + +static inline u32 transport_lba_21(unsigned char *cdb) +{ + return ((cdb[1] & 0x1f) << 16) | (cdb[2] << 8) | cdb[3]; +} + +static inline u32 transport_lba_32(unsigned char *cdb) +{ + return (cdb[2] << 24) | (cdb[3] << 16) | (cdb[4] << 8) | cdb[5]; +} + +static inline unsigned long long transport_lba_64(unsigned char *cdb) +{ + unsigned int __v1, __v2; + + __v1 = (cdb[2] << 24) | (cdb[3] << 16) | (cdb[4] << 8) | cdb[5]; + __v2 = (cdb[6] << 24) | (cdb[7] << 16) | (cdb[8] << 8) | cdb[9]; + + return ((unsigned long long)__v2) | (unsigned long long)__v1 << 32; +} + +/* + * For VARIABLE_LENGTH_CDB w/ 32 byte extended CDBs + */ +static inline unsigned long long transport_lba_64_ext(unsigned char *cdb) +{ + unsigned int __v1, __v2; + + __v1 = (cdb[12] << 24) | (cdb[13] << 16) | (cdb[14] << 8) | cdb[15]; + __v2 = (cdb[16] << 24) | (cdb[17] << 16) | (cdb[18] << 8) | cdb[19]; + + return ((unsigned long long)__v2) | (unsigned long long)__v1 << 32; +} + +static void transport_set_supported_SAM_opcode(struct se_cmd *se_cmd) +{ + unsigned long flags; + + spin_lock_irqsave(&T_TASK(se_cmd)->t_state_lock, flags); + se_cmd->se_cmd_flags |= SCF_SUPPORTED_SAM_OPCODE; + spin_unlock_irqrestore(&T_TASK(se_cmd)->t_state_lock, flags); +} + +/* + * Called from interrupt context. + */ +static void transport_task_timeout_handler(unsigned long data) +{ + struct se_task *task = (struct se_task *)data; + struct se_cmd *cmd = TASK_CMD(task); + unsigned long flags; + + DEBUG_TT("transport task timeout fired! task: %p cmd: %p\n", task, cmd); + + spin_lock_irqsave(&T_TASK(cmd)->t_state_lock, flags); + if (task->task_flags & TF_STOP) { + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, flags); + return; + } + task->task_flags &= ~TF_RUNNING; + + /* + * Determine if transport_complete_task() has already been called. + */ + if (!(atomic_read(&task->task_active))) { + DEBUG_TT("transport task: %p cmd: %p timeout task_active" + " == 0\n", task, cmd); + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, flags); + return; + } + + atomic_inc(&T_TASK(cmd)->t_se_count); + atomic_inc(&T_TASK(cmd)->t_transport_timeout); + T_TASK(cmd)->t_tasks_failed = 1; + + atomic_set(&task->task_timeout, 1); + task->task_error_status = PYX_TRANSPORT_TASK_TIMEOUT; + task->task_scsi_status = 1; + + if (atomic_read(&task->task_stop)) { + DEBUG_TT("transport task: %p cmd: %p timeout task_stop" + " == 1\n", task, cmd); + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, flags); + complete(&task->task_stop_comp); + return; + } + + if (!(atomic_dec_and_test(&T_TASK(cmd)->t_task_cdbs_left))) { + DEBUG_TT("transport task: %p cmd: %p timeout non zero" + " t_task_cdbs_left\n", task, cmd); + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, flags); + return; + } + DEBUG_TT("transport task: %p cmd: %p timeout ZERO t_task_cdbs_left\n", + task, cmd); + + cmd->t_state = TRANSPORT_COMPLETE_FAILURE; + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, flags); + + transport_add_cmd_to_queue(cmd, TRANSPORT_COMPLETE_FAILURE); +} + +/* + * Called with T_TASK(cmd)->t_state_lock held. + */ +static void transport_start_task_timer(struct se_task *task) +{ + struct se_device *dev = task->se_dev; + int timeout; + + if (task->task_flags & TF_RUNNING) + return; + /* + * If the task_timeout is disabled, exit now. + */ + timeout = DEV_ATTRIB(dev)->task_timeout; + if (!(timeout)) + return; + + init_timer(&task->task_timer); + task->task_timer.expires = (get_jiffies_64() + timeout * HZ); + task->task_timer.data = (unsigned long) task; + task->task_timer.function = transport_task_timeout_handler; + + task->task_flags |= TF_RUNNING; + add_timer(&task->task_timer); +#if 0 + printk(KERN_INFO "Starting task timer for cmd: %p task: %p seconds:" + " %d\n", task->task_se_cmd, task, timeout); +#endif +} + +/* + * Called with spin_lock_irq(&T_TASK(cmd)->t_state_lock) held. + */ +void __transport_stop_task_timer(struct se_task *task, unsigned long *flags) +{ + struct se_cmd *cmd = TASK_CMD(task); + + if (!(task->task_flags & TF_RUNNING)) + return; + + task->task_flags |= TF_STOP; + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, *flags); + + del_timer_sync(&task->task_timer); + + spin_lock_irqsave(&T_TASK(cmd)->t_state_lock, *flags); + task->task_flags &= ~TF_RUNNING; + task->task_flags &= ~TF_STOP; +} + +static void transport_stop_all_task_timers(struct se_cmd *cmd) +{ + struct se_task *task = NULL, *task_tmp; + unsigned long flags; + + spin_lock_irqsave(&T_TASK(cmd)->t_state_lock, flags); + list_for_each_entry_safe(task, task_tmp, + &T_TASK(cmd)->t_task_list, t_list) + __transport_stop_task_timer(task, &flags); + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, flags); +} + +static inline int transport_tcq_window_closed(struct se_device *dev) +{ + if (dev->dev_tcq_window_closed++ < + PYX_TRANSPORT_WINDOW_CLOSED_THRESHOLD) { + msleep(PYX_TRANSPORT_WINDOW_CLOSED_WAIT_SHORT); + } else + msleep(PYX_TRANSPORT_WINDOW_CLOSED_WAIT_LONG); + + wake_up_interruptible(&dev->dev_queue_obj->thread_wq); + return 0; +} + +/* + * Called from Fabric Module context from transport_execute_tasks() + * + * The return of this function determins if the tasks from struct se_cmd + * get added to the execution queue in transport_execute_tasks(), + * or are added to the delayed or ordered lists here. + */ +static inline int transport_execute_task_attr(struct se_cmd *cmd) +{ + if (SE_DEV(cmd)->dev_task_attr_type != SAM_TASK_ATTR_EMULATED) + return 1; + /* + * Check for the existance of HEAD_OF_QUEUE, and if true return 1 + * to allow the passed struct se_cmd list of tasks to the front of the list. + */ + if (cmd->sam_task_attr == TASK_ATTR_HOQ) { + atomic_inc(&SE_DEV(cmd)->dev_hoq_count); + smp_mb__after_atomic_inc(); + DEBUG_STA("Added HEAD_OF_QUEUE for CDB:" + " 0x%02x, se_ordered_id: %u\n", + T_TASK(cmd)->t_task_cdb[0], + cmd->se_ordered_id); + return 1; + } else if (cmd->sam_task_attr == TASK_ATTR_ORDERED) { + spin_lock(&SE_DEV(cmd)->ordered_cmd_lock); + list_add_tail(&cmd->se_ordered_list, + &SE_DEV(cmd)->ordered_cmd_list); + spin_unlock(&SE_DEV(cmd)->ordered_cmd_lock); + + atomic_inc(&SE_DEV(cmd)->dev_ordered_sync); + smp_mb__after_atomic_inc(); + + DEBUG_STA("Added ORDERED for CDB: 0x%02x to ordered" + " list, se_ordered_id: %u\n", + T_TASK(cmd)->t_task_cdb[0], + cmd->se_ordered_id); + /* + * Add ORDERED command to tail of execution queue if + * no other older commands exist that need to be + * completed first. + */ + if (!(atomic_read(&SE_DEV(cmd)->simple_cmds))) + return 1; + } else { + /* + * For SIMPLE and UNTAGGED Task Attribute commands + */ + atomic_inc(&SE_DEV(cmd)->simple_cmds); + smp_mb__after_atomic_inc(); + } + /* + * Otherwise if one or more outstanding ORDERED task attribute exist, + * add the dormant task(s) built for the passed struct se_cmd to the + * execution queue and become in Active state for this struct se_device. + */ + if (atomic_read(&SE_DEV(cmd)->dev_ordered_sync) != 0) { + /* + * Otherwise, add cmd w/ tasks to delayed cmd queue that + * will be drained upon competion of HEAD_OF_QUEUE task. + */ + spin_lock(&SE_DEV(cmd)->delayed_cmd_lock); + cmd->se_cmd_flags |= SCF_DELAYED_CMD_FROM_SAM_ATTR; + list_add_tail(&cmd->se_delayed_list, + &SE_DEV(cmd)->delayed_cmd_list); + spin_unlock(&SE_DEV(cmd)->delayed_cmd_lock); + + DEBUG_STA("Added CDB: 0x%02x Task Attr: 0x%02x to" + " delayed CMD list, se_ordered_id: %u\n", + T_TASK(cmd)->t_task_cdb[0], cmd->sam_task_attr, + cmd->se_ordered_id); + /* + * Return zero to let transport_execute_tasks() know + * not to add the delayed tasks to the execution list. + */ + return 0; + } + /* + * Otherwise, no ORDERED task attributes exist.. + */ + return 1; +} + +/* + * Called from fabric module context in transport_generic_new_cmd() and + * transport_generic_process_write() + */ +static int transport_execute_tasks(struct se_cmd *cmd) +{ + int add_tasks; + + if (!(cmd->se_cmd_flags & SCF_SE_DISABLE_ONLINE_CHECK)) { + if (se_dev_check_online(cmd->se_orig_obj_ptr) != 0) { + cmd->transport_error_status = + PYX_TRANSPORT_LU_COMM_FAILURE; + transport_generic_request_failure(cmd, NULL, 0, 1); + return 0; + } + } + /* + * Call transport_cmd_check_stop() to see if a fabric exception + * has occured that prevents execution. + */ + if (!(transport_cmd_check_stop(cmd, 0, TRANSPORT_PROCESSING))) { + /* + * Check for SAM Task Attribute emulation and HEAD_OF_QUEUE + * attribute for the tasks of the received struct se_cmd CDB + */ + add_tasks = transport_execute_task_attr(cmd); + if (add_tasks == 0) + goto execute_tasks; + /* + * This calls transport_add_tasks_from_cmd() to handle + * HEAD_OF_QUEUE ordering for SAM Task Attribute emulation + * (if enabled) in __transport_add_task_to_execute_queue() and + * transport_add_task_check_sam_attr(). + */ + transport_add_tasks_from_cmd(cmd); + } + /* + * Kick the execution queue for the cmd associated struct se_device + * storage object. + */ +execute_tasks: + __transport_execute_tasks(SE_DEV(cmd)); + return 0; +} + +/* + * Called to check struct se_device tcq depth window, and once open pull struct se_task + * from struct se_device->execute_task_list and + * + * Called from transport_processing_thread() + */ +static int __transport_execute_tasks(struct se_device *dev) +{ + int error; + struct se_cmd *cmd = NULL; + struct se_task *task; + unsigned long flags; + + /* + * Check if there is enough room in the device and HBA queue to send + * struct se_transport_task's to the selected transport. + */ +check_depth: + spin_lock_irqsave(&SE_HBA(dev)->hba_queue_lock, flags); + if (!(atomic_read(&dev->depth_left)) || + !(atomic_read(&SE_HBA(dev)->left_queue_depth))) { + spin_unlock_irqrestore(&SE_HBA(dev)->hba_queue_lock, flags); + return transport_tcq_window_closed(dev); + } + dev->dev_tcq_window_closed = 0; + + spin_lock(&dev->execute_task_lock); + task = transport_get_task_from_execute_queue(dev); + spin_unlock(&dev->execute_task_lock); + + if (!task) { + spin_unlock_irqrestore(&SE_HBA(dev)->hba_queue_lock, flags); + return 0; + } + + atomic_dec(&dev->depth_left); + atomic_dec(&SE_HBA(dev)->left_queue_depth); + spin_unlock_irqrestore(&SE_HBA(dev)->hba_queue_lock, flags); + + cmd = TASK_CMD(task); + + spin_lock_irqsave(&T_TASK(cmd)->t_state_lock, flags); + atomic_set(&task->task_active, 1); + atomic_set(&task->task_sent, 1); + atomic_inc(&T_TASK(cmd)->t_task_cdbs_sent); + + if (atomic_read(&T_TASK(cmd)->t_task_cdbs_sent) == + T_TASK(cmd)->t_task_cdbs) + atomic_set(&cmd->transport_sent, 1); + + transport_start_task_timer(task); + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, flags); + /* + * The struct se_cmd->transport_emulate_cdb() function pointer is used + * to grab REPORT_LUNS CDBs before they hit the + * struct se_subsystem_api->do_task() caller below. + */ + if (cmd->transport_emulate_cdb) { + error = cmd->transport_emulate_cdb(cmd); + if (error != 0) { + cmd->transport_error_status = error; + atomic_set(&task->task_active, 0); + atomic_set(&cmd->transport_sent, 0); + transport_stop_tasks_for_cmd(cmd); + transport_generic_request_failure(cmd, dev, 0, 1); + goto check_depth; + } + /* + * Handle the successful completion for transport_emulate_cdb() + * for synchronous operation, following SCF_EMULATE_CDB_ASYNC + * Otherwise the caller is expected to complete the task with + * proper status. + */ + if (!(cmd->se_cmd_flags & SCF_EMULATE_CDB_ASYNC)) { + cmd->scsi_status = SAM_STAT_GOOD; + task->task_scsi_status = GOOD; + transport_complete_task(task, 1); + } + } else { + /* + * Currently for all virtual TCM plugins including IBLOCK, FILEIO and + * RAMDISK we use the internal transport_emulate_control_cdb() logic + * with struct se_subsystem_api callers for the primary SPC-3 TYPE_DISK + * LUN emulation code. + * + * For TCM/pSCSI and all other SCF_SCSI_DATA_SG_IO_CDB I/O tasks we + * call ->do_task() directly and let the underlying TCM subsystem plugin + * code handle the CDB emulation. + */ + if ((TRANSPORT(dev)->transport_type != TRANSPORT_PLUGIN_PHBA_PDEV) && + (!(TASK_CMD(task)->se_cmd_flags & SCF_SCSI_DATA_SG_IO_CDB))) + error = transport_emulate_control_cdb(task); + else + error = TRANSPORT(dev)->do_task(task); + + if (error != 0) { + cmd->transport_error_status = error; + atomic_set(&task->task_active, 0); + atomic_set(&cmd->transport_sent, 0); + transport_stop_tasks_for_cmd(cmd); + transport_generic_request_failure(cmd, dev, 0, 1); + } + } + + goto check_depth; + + return 0; +} + +void transport_new_cmd_failure(struct se_cmd *se_cmd) +{ + unsigned long flags; + /* + * Any unsolicited data will get dumped for failed command inside of + * the fabric plugin + */ + spin_lock_irqsave(&T_TASK(se_cmd)->t_state_lock, flags); + se_cmd->se_cmd_flags |= SCF_SE_CMD_FAILED; + se_cmd->se_cmd_flags |= SCF_SCSI_CDB_EXCEPTION; + spin_unlock_irqrestore(&T_TASK(se_cmd)->t_state_lock, flags); + + CMD_TFO(se_cmd)->new_cmd_failure(se_cmd); +} + +static void transport_nop_wait_for_tasks(struct se_cmd *, int, int); + +static inline u32 transport_get_sectors_6( + unsigned char *cdb, + struct se_cmd *cmd, + int *ret) +{ + struct se_device *dev = SE_LUN(cmd)->lun_se_dev; + + /* + * Assume TYPE_DISK for non struct se_device objects. + * Use 8-bit sector value. + */ + if (!dev) + goto type_disk; + + /* + * Use 24-bit allocation length for TYPE_TAPE. + */ + if (TRANSPORT(dev)->get_device_type(dev) == TYPE_TAPE) + return (u32)(cdb[2] << 16) + (cdb[3] << 8) + cdb[4]; + + /* + * Everything else assume TYPE_DISK Sector CDB location. + * Use 8-bit sector value. + */ +type_disk: + return (u32)cdb[4]; +} + +static inline u32 transport_get_sectors_10( + unsigned char *cdb, + struct se_cmd *cmd, + int *ret) +{ + struct se_device *dev = SE_LUN(cmd)->lun_se_dev; + + /* + * Assume TYPE_DISK for non struct se_device objects. + * Use 16-bit sector value. + */ + if (!dev) + goto type_disk; + + /* + * XXX_10 is not defined in SSC, throw an exception + */ + if (TRANSPORT(dev)->get_device_type(dev) == TYPE_TAPE) { + *ret = -1; + return 0; + } + + /* + * Everything else assume TYPE_DISK Sector CDB location. + * Use 16-bit sector value. + */ +type_disk: + return (u32)(cdb[7] << 8) + cdb[8]; +} + +static inline u32 transport_get_sectors_12( + unsigned char *cdb, + struct se_cmd *cmd, + int *ret) +{ + struct se_device *dev = SE_LUN(cmd)->lun_se_dev; + + /* + * Assume TYPE_DISK for non struct se_device objects. + * Use 32-bit sector value. + */ + if (!dev) + goto type_disk; + + /* + * XXX_12 is not defined in SSC, throw an exception + */ + if (TRANSPORT(dev)->get_device_type(dev) == TYPE_TAPE) { + *ret = -1; + return 0; + } + + /* + * Everything else assume TYPE_DISK Sector CDB location. + * Use 32-bit sector value. + */ +type_disk: + return (u32)(cdb[6] << 24) + (cdb[7] << 16) + (cdb[8] << 8) + cdb[9]; +} + +static inline u32 transport_get_sectors_16( + unsigned char *cdb, + struct se_cmd *cmd, + int *ret) +{ + struct se_device *dev = SE_LUN(cmd)->lun_se_dev; + + /* + * Assume TYPE_DISK for non struct se_device objects. + * Use 32-bit sector value. + */ + if (!dev) + goto type_disk; + + /* + * Use 24-bit allocation length for TYPE_TAPE. + */ + if (TRANSPORT(dev)->get_device_type(dev) == TYPE_TAPE) + return (u32)(cdb[12] << 16) + (cdb[13] << 8) + cdb[14]; + +type_disk: + return (u32)(cdb[10] << 24) + (cdb[11] << 16) + + (cdb[12] << 8) + cdb[13]; +} + +/* + * Used for VARIABLE_LENGTH_CDB WRITE_32 and READ_32 variants + */ +static inline u32 transport_get_sectors_32( + unsigned char *cdb, + struct se_cmd *cmd, + int *ret) +{ + /* + * Assume TYPE_DISK for non struct se_device objects. + * Use 32-bit sector value. + */ + return (u32)(cdb[28] << 24) + (cdb[29] << 16) + + (cdb[30] << 8) + cdb[31]; + +} + +static inline u32 transport_get_size( + u32 sectors, + unsigned char *cdb, + struct se_cmd *cmd) +{ + struct se_device *dev = SE_DEV(cmd); + + if (TRANSPORT(dev)->get_device_type(dev) == TYPE_TAPE) { + if (cdb[1] & 1) { /* sectors */ + return DEV_ATTRIB(dev)->block_size * sectors; + } else /* bytes */ + return sectors; + } +#if 0 + printk(KERN_INFO "Returning block_size: %u, sectors: %u == %u for" + " %s object\n", DEV_ATTRIB(dev)->block_size, sectors, + DEV_ATTRIB(dev)->block_size * sectors, + TRANSPORT(dev)->name); +#endif + return DEV_ATTRIB(dev)->block_size * sectors; +} + +unsigned char transport_asciihex_to_binaryhex(unsigned char val[2]) +{ + unsigned char result = 0; + /* + * MSB + */ + if ((val[0] >= 'a') && (val[0] <= 'f')) + result = ((val[0] - 'a' + 10) & 0xf) << 4; + else + if ((val[0] >= 'A') && (val[0] <= 'F')) + result = ((val[0] - 'A' + 10) & 0xf) << 4; + else /* digit */ + result = ((val[0] - '0') & 0xf) << 4; + /* + * LSB + */ + if ((val[1] >= 'a') && (val[1] <= 'f')) + result |= ((val[1] - 'a' + 10) & 0xf); + else + if ((val[1] >= 'A') && (val[1] <= 'F')) + result |= ((val[1] - 'A' + 10) & 0xf); + else /* digit */ + result |= ((val[1] - '0') & 0xf); + + return result; +} +EXPORT_SYMBOL(transport_asciihex_to_binaryhex); + +static void transport_xor_callback(struct se_cmd *cmd) +{ + unsigned char *buf, *addr; + struct se_mem *se_mem; + unsigned int offset; + int i; + /* + * From sbc3r22.pdf section 5.48 XDWRITEREAD (10) command + * + * 1) read the specified logical block(s); + * 2) transfer logical blocks from the data-out buffer; + * 3) XOR the logical blocks transferred from the data-out buffer with + * the logical blocks read, storing the resulting XOR data in a buffer; + * 4) if the DISABLE WRITE bit is set to zero, then write the logical + * blocks transferred from the data-out buffer; and + * 5) transfer the resulting XOR data to the data-in buffer. + */ + buf = kmalloc(cmd->data_length, GFP_KERNEL); + if (!(buf)) { + printk(KERN_ERR "Unable to allocate xor_callback buf\n"); + return; + } + /* + * Copy the scatterlist WRITE buffer located at T_TASK(cmd)->t_mem_list + * into the locally allocated *buf + */ + transport_memcpy_se_mem_read_contig(cmd, buf, T_TASK(cmd)->t_mem_list); + /* + * Now perform the XOR against the BIDI read memory located at + * T_TASK(cmd)->t_mem_bidi_list + */ + + offset = 0; + list_for_each_entry(se_mem, T_TASK(cmd)->t_mem_bidi_list, se_list) { + addr = (unsigned char *)kmap_atomic(se_mem->se_page, KM_USER0); + if (!(addr)) + goto out; + + for (i = 0; i < se_mem->se_len; i++) + *(addr + se_mem->se_off + i) ^= *(buf + offset + i); + + offset += se_mem->se_len; + kunmap_atomic(addr, KM_USER0); + } +out: + kfree(buf); +} + +/* + * Used to obtain Sense Data from underlying Linux/SCSI struct scsi_cmnd + */ +static int transport_get_sense_data(struct se_cmd *cmd) +{ + unsigned char *buffer = cmd->sense_buffer, *sense_buffer = NULL; + struct se_device *dev; + struct se_task *task = NULL, *task_tmp; + unsigned long flags; + u32 offset = 0; + + if (!SE_LUN(cmd)) { + printk(KERN_ERR "SE_LUN(cmd) is NULL\n"); + return -1; + } + spin_lock_irqsave(&T_TASK(cmd)->t_state_lock, flags); + if (cmd->se_cmd_flags & SCF_SENT_CHECK_CONDITION) { + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, flags); + return 0; + } + + list_for_each_entry_safe(task, task_tmp, + &T_TASK(cmd)->t_task_list, t_list) { + + if (!task->task_sense) + continue; + + dev = task->se_dev; + if (!(dev)) + continue; + + if (!TRANSPORT(dev)->get_sense_buffer) { + printk(KERN_ERR "TRANSPORT(dev)->get_sense_buffer" + " is NULL\n"); + continue; + } + + sense_buffer = TRANSPORT(dev)->get_sense_buffer(task); + if (!(sense_buffer)) { + printk(KERN_ERR "ITT[0x%08x]_TASK[%d]: Unable to locate" + " sense buffer for task with sense\n", + CMD_TFO(cmd)->get_task_tag(cmd), task->task_no); + continue; + } + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, flags); + + offset = CMD_TFO(cmd)->set_fabric_sense_len(cmd, + TRANSPORT_SENSE_BUFFER); + + memcpy((void *)&buffer[offset], (void *)sense_buffer, + TRANSPORT_SENSE_BUFFER); + cmd->scsi_status = task->task_scsi_status; + /* Automatically padded */ + cmd->scsi_sense_length = + (TRANSPORT_SENSE_BUFFER + offset); + + printk(KERN_INFO "HBA_[%u]_PLUG[%s]: Set SAM STATUS: 0x%02x" + " and sense\n", + dev->se_hba->hba_id, TRANSPORT(dev)->name, + cmd->scsi_status); + return 0; + } + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, flags); + + return -1; +} + +static int transport_allocate_resources(struct se_cmd *cmd) +{ + u32 length = cmd->data_length; + + if ((cmd->se_cmd_flags & SCF_SCSI_DATA_SG_IO_CDB) || + (cmd->se_cmd_flags & SCF_SCSI_CONTROL_SG_IO_CDB)) + return transport_generic_get_mem(cmd, length, PAGE_SIZE); + else if (cmd->se_cmd_flags & SCF_SCSI_CONTROL_NONSG_IO_CDB) + return transport_generic_allocate_buf(cmd, length); + else + return 0; +} + +static int +transport_handle_reservation_conflict(struct se_cmd *cmd) +{ + cmd->transport_wait_for_tasks = &transport_nop_wait_for_tasks; + cmd->se_cmd_flags |= SCF_SCSI_CDB_EXCEPTION; + cmd->se_cmd_flags |= SCF_SCSI_RESERVATION_CONFLICT; + cmd->scsi_status = SAM_STAT_RESERVATION_CONFLICT; + /* + * For UA Interlock Code 11b, a RESERVATION CONFLICT will + * establish a UNIT ATTENTION with PREVIOUS RESERVATION + * CONFLICT STATUS. + * + * See spc4r17, section 7.4.6 Control Mode Page, Table 349 + */ + if (SE_SESS(cmd) && + DEV_ATTRIB(cmd->se_dev)->emulate_ua_intlck_ctrl == 2) + core_scsi3_ua_allocate(SE_SESS(cmd)->se_node_acl, + cmd->orig_fe_lun, 0x2C, + ASCQ_2CH_PREVIOUS_RESERVATION_CONFLICT_STATUS); + return -2; +} + +/* transport_generic_cmd_sequencer(): + * + * Generic Command Sequencer that should work for most DAS transport + * drivers. + * + * Called from transport_generic_allocate_tasks() in the $FABRIC_MOD + * RX Thread. + * + * FIXME: Need to support other SCSI OPCODES where as well. + */ +static int transport_generic_cmd_sequencer( + struct se_cmd *cmd, + unsigned char *cdb) +{ + struct se_device *dev = SE_DEV(cmd); + struct se_subsystem_dev *su_dev = dev->se_sub_dev; + int ret = 0, sector_ret = 0, passthrough; + u32 sectors = 0, size = 0, pr_reg_type = 0; + u16 service_action; + u8 alua_ascq = 0; + /* + * Check for an existing UNIT ATTENTION condition + */ + if (core_scsi3_ua_check(cmd, cdb) < 0) { + cmd->transport_wait_for_tasks = + &transport_nop_wait_for_tasks; + cmd->se_cmd_flags |= SCF_SCSI_CDB_EXCEPTION; + cmd->scsi_sense_reason = TCM_CHECK_CONDITION_UNIT_ATTENTION; + return -2; + } + /* + * Check status of Asymmetric Logical Unit Assignment port + */ + ret = T10_ALUA(su_dev)->alua_state_check(cmd, cdb, &alua_ascq); + if (ret != 0) { + cmd->transport_wait_for_tasks = &transport_nop_wait_for_tasks; + /* + * Set SCSI additional sense code (ASC) to 'LUN Not Accessable'; + * The ALUA additional sense code qualifier (ASCQ) is determined + * by the ALUA primary or secondary access state.. + */ + if (ret > 0) { +#if 0 + printk(KERN_INFO "[%s]: ALUA TG Port not available," + " SenseKey: NOT_READY, ASC/ASCQ: 0x04/0x%02x\n", + CMD_TFO(cmd)->get_fabric_name(), alua_ascq); +#endif + transport_set_sense_codes(cmd, 0x04, alua_ascq); + cmd->se_cmd_flags |= SCF_SCSI_CDB_EXCEPTION; + cmd->scsi_sense_reason = TCM_CHECK_CONDITION_NOT_READY; + return -2; + } + goto out_invalid_cdb_field; + } + /* + * Check status for SPC-3 Persistent Reservations + */ + if (T10_PR_OPS(su_dev)->t10_reservation_check(cmd, &pr_reg_type) != 0) { + if (T10_PR_OPS(su_dev)->t10_seq_non_holder( + cmd, cdb, pr_reg_type) != 0) + return transport_handle_reservation_conflict(cmd); + /* + * This means the CDB is allowed for the SCSI Initiator port + * when said port is *NOT* holding the legacy SPC-2 or + * SPC-3 Persistent Reservation. + */ + } + + switch (cdb[0]) { + case READ_6: + sectors = transport_get_sectors_6(cdb, cmd, §or_ret); + if (sector_ret) + goto out_unsupported_cdb; + size = transport_get_size(sectors, cdb, cmd); + cmd->transport_split_cdb = &split_cdb_XX_6; + T_TASK(cmd)->t_task_lba = transport_lba_21(cdb); + cmd->se_cmd_flags |= SCF_SCSI_DATA_SG_IO_CDB; + break; + case READ_10: + sectors = transport_get_sectors_10(cdb, cmd, §or_ret); + if (sector_ret) + goto out_unsupported_cdb; + size = transport_get_size(sectors, cdb, cmd); + cmd->transport_split_cdb = &split_cdb_XX_10; + T_TASK(cmd)->t_task_lba = transport_lba_32(cdb); + cmd->se_cmd_flags |= SCF_SCSI_DATA_SG_IO_CDB; + break; + case READ_12: + sectors = transport_get_sectors_12(cdb, cmd, §or_ret); + if (sector_ret) + goto out_unsupported_cdb; + size = transport_get_size(sectors, cdb, cmd); + cmd->transport_split_cdb = &split_cdb_XX_12; + T_TASK(cmd)->t_task_lba = transport_lba_32(cdb); + cmd->se_cmd_flags |= SCF_SCSI_DATA_SG_IO_CDB; + break; + case READ_16: + sectors = transport_get_sectors_16(cdb, cmd, §or_ret); + if (sector_ret) + goto out_unsupported_cdb; + size = transport_get_size(sectors, cdb, cmd); + cmd->transport_split_cdb = &split_cdb_XX_16; + T_TASK(cmd)->t_task_lba = transport_lba_64(cdb); + cmd->se_cmd_flags |= SCF_SCSI_DATA_SG_IO_CDB; + break; + case WRITE_6: + sectors = transport_get_sectors_6(cdb, cmd, §or_ret); + if (sector_ret) + goto out_unsupported_cdb; + size = transport_get_size(sectors, cdb, cmd); + cmd->transport_split_cdb = &split_cdb_XX_6; + T_TASK(cmd)->t_task_lba = transport_lba_21(cdb); + cmd->se_cmd_flags |= SCF_SCSI_DATA_SG_IO_CDB; + break; + case WRITE_10: + sectors = transport_get_sectors_10(cdb, cmd, §or_ret); + if (sector_ret) + goto out_unsupported_cdb; + size = transport_get_size(sectors, cdb, cmd); + cmd->transport_split_cdb = &split_cdb_XX_10; + T_TASK(cmd)->t_task_lba = transport_lba_32(cdb); + T_TASK(cmd)->t_tasks_fua = (cdb[1] & 0x8); + cmd->se_cmd_flags |= SCF_SCSI_DATA_SG_IO_CDB; + break; + case WRITE_12: + sectors = transport_get_sectors_12(cdb, cmd, §or_ret); + if (sector_ret) + goto out_unsupported_cdb; + size = transport_get_size(sectors, cdb, cmd); + cmd->transport_split_cdb = &split_cdb_XX_12; + T_TASK(cmd)->t_task_lba = transport_lba_32(cdb); + T_TASK(cmd)->t_tasks_fua = (cdb[1] & 0x8); + cmd->se_cmd_flags |= SCF_SCSI_DATA_SG_IO_CDB; + break; + case WRITE_16: + sectors = transport_get_sectors_16(cdb, cmd, §or_ret); + if (sector_ret) + goto out_unsupported_cdb; + size = transport_get_size(sectors, cdb, cmd); + cmd->transport_split_cdb = &split_cdb_XX_16; + T_TASK(cmd)->t_task_lba = transport_lba_64(cdb); + T_TASK(cmd)->t_tasks_fua = (cdb[1] & 0x8); + cmd->se_cmd_flags |= SCF_SCSI_DATA_SG_IO_CDB; + break; + case XDWRITEREAD_10: + if ((cmd->data_direction != DMA_TO_DEVICE) || + !(T_TASK(cmd)->t_tasks_bidi)) + goto out_invalid_cdb_field; + sectors = transport_get_sectors_10(cdb, cmd, §or_ret); + if (sector_ret) + goto out_unsupported_cdb; + size = transport_get_size(sectors, cdb, cmd); + cmd->transport_split_cdb = &split_cdb_XX_10; + T_TASK(cmd)->t_task_lba = transport_lba_32(cdb); + cmd->se_cmd_flags |= SCF_SCSI_DATA_SG_IO_CDB; + passthrough = (TRANSPORT(dev)->transport_type == + TRANSPORT_PLUGIN_PHBA_PDEV); + /* + * Skip the remaining assignments for TCM/PSCSI passthrough + */ + if (passthrough) + break; + /* + * Setup BIDI XOR callback to be run during transport_generic_complete_ok() + */ + cmd->transport_complete_callback = &transport_xor_callback; + T_TASK(cmd)->t_tasks_fua = (cdb[1] & 0x8); + break; + case VARIABLE_LENGTH_CMD: + service_action = get_unaligned_be16(&cdb[8]); + /* + * Determine if this is TCM/PSCSI device and we should disable + * internal emulation for this CDB. + */ + passthrough = (TRANSPORT(dev)->transport_type == + TRANSPORT_PLUGIN_PHBA_PDEV); + + switch (service_action) { + case XDWRITEREAD_32: + sectors = transport_get_sectors_32(cdb, cmd, §or_ret); + if (sector_ret) + goto out_unsupported_cdb; + size = transport_get_size(sectors, cdb, cmd); + /* + * Use WRITE_32 and READ_32 opcodes for the emulated + * XDWRITE_READ_32 logic. + */ + cmd->transport_split_cdb = &split_cdb_XX_32; + T_TASK(cmd)->t_task_lba = transport_lba_64_ext(cdb); + cmd->se_cmd_flags |= SCF_SCSI_DATA_SG_IO_CDB; + + /* + * Skip the remaining assignments for TCM/PSCSI passthrough + */ + if (passthrough) + break; + + /* + * Setup BIDI XOR callback to be run during + * transport_generic_complete_ok() + */ + cmd->transport_complete_callback = &transport_xor_callback; + T_TASK(cmd)->t_tasks_fua = (cdb[10] & 0x8); + break; + case WRITE_SAME_32: + sectors = transport_get_sectors_32(cdb, cmd, §or_ret); + if (sector_ret) + goto out_unsupported_cdb; + size = transport_get_size(sectors, cdb, cmd); + T_TASK(cmd)->t_task_lba = get_unaligned_be64(&cdb[12]); + cmd->se_cmd_flags |= SCF_SCSI_CONTROL_SG_IO_CDB; + + /* + * Skip the remaining assignments for TCM/PSCSI passthrough + */ + if (passthrough) + break; + + if ((cdb[10] & 0x04) || (cdb[10] & 0x02)) { + printk(KERN_ERR "WRITE_SAME PBDATA and LBDATA" + " bits not supported for Block Discard" + " Emulation\n"); + goto out_invalid_cdb_field; + } + /* + * Currently for the emulated case we only accept + * tpws with the UNMAP=1 bit set. + */ + if (!(cdb[10] & 0x08)) { + printk(KERN_ERR "WRITE_SAME w/o UNMAP bit not" + " supported for Block Discard Emulation\n"); + goto out_invalid_cdb_field; + } + break; + default: + printk(KERN_ERR "VARIABLE_LENGTH_CMD service action" + " 0x%04x not supported\n", service_action); + goto out_unsupported_cdb; + } + break; + case 0xa3: + if (TRANSPORT(dev)->get_device_type(dev) != TYPE_ROM) { + /* MAINTENANCE_IN from SCC-2 */ + /* + * Check for emulated MI_REPORT_TARGET_PGS. + */ + if (cdb[1] == MI_REPORT_TARGET_PGS) { + cmd->transport_emulate_cdb = + (T10_ALUA(su_dev)->alua_type == + SPC3_ALUA_EMULATED) ? + &core_emulate_report_target_port_groups : + NULL; + } + size = (cdb[6] << 24) | (cdb[7] << 16) | + (cdb[8] << 8) | cdb[9]; + } else { + /* GPCMD_SEND_KEY from multi media commands */ + size = (cdb[8] << 8) + cdb[9]; + } + cmd->se_cmd_flags |= SCF_SCSI_CONTROL_NONSG_IO_CDB; + break; + case MODE_SELECT: + size = cdb[4]; + cmd->se_cmd_flags |= SCF_SCSI_CONTROL_SG_IO_CDB; + break; + case MODE_SELECT_10: + size = (cdb[7] << 8) + cdb[8]; + cmd->se_cmd_flags |= SCF_SCSI_CONTROL_SG_IO_CDB; + break; + case MODE_SENSE: + size = cdb[4]; + cmd->se_cmd_flags |= SCF_SCSI_CONTROL_NONSG_IO_CDB; + break; + case MODE_SENSE_10: + case GPCMD_READ_BUFFER_CAPACITY: + case GPCMD_SEND_OPC: + case LOG_SELECT: + case LOG_SENSE: + size = (cdb[7] << 8) + cdb[8]; + cmd->se_cmd_flags |= SCF_SCSI_CONTROL_NONSG_IO_CDB; + break; + case READ_BLOCK_LIMITS: + size = READ_BLOCK_LEN; + cmd->se_cmd_flags |= SCF_SCSI_CONTROL_NONSG_IO_CDB; + break; + case GPCMD_GET_CONFIGURATION: + case GPCMD_READ_FORMAT_CAPACITIES: + case GPCMD_READ_DISC_INFO: + case GPCMD_READ_TRACK_RZONE_INFO: + size = (cdb[7] << 8) + cdb[8]; + cmd->se_cmd_flags |= SCF_SCSI_CONTROL_SG_IO_CDB; + break; + case PERSISTENT_RESERVE_IN: + case PERSISTENT_RESERVE_OUT: + cmd->transport_emulate_cdb = + (T10_RES(su_dev)->res_type == + SPC3_PERSISTENT_RESERVATIONS) ? + &core_scsi3_emulate_pr : NULL; + size = (cdb[7] << 8) + cdb[8]; + cmd->se_cmd_flags |= SCF_SCSI_CONTROL_NONSG_IO_CDB; + break; + case GPCMD_MECHANISM_STATUS: + case GPCMD_READ_DVD_STRUCTURE: + size = (cdb[8] << 8) + cdb[9]; + cmd->se_cmd_flags |= SCF_SCSI_CONTROL_SG_IO_CDB; + break; + case READ_POSITION: + size = READ_POSITION_LEN; + cmd->se_cmd_flags |= SCF_SCSI_CONTROL_NONSG_IO_CDB; + break; + case 0xa4: + if (TRANSPORT(dev)->get_device_type(dev) != TYPE_ROM) { + /* MAINTENANCE_OUT from SCC-2 + * + * Check for emulated MO_SET_TARGET_PGS. + */ + if (cdb[1] == MO_SET_TARGET_PGS) { + cmd->transport_emulate_cdb = + (T10_ALUA(su_dev)->alua_type == + SPC3_ALUA_EMULATED) ? + &core_emulate_set_target_port_groups : + NULL; + } + + size = (cdb[6] << 24) | (cdb[7] << 16) | + (cdb[8] << 8) | cdb[9]; + } else { + /* GPCMD_REPORT_KEY from multi media commands */ + size = (cdb[8] << 8) + cdb[9]; + } + cmd->se_cmd_flags |= SCF_SCSI_CONTROL_NONSG_IO_CDB; + break; + case INQUIRY: + size = (cdb[3] << 8) + cdb[4]; + /* + * Do implict HEAD_OF_QUEUE processing for INQUIRY. + * See spc4r17 section 5.3 + */ + if (SE_DEV(cmd)->dev_task_attr_type == SAM_TASK_ATTR_EMULATED) + cmd->sam_task_attr = TASK_ATTR_HOQ; + cmd->se_cmd_flags |= SCF_SCSI_CONTROL_NONSG_IO_CDB; + break; + case READ_BUFFER: + size = (cdb[6] << 16) + (cdb[7] << 8) + cdb[8]; + cmd->se_cmd_flags |= SCF_SCSI_CONTROL_NONSG_IO_CDB; + break; + case READ_CAPACITY: + size = READ_CAP_LEN; + cmd->se_cmd_flags |= SCF_SCSI_CONTROL_NONSG_IO_CDB; + break; + case READ_MEDIA_SERIAL_NUMBER: + case SECURITY_PROTOCOL_IN: + case SECURITY_PROTOCOL_OUT: + size = (cdb[6] << 24) | (cdb[7] << 16) | (cdb[8] << 8) | cdb[9]; + cmd->se_cmd_flags |= SCF_SCSI_CONTROL_NONSG_IO_CDB; + break; + case SERVICE_ACTION_IN: + case ACCESS_CONTROL_IN: + case ACCESS_CONTROL_OUT: + case EXTENDED_COPY: + case READ_ATTRIBUTE: + case RECEIVE_COPY_RESULTS: + case WRITE_ATTRIBUTE: + size = (cdb[10] << 24) | (cdb[11] << 16) | + (cdb[12] << 8) | cdb[13]; + cmd->se_cmd_flags |= SCF_SCSI_CONTROL_NONSG_IO_CDB; + break; + case RECEIVE_DIAGNOSTIC: + case SEND_DIAGNOSTIC: + size = (cdb[3] << 8) | cdb[4]; + cmd->se_cmd_flags |= SCF_SCSI_CONTROL_NONSG_IO_CDB; + break; +/* #warning FIXME: Figure out correct GPCMD_READ_CD blocksize. */ +#if 0 + case GPCMD_READ_CD: + sectors = (cdb[6] << 16) + (cdb[7] << 8) + cdb[8]; + size = (2336 * sectors); + cmd->se_cmd_flags |= SCF_SCSI_CONTROL_NONSG_IO_CDB; + break; +#endif + case READ_TOC: + size = cdb[8]; + cmd->se_cmd_flags |= SCF_SCSI_CONTROL_NONSG_IO_CDB; + break; + case REQUEST_SENSE: + size = cdb[4]; + cmd->se_cmd_flags |= SCF_SCSI_CONTROL_NONSG_IO_CDB; + break; + case READ_ELEMENT_STATUS: + size = 65536 * cdb[7] + 256 * cdb[8] + cdb[9]; + cmd->se_cmd_flags |= SCF_SCSI_CONTROL_NONSG_IO_CDB; + break; + case WRITE_BUFFER: + size = (cdb[6] << 16) + (cdb[7] << 8) + cdb[8]; + cmd->se_cmd_flags |= SCF_SCSI_CONTROL_NONSG_IO_CDB; + break; + case RESERVE: + case RESERVE_10: + /* + * The SPC-2 RESERVE does not contain a size in the SCSI CDB. + * Assume the passthrough or $FABRIC_MOD will tell us about it. + */ + if (cdb[0] == RESERVE_10) + size = (cdb[7] << 8) | cdb[8]; + else + size = cmd->data_length; + + /* + * Setup the legacy emulated handler for SPC-2 and + * >= SPC-3 compatible reservation handling (CRH=1) + * Otherwise, we assume the underlying SCSI logic is + * is running in SPC_PASSTHROUGH, and wants reservations + * emulation disabled. + */ + cmd->transport_emulate_cdb = + (T10_RES(su_dev)->res_type != + SPC_PASSTHROUGH) ? + &core_scsi2_emulate_crh : NULL; + cmd->se_cmd_flags |= SCF_SCSI_NON_DATA_CDB; + break; + case RELEASE: + case RELEASE_10: + /* + * The SPC-2 RELEASE does not contain a size in the SCSI CDB. + * Assume the passthrough or $FABRIC_MOD will tell us about it. + */ + if (cdb[0] == RELEASE_10) + size = (cdb[7] << 8) | cdb[8]; + else + size = cmd->data_length; + + cmd->transport_emulate_cdb = + (T10_RES(su_dev)->res_type != + SPC_PASSTHROUGH) ? + &core_scsi2_emulate_crh : NULL; + cmd->se_cmd_flags |= SCF_SCSI_NON_DATA_CDB; + break; + case SYNCHRONIZE_CACHE: + case 0x91: /* SYNCHRONIZE_CACHE_16: */ + /* + * Extract LBA and range to be flushed for emulated SYNCHRONIZE_CACHE + */ + if (cdb[0] == SYNCHRONIZE_CACHE) { + sectors = transport_get_sectors_10(cdb, cmd, §or_ret); + T_TASK(cmd)->t_task_lba = transport_lba_32(cdb); + } else { + sectors = transport_get_sectors_16(cdb, cmd, §or_ret); + T_TASK(cmd)->t_task_lba = transport_lba_64(cdb); + } + if (sector_ret) + goto out_unsupported_cdb; + + size = transport_get_size(sectors, cdb, cmd); + cmd->se_cmd_flags |= SCF_SCSI_NON_DATA_CDB; + + /* + * For TCM/pSCSI passthrough, skip cmd->transport_emulate_cdb() + */ + if (TRANSPORT(dev)->transport_type == TRANSPORT_PLUGIN_PHBA_PDEV) + break; + /* + * Set SCF_EMULATE_CDB_ASYNC to ensure asynchronous operation + * for SYNCHRONIZE_CACHE* Immed=1 case in __transport_execute_tasks() + */ + cmd->se_cmd_flags |= SCF_EMULATE_CDB_ASYNC; + /* + * Check to ensure that LBA + Range does not exceed past end of + * device. + */ + if (transport_get_sectors(cmd) < 0) + goto out_invalid_cdb_field; + break; + case UNMAP: + size = get_unaligned_be16(&cdb[7]); + passthrough = (TRANSPORT(dev)->transport_type == + TRANSPORT_PLUGIN_PHBA_PDEV); + /* + * Determine if the received UNMAP used to for direct passthrough + * into Linux/SCSI with struct request via TCM/pSCSI or we are + * signaling the use of internal transport_generic_unmap() emulation + * for UNMAP -> Linux/BLOCK disbard with TCM/IBLOCK and TCM/FILEIO + * subsystem plugin backstores. + */ + if (!(passthrough)) + cmd->se_cmd_flags |= SCF_EMULATE_SYNC_UNMAP; + + cmd->se_cmd_flags |= SCF_SCSI_CONTROL_NONSG_IO_CDB; + break; + case WRITE_SAME_16: + sectors = transport_get_sectors_16(cdb, cmd, §or_ret); + if (sector_ret) + goto out_unsupported_cdb; + size = transport_get_size(sectors, cdb, cmd); + T_TASK(cmd)->t_task_lba = get_unaligned_be16(&cdb[2]); + passthrough = (TRANSPORT(dev)->transport_type == + TRANSPORT_PLUGIN_PHBA_PDEV); + /* + * Determine if the received WRITE_SAME_16 is used to for direct + * passthrough into Linux/SCSI with struct request via TCM/pSCSI + * or we are signaling the use of internal WRITE_SAME + UNMAP=1 + * emulation for -> Linux/BLOCK disbard with TCM/IBLOCK and + * TCM/FILEIO subsystem plugin backstores. + */ + if (!(passthrough)) { + if ((cdb[1] & 0x04) || (cdb[1] & 0x02)) { + printk(KERN_ERR "WRITE_SAME PBDATA and LBDATA" + " bits not supported for Block Discard" + " Emulation\n"); + goto out_invalid_cdb_field; + } + /* + * Currently for the emulated case we only accept + * tpws with the UNMAP=1 bit set. + */ + if (!(cdb[1] & 0x08)) { + printk(KERN_ERR "WRITE_SAME w/o UNMAP bit not " + " supported for Block Discard Emulation\n"); + goto out_invalid_cdb_field; + } + } + cmd->se_cmd_flags |= SCF_SCSI_CONTROL_SG_IO_CDB; + break; + case ALLOW_MEDIUM_REMOVAL: + case GPCMD_CLOSE_TRACK: + case ERASE: + case INITIALIZE_ELEMENT_STATUS: + case GPCMD_LOAD_UNLOAD: + case REZERO_UNIT: + case SEEK_10: + case GPCMD_SET_SPEED: + case SPACE: + case START_STOP: + case TEST_UNIT_READY: + case VERIFY: + case WRITE_FILEMARKS: + case MOVE_MEDIUM: + cmd->se_cmd_flags |= SCF_SCSI_NON_DATA_CDB; + break; + case REPORT_LUNS: + cmd->transport_emulate_cdb = + &transport_core_report_lun_response; + size = (cdb[6] << 24) | (cdb[7] << 16) | (cdb[8] << 8) | cdb[9]; + /* + * Do implict HEAD_OF_QUEUE processing for REPORT_LUNS + * See spc4r17 section 5.3 + */ + if (SE_DEV(cmd)->dev_task_attr_type == SAM_TASK_ATTR_EMULATED) + cmd->sam_task_attr = TASK_ATTR_HOQ; + cmd->se_cmd_flags |= SCF_SCSI_CONTROL_NONSG_IO_CDB; + break; + default: + printk(KERN_WARNING "TARGET_CORE[%s]: Unsupported SCSI Opcode" + " 0x%02x, sending CHECK_CONDITION.\n", + CMD_TFO(cmd)->get_fabric_name(), cdb[0]); + cmd->transport_wait_for_tasks = &transport_nop_wait_for_tasks; + goto out_unsupported_cdb; + } + + if (size != cmd->data_length) { + printk(KERN_WARNING "TARGET_CORE[%s]: Expected Transfer Length:" + " %u does not match SCSI CDB Length: %u for SAM Opcode:" + " 0x%02x\n", CMD_TFO(cmd)->get_fabric_name(), + cmd->data_length, size, cdb[0]); + + cmd->cmd_spdtl = size; + + if (cmd->data_direction == DMA_TO_DEVICE) { + printk(KERN_ERR "Rejecting underflow/overflow" + " WRITE data\n"); + goto out_invalid_cdb_field; + } + /* + * Reject READ_* or WRITE_* with overflow/underflow for + * type SCF_SCSI_DATA_SG_IO_CDB. + */ + if (!(ret) && (DEV_ATTRIB(dev)->block_size != 512)) { + printk(KERN_ERR "Failing OVERFLOW/UNDERFLOW for LBA op" + " CDB on non 512-byte sector setup subsystem" + " plugin: %s\n", TRANSPORT(dev)->name); + /* Returns CHECK_CONDITION + INVALID_CDB_FIELD */ + goto out_invalid_cdb_field; + } + + if (size > cmd->data_length) { + cmd->se_cmd_flags |= SCF_OVERFLOW_BIT; + cmd->residual_count = (size - cmd->data_length); + } else { + cmd->se_cmd_flags |= SCF_UNDERFLOW_BIT; + cmd->residual_count = (cmd->data_length - size); + } + cmd->data_length = size; + } + + transport_set_supported_SAM_opcode(cmd); + return ret; + +out_unsupported_cdb: + cmd->se_cmd_flags |= SCF_SCSI_CDB_EXCEPTION; + cmd->scsi_sense_reason = TCM_UNSUPPORTED_SCSI_OPCODE; + return -2; +out_invalid_cdb_field: + cmd->se_cmd_flags |= SCF_SCSI_CDB_EXCEPTION; + cmd->scsi_sense_reason = TCM_INVALID_CDB_FIELD; + return -2; +} + +static inline void transport_release_tasks(struct se_cmd *); + +/* + * This function will copy a contiguous *src buffer into a destination + * struct scatterlist array. + */ +static void transport_memcpy_write_contig( + struct se_cmd *cmd, + struct scatterlist *sg_d, + unsigned char *src) +{ + u32 i = 0, length = 0, total_length = cmd->data_length; + void *dst; + + while (total_length) { + length = sg_d[i].length; + + if (length > total_length) + length = total_length; + + dst = sg_virt(&sg_d[i]); + + memcpy(dst, src, length); + + if (!(total_length -= length)) + return; + + src += length; + i++; + } +} + +/* + * This function will copy a struct scatterlist array *sg_s into a destination + * contiguous *dst buffer. + */ +static void transport_memcpy_read_contig( + struct se_cmd *cmd, + unsigned char *dst, + struct scatterlist *sg_s) +{ + u32 i = 0, length = 0, total_length = cmd->data_length; + void *src; + + while (total_length) { + length = sg_s[i].length; + + if (length > total_length) + length = total_length; + + src = sg_virt(&sg_s[i]); + + memcpy(dst, src, length); + + if (!(total_length -= length)) + return; + + dst += length; + i++; + } +} + +static void transport_memcpy_se_mem_read_contig( + struct se_cmd *cmd, + unsigned char *dst, + struct list_head *se_mem_list) +{ + struct se_mem *se_mem; + void *src; + u32 length = 0, total_length = cmd->data_length; + + list_for_each_entry(se_mem, se_mem_list, se_list) { + length = se_mem->se_len; + + if (length > total_length) + length = total_length; + + src = page_address(se_mem->se_page) + se_mem->se_off; + + memcpy(dst, src, length); + + if (!(total_length -= length)) + return; + + dst += length; + } +} + +/* + * Called from transport_generic_complete_ok() and + * transport_generic_request_failure() to determine which dormant/delayed + * and ordered cmds need to have their tasks added to the execution queue. + */ +static void transport_complete_task_attr(struct se_cmd *cmd) +{ + struct se_device *dev = SE_DEV(cmd); + struct se_cmd *cmd_p, *cmd_tmp; + int new_active_tasks = 0; + + if (cmd->sam_task_attr == TASK_ATTR_SIMPLE) { + atomic_dec(&dev->simple_cmds); + smp_mb__after_atomic_dec(); + dev->dev_cur_ordered_id++; + DEBUG_STA("Incremented dev->dev_cur_ordered_id: %u for" + " SIMPLE: %u\n", dev->dev_cur_ordered_id, + cmd->se_ordered_id); + } else if (cmd->sam_task_attr == TASK_ATTR_HOQ) { + atomic_dec(&dev->dev_hoq_count); + smp_mb__after_atomic_dec(); + dev->dev_cur_ordered_id++; + DEBUG_STA("Incremented dev_cur_ordered_id: %u for" + " HEAD_OF_QUEUE: %u\n", dev->dev_cur_ordered_id, + cmd->se_ordered_id); + } else if (cmd->sam_task_attr == TASK_ATTR_ORDERED) { + spin_lock(&dev->ordered_cmd_lock); + list_del(&cmd->se_ordered_list); + atomic_dec(&dev->dev_ordered_sync); + smp_mb__after_atomic_dec(); + spin_unlock(&dev->ordered_cmd_lock); + + dev->dev_cur_ordered_id++; + DEBUG_STA("Incremented dev_cur_ordered_id: %u for ORDERED:" + " %u\n", dev->dev_cur_ordered_id, cmd->se_ordered_id); + } + /* + * Process all commands up to the last received + * ORDERED task attribute which requires another blocking + * boundary + */ + spin_lock(&dev->delayed_cmd_lock); + list_for_each_entry_safe(cmd_p, cmd_tmp, + &dev->delayed_cmd_list, se_delayed_list) { + + list_del(&cmd_p->se_delayed_list); + spin_unlock(&dev->delayed_cmd_lock); + + DEBUG_STA("Calling add_tasks() for" + " cmd_p: 0x%02x Task Attr: 0x%02x" + " Dormant -> Active, se_ordered_id: %u\n", + T_TASK(cmd_p)->t_task_cdb[0], + cmd_p->sam_task_attr, cmd_p->se_ordered_id); + + transport_add_tasks_from_cmd(cmd_p); + new_active_tasks++; + + spin_lock(&dev->delayed_cmd_lock); + if (cmd_p->sam_task_attr == TASK_ATTR_ORDERED) + break; + } + spin_unlock(&dev->delayed_cmd_lock); + /* + * If new tasks have become active, wake up the transport thread + * to do the processing of the Active tasks. + */ + if (new_active_tasks != 0) + wake_up_interruptible(&dev->dev_queue_obj->thread_wq); +} + +static void transport_generic_complete_ok(struct se_cmd *cmd) +{ + int reason = 0; + /* + * Check if we need to move delayed/dormant tasks from cmds on the + * delayed execution list after a HEAD_OF_QUEUE or ORDERED Task + * Attribute. + */ + if (SE_DEV(cmd)->dev_task_attr_type == SAM_TASK_ATTR_EMULATED) + transport_complete_task_attr(cmd); + /* + * Check if we need to retrieve a sense buffer from + * the struct se_cmd in question. + */ + if (cmd->se_cmd_flags & SCF_TRANSPORT_TASK_SENSE) { + if (transport_get_sense_data(cmd) < 0) + reason = TCM_NON_EXISTENT_LUN; + + /* + * Only set when an struct se_task->task_scsi_status returned + * a non GOOD status. + */ + if (cmd->scsi_status) { + transport_send_check_condition_and_sense( + cmd, reason, 1); + transport_lun_remove_cmd(cmd); + transport_cmd_check_stop_to_fabric(cmd); + return; + } + } + /* + * Check for a callback, used by amoungst other things + * XDWRITE_READ_10 emulation. + */ + if (cmd->transport_complete_callback) + cmd->transport_complete_callback(cmd); + + switch (cmd->data_direction) { + case DMA_FROM_DEVICE: + spin_lock(&cmd->se_lun->lun_sep_lock); + if (SE_LUN(cmd)->lun_sep) { + SE_LUN(cmd)->lun_sep->sep_stats.tx_data_octets += + cmd->data_length; + } + spin_unlock(&cmd->se_lun->lun_sep_lock); + /* + * If enabled by TCM fabirc module pre-registered SGL + * memory, perform the memcpy() from the TCM internal + * contigious buffer back to the original SGL. + */ + if (cmd->se_cmd_flags & SCF_PASSTHROUGH_CONTIG_TO_SG) + transport_memcpy_write_contig(cmd, + T_TASK(cmd)->t_task_pt_sgl, + T_TASK(cmd)->t_task_buf); + + CMD_TFO(cmd)->queue_data_in(cmd); + break; + case DMA_TO_DEVICE: + spin_lock(&cmd->se_lun->lun_sep_lock); + if (SE_LUN(cmd)->lun_sep) { + SE_LUN(cmd)->lun_sep->sep_stats.rx_data_octets += + cmd->data_length; + } + spin_unlock(&cmd->se_lun->lun_sep_lock); + /* + * Check if we need to send READ payload for BIDI-COMMAND + */ + if (T_TASK(cmd)->t_mem_bidi_list != NULL) { + spin_lock(&cmd->se_lun->lun_sep_lock); + if (SE_LUN(cmd)->lun_sep) { + SE_LUN(cmd)->lun_sep->sep_stats.tx_data_octets += + cmd->data_length; + } + spin_unlock(&cmd->se_lun->lun_sep_lock); + CMD_TFO(cmd)->queue_data_in(cmd); + break; + } + /* Fall through for DMA_TO_DEVICE */ + case DMA_NONE: + CMD_TFO(cmd)->queue_status(cmd); + break; + default: + break; + } + + transport_lun_remove_cmd(cmd); + transport_cmd_check_stop_to_fabric(cmd); +} + +static void transport_free_dev_tasks(struct se_cmd *cmd) +{ + struct se_task *task, *task_tmp; + unsigned long flags; + + spin_lock_irqsave(&T_TASK(cmd)->t_state_lock, flags); + list_for_each_entry_safe(task, task_tmp, + &T_TASK(cmd)->t_task_list, t_list) { + if (atomic_read(&task->task_active)) + continue; + + kfree(task->task_sg_bidi); + kfree(task->task_sg); + + list_del(&task->t_list); + + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, flags); + if (task->se_dev) + TRANSPORT(task->se_dev)->free_task(task); + else + printk(KERN_ERR "task[%u] - task->se_dev is NULL\n", + task->task_no); + spin_lock_irqsave(&T_TASK(cmd)->t_state_lock, flags); + } + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, flags); +} + +static inline void transport_free_pages(struct se_cmd *cmd) +{ + struct se_mem *se_mem, *se_mem_tmp; + int free_page = 1; + + if (cmd->se_cmd_flags & SCF_PASSTHROUGH_SG_TO_MEM_NOALLOC) + free_page = 0; + if (cmd->se_dev->transport->do_se_mem_map) + free_page = 0; + + if (T_TASK(cmd)->t_task_buf) { + kfree(T_TASK(cmd)->t_task_buf); + T_TASK(cmd)->t_task_buf = NULL; + return; + } + + /* + * Caller will handle releasing of struct se_mem. + */ + if (cmd->se_cmd_flags & SCF_CMD_PASSTHROUGH_NOALLOC) + return; + + if (!(T_TASK(cmd)->t_tasks_se_num)) + return; + + list_for_each_entry_safe(se_mem, se_mem_tmp, + T_TASK(cmd)->t_mem_list, se_list) { + /* + * We only release call __free_page(struct se_mem->se_page) when + * SCF_PASSTHROUGH_SG_TO_MEM_NOALLOC is NOT in use, + */ + if (free_page) + __free_page(se_mem->se_page); + + list_del(&se_mem->se_list); + kmem_cache_free(se_mem_cache, se_mem); + } + + if (T_TASK(cmd)->t_mem_bidi_list && T_TASK(cmd)->t_tasks_se_bidi_num) { + list_for_each_entry_safe(se_mem, se_mem_tmp, + T_TASK(cmd)->t_mem_bidi_list, se_list) { + /* + * We only release call __free_page(struct se_mem->se_page) when + * SCF_PASSTHROUGH_SG_TO_MEM_NOALLOC is NOT in use, + */ + if (free_page) + __free_page(se_mem->se_page); + + list_del(&se_mem->se_list); + kmem_cache_free(se_mem_cache, se_mem); + } + } + + kfree(T_TASK(cmd)->t_mem_bidi_list); + T_TASK(cmd)->t_mem_bidi_list = NULL; + kfree(T_TASK(cmd)->t_mem_list); + T_TASK(cmd)->t_mem_list = NULL; + T_TASK(cmd)->t_tasks_se_num = 0; +} + +static inline void transport_release_tasks(struct se_cmd *cmd) +{ + transport_free_dev_tasks(cmd); +} + +static inline int transport_dec_and_check(struct se_cmd *cmd) +{ + unsigned long flags; + + spin_lock_irqsave(&T_TASK(cmd)->t_state_lock, flags); + if (atomic_read(&T_TASK(cmd)->t_fe_count)) { + if (!(atomic_dec_and_test(&T_TASK(cmd)->t_fe_count))) { + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, + flags); + return 1; + } + } + + if (atomic_read(&T_TASK(cmd)->t_se_count)) { + if (!(atomic_dec_and_test(&T_TASK(cmd)->t_se_count))) { + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, + flags); + return 1; + } + } + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, flags); + + return 0; +} + +static void transport_release_fe_cmd(struct se_cmd *cmd) +{ + unsigned long flags; + + if (transport_dec_and_check(cmd)) + return; + + spin_lock_irqsave(&T_TASK(cmd)->t_state_lock, flags); + if (!(atomic_read(&T_TASK(cmd)->transport_dev_active))) { + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, flags); + goto free_pages; + } + atomic_set(&T_TASK(cmd)->transport_dev_active, 0); + transport_all_task_dev_remove_state(cmd); + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, flags); + + transport_release_tasks(cmd); +free_pages: + transport_free_pages(cmd); + transport_free_se_cmd(cmd); + CMD_TFO(cmd)->release_cmd_direct(cmd); +} + +static int transport_generic_remove( + struct se_cmd *cmd, + int release_to_pool, + int session_reinstatement) +{ + unsigned long flags; + + if (!(T_TASK(cmd))) + goto release_cmd; + + if (transport_dec_and_check(cmd)) { + if (session_reinstatement) { + spin_lock_irqsave(&T_TASK(cmd)->t_state_lock, flags); + transport_all_task_dev_remove_state(cmd); + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, + flags); + } + return 1; + } + + spin_lock_irqsave(&T_TASK(cmd)->t_state_lock, flags); + if (!(atomic_read(&T_TASK(cmd)->transport_dev_active))) { + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, flags); + goto free_pages; + } + atomic_set(&T_TASK(cmd)->transport_dev_active, 0); + transport_all_task_dev_remove_state(cmd); + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, flags); + + transport_release_tasks(cmd); +free_pages: + transport_free_pages(cmd); + +release_cmd: + if (release_to_pool) { + transport_release_cmd_to_pool(cmd); + } else { + transport_free_se_cmd(cmd); + CMD_TFO(cmd)->release_cmd_direct(cmd); + } + + return 0; +} + +/* + * transport_generic_map_mem_to_cmd - Perform SGL -> struct se_mem map + * @cmd: Associated se_cmd descriptor + * @mem: SGL style memory for TCM WRITE / READ + * @sg_mem_num: Number of SGL elements + * @mem_bidi_in: SGL style memory for TCM BIDI READ + * @sg_mem_bidi_num: Number of BIDI READ SGL elements + * + * Return: nonzero return cmd was rejected for -ENOMEM or inproper usage + * of parameters. + */ +int transport_generic_map_mem_to_cmd( + struct se_cmd *cmd, + struct scatterlist *mem, + u32 sg_mem_num, + struct scatterlist *mem_bidi_in, + u32 sg_mem_bidi_num) +{ + u32 se_mem_cnt_out = 0; + int ret; + + if (!(mem) || !(sg_mem_num)) + return 0; + /* + * Passed *mem will contain a list_head containing preformatted + * struct se_mem elements... + */ + if (!(cmd->se_cmd_flags & SCF_PASSTHROUGH_SG_TO_MEM)) { + if ((mem_bidi_in) || (sg_mem_bidi_num)) { + printk(KERN_ERR "SCF_CMD_PASSTHROUGH_NOALLOC not supported" + " with BIDI-COMMAND\n"); + return -ENOSYS; + } + + T_TASK(cmd)->t_mem_list = (struct list_head *)mem; + T_TASK(cmd)->t_tasks_se_num = sg_mem_num; + cmd->se_cmd_flags |= SCF_CMD_PASSTHROUGH_NOALLOC; + return 0; + } + /* + * Otherwise, assume the caller is passing a struct scatterlist + * array from include/linux/scatterlist.h + */ + if ((cmd->se_cmd_flags & SCF_SCSI_DATA_SG_IO_CDB) || + (cmd->se_cmd_flags & SCF_SCSI_CONTROL_SG_IO_CDB)) { + /* + * For CDB using TCM struct se_mem linked list scatterlist memory + * processed into a TCM struct se_subsystem_dev, we do the mapping + * from the passed physical memory to struct se_mem->se_page here. + */ + T_TASK(cmd)->t_mem_list = transport_init_se_mem_list(); + if (!(T_TASK(cmd)->t_mem_list)) + return -ENOMEM; + + ret = transport_map_sg_to_mem(cmd, + T_TASK(cmd)->t_mem_list, mem, &se_mem_cnt_out); + if (ret < 0) + return -ENOMEM; + + T_TASK(cmd)->t_tasks_se_num = se_mem_cnt_out; + /* + * Setup BIDI READ list of struct se_mem elements + */ + if ((mem_bidi_in) && (sg_mem_bidi_num)) { + T_TASK(cmd)->t_mem_bidi_list = transport_init_se_mem_list(); + if (!(T_TASK(cmd)->t_mem_bidi_list)) { + kfree(T_TASK(cmd)->t_mem_list); + return -ENOMEM; + } + se_mem_cnt_out = 0; + + ret = transport_map_sg_to_mem(cmd, + T_TASK(cmd)->t_mem_bidi_list, mem_bidi_in, + &se_mem_cnt_out); + if (ret < 0) { + kfree(T_TASK(cmd)->t_mem_list); + return -ENOMEM; + } + + T_TASK(cmd)->t_tasks_se_bidi_num = se_mem_cnt_out; + } + cmd->se_cmd_flags |= SCF_PASSTHROUGH_SG_TO_MEM_NOALLOC; + + } else if (cmd->se_cmd_flags & SCF_SCSI_CONTROL_NONSG_IO_CDB) { + if (mem_bidi_in || sg_mem_bidi_num) { + printk(KERN_ERR "BIDI-Commands not supported using " + "SCF_SCSI_CONTROL_NONSG_IO_CDB\n"); + return -ENOSYS; + } + /* + * For incoming CDBs using a contiguous buffer internall with TCM, + * save the passed struct scatterlist memory. After TCM storage object + * processing has completed for this struct se_cmd, TCM core will call + * transport_memcpy_[write,read]_contig() as necessary from + * transport_generic_complete_ok() and transport_write_pending() in order + * to copy the TCM buffer to/from the original passed *mem in SGL -> + * struct scatterlist format. + */ + cmd->se_cmd_flags |= SCF_PASSTHROUGH_CONTIG_TO_SG; + T_TASK(cmd)->t_task_pt_sgl = mem; + } + + return 0; +} +EXPORT_SYMBOL(transport_generic_map_mem_to_cmd); + + +static inline long long transport_dev_end_lba(struct se_device *dev) +{ + return dev->transport->get_blocks(dev) + 1; +} + +static int transport_get_sectors(struct se_cmd *cmd) +{ + struct se_device *dev = SE_DEV(cmd); + + T_TASK(cmd)->t_tasks_sectors = + (cmd->data_length / DEV_ATTRIB(dev)->block_size); + if (!(T_TASK(cmd)->t_tasks_sectors)) + T_TASK(cmd)->t_tasks_sectors = 1; + + if (TRANSPORT(dev)->get_device_type(dev) != TYPE_DISK) + return 0; + + if ((T_TASK(cmd)->t_task_lba + T_TASK(cmd)->t_tasks_sectors) > + transport_dev_end_lba(dev)) { + printk(KERN_ERR "LBA: %llu Sectors: %u exceeds" + " transport_dev_end_lba(): %llu\n", + T_TASK(cmd)->t_task_lba, T_TASK(cmd)->t_tasks_sectors, + transport_dev_end_lba(dev)); + cmd->se_cmd_flags |= SCF_SCSI_CDB_EXCEPTION; + cmd->scsi_sense_reason = TCM_SECTOR_COUNT_TOO_MANY; + return PYX_TRANSPORT_REQ_TOO_MANY_SECTORS; + } + + return 0; +} + +static int transport_new_cmd_obj(struct se_cmd *cmd) +{ + struct se_device *dev = SE_DEV(cmd); + u32 task_cdbs = 0, rc; + + if (!(cmd->se_cmd_flags & SCF_SCSI_DATA_SG_IO_CDB)) { + task_cdbs++; + T_TASK(cmd)->t_task_cdbs++; + } else { + int set_counts = 1; + + /* + * Setup any BIDI READ tasks and memory from + * T_TASK(cmd)->t_mem_bidi_list so the READ struct se_tasks + * are queued first for the non pSCSI passthrough case. + */ + if ((T_TASK(cmd)->t_mem_bidi_list != NULL) && + (TRANSPORT(dev)->transport_type != TRANSPORT_PLUGIN_PHBA_PDEV)) { + rc = transport_generic_get_cdb_count(cmd, + T_TASK(cmd)->t_task_lba, + T_TASK(cmd)->t_tasks_sectors, + DMA_FROM_DEVICE, T_TASK(cmd)->t_mem_bidi_list, + set_counts); + if (!(rc)) { + cmd->se_cmd_flags |= SCF_SCSI_CDB_EXCEPTION; + cmd->scsi_sense_reason = + TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE; + return PYX_TRANSPORT_LU_COMM_FAILURE; + } + set_counts = 0; + } + /* + * Setup the tasks and memory from T_TASK(cmd)->t_mem_list + * Note for BIDI transfers this will contain the WRITE payload + */ + task_cdbs = transport_generic_get_cdb_count(cmd, + T_TASK(cmd)->t_task_lba, + T_TASK(cmd)->t_tasks_sectors, + cmd->data_direction, T_TASK(cmd)->t_mem_list, + set_counts); + if (!(task_cdbs)) { + cmd->se_cmd_flags |= SCF_SCSI_CDB_EXCEPTION; + cmd->scsi_sense_reason = + TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE; + return PYX_TRANSPORT_LU_COMM_FAILURE; + } + T_TASK(cmd)->t_task_cdbs += task_cdbs; + +#if 0 + printk(KERN_INFO "data_length: %u, LBA: %llu t_tasks_sectors:" + " %u, t_task_cdbs: %u\n", obj_ptr, cmd->data_length, + T_TASK(cmd)->t_task_lba, T_TASK(cmd)->t_tasks_sectors, + T_TASK(cmd)->t_task_cdbs); +#endif + } + + atomic_set(&T_TASK(cmd)->t_task_cdbs_left, task_cdbs); + atomic_set(&T_TASK(cmd)->t_task_cdbs_ex_left, task_cdbs); + atomic_set(&T_TASK(cmd)->t_task_cdbs_timeout_left, task_cdbs); + return 0; +} + +static struct list_head *transport_init_se_mem_list(void) +{ + struct list_head *se_mem_list; + + se_mem_list = kzalloc(sizeof(struct list_head), GFP_KERNEL); + if (!(se_mem_list)) { + printk(KERN_ERR "Unable to allocate memory for se_mem_list\n"); + return NULL; + } + INIT_LIST_HEAD(se_mem_list); + + return se_mem_list; +} + +static int +transport_generic_get_mem(struct se_cmd *cmd, u32 length, u32 dma_size) +{ + unsigned char *buf; + struct se_mem *se_mem; + + T_TASK(cmd)->t_mem_list = transport_init_se_mem_list(); + if (!(T_TASK(cmd)->t_mem_list)) + return -ENOMEM; + + /* + * If the device uses memory mapping this is enough. + */ + if (cmd->se_dev->transport->do_se_mem_map) + return 0; + + /* + * Setup BIDI-COMMAND READ list of struct se_mem elements + */ + if (T_TASK(cmd)->t_tasks_bidi) { + T_TASK(cmd)->t_mem_bidi_list = transport_init_se_mem_list(); + if (!(T_TASK(cmd)->t_mem_bidi_list)) { + kfree(T_TASK(cmd)->t_mem_list); + return -ENOMEM; + } + } + + while (length) { + se_mem = kmem_cache_zalloc(se_mem_cache, GFP_KERNEL); + if (!(se_mem)) { + printk(KERN_ERR "Unable to allocate struct se_mem\n"); + goto out; + } + INIT_LIST_HEAD(&se_mem->se_list); + se_mem->se_len = (length > dma_size) ? dma_size : length; + +/* #warning FIXME Allocate contigous pages for struct se_mem elements */ + se_mem->se_page = (struct page *) alloc_pages(GFP_KERNEL, 0); + if (!(se_mem->se_page)) { + printk(KERN_ERR "alloc_pages() failed\n"); + goto out; + } + + buf = kmap_atomic(se_mem->se_page, KM_IRQ0); + if (!(buf)) { + printk(KERN_ERR "kmap_atomic() failed\n"); + goto out; + } + memset(buf, 0, se_mem->se_len); + kunmap_atomic(buf, KM_IRQ0); + + list_add_tail(&se_mem->se_list, T_TASK(cmd)->t_mem_list); + T_TASK(cmd)->t_tasks_se_num++; + + DEBUG_MEM("Allocated struct se_mem page(%p) Length(%u)" + " Offset(%u)\n", se_mem->se_page, se_mem->se_len, + se_mem->se_off); + + length -= se_mem->se_len; + } + + DEBUG_MEM("Allocated total struct se_mem elements(%u)\n", + T_TASK(cmd)->t_tasks_se_num); + + return 0; +out: + return -1; +} + +extern u32 transport_calc_sg_num( + struct se_task *task, + struct se_mem *in_se_mem, + u32 task_offset) +{ + struct se_cmd *se_cmd = task->task_se_cmd; + struct se_device *se_dev = SE_DEV(se_cmd); + struct se_mem *se_mem = in_se_mem; + struct target_core_fabric_ops *tfo = CMD_TFO(se_cmd); + u32 sg_length, task_size = task->task_size, task_sg_num_padded; + + while (task_size != 0) { + DEBUG_SC("se_mem->se_page(%p) se_mem->se_len(%u)" + " se_mem->se_off(%u) task_offset(%u)\n", + se_mem->se_page, se_mem->se_len, + se_mem->se_off, task_offset); + + if (task_offset == 0) { + if (task_size >= se_mem->se_len) { + sg_length = se_mem->se_len; + + if (!(list_is_last(&se_mem->se_list, + T_TASK(se_cmd)->t_mem_list))) + se_mem = list_entry(se_mem->se_list.next, + struct se_mem, se_list); + } else { + sg_length = task_size; + task_size -= sg_length; + goto next; + } + + DEBUG_SC("sg_length(%u) task_size(%u)\n", + sg_length, task_size); + } else { + if ((se_mem->se_len - task_offset) > task_size) { + sg_length = task_size; + task_size -= sg_length; + goto next; + } else { + sg_length = (se_mem->se_len - task_offset); + + if (!(list_is_last(&se_mem->se_list, + T_TASK(se_cmd)->t_mem_list))) + se_mem = list_entry(se_mem->se_list.next, + struct se_mem, se_list); + } + + DEBUG_SC("sg_length(%u) task_size(%u)\n", + sg_length, task_size); + + task_offset = 0; + } + task_size -= sg_length; +next: + DEBUG_SC("task[%u] - Reducing task_size to(%u)\n", + task->task_no, task_size); + + task->task_sg_num++; + } + /* + * Check if the fabric module driver is requesting that all + * struct se_task->task_sg[] be chained together.. If so, + * then allocate an extra padding SG entry for linking and + * marking the end of the chained SGL. + */ + if (tfo->task_sg_chaining) { + task_sg_num_padded = (task->task_sg_num + 1); + task->task_padded_sg = 1; + } else + task_sg_num_padded = task->task_sg_num; + + task->task_sg = kzalloc(task_sg_num_padded * + sizeof(struct scatterlist), GFP_KERNEL); + if (!(task->task_sg)) { + printk(KERN_ERR "Unable to allocate memory for" + " task->task_sg\n"); + return 0; + } + sg_init_table(&task->task_sg[0], task_sg_num_padded); + /* + * Setup task->task_sg_bidi for SCSI READ payload for + * TCM/pSCSI passthrough if present for BIDI-COMMAND + */ + if ((T_TASK(se_cmd)->t_mem_bidi_list != NULL) && + (TRANSPORT(se_dev)->transport_type == TRANSPORT_PLUGIN_PHBA_PDEV)) { + task->task_sg_bidi = kzalloc(task_sg_num_padded * + sizeof(struct scatterlist), GFP_KERNEL); + if (!(task->task_sg_bidi)) { + printk(KERN_ERR "Unable to allocate memory for" + " task->task_sg_bidi\n"); + return 0; + } + sg_init_table(&task->task_sg_bidi[0], task_sg_num_padded); + } + /* + * For the chaining case, setup the proper end of SGL for the + * initial submission struct task into struct se_subsystem_api. + * This will be cleared later by transport_do_task_sg_chain() + */ + if (task->task_padded_sg) { + sg_mark_end(&task->task_sg[task->task_sg_num - 1]); + /* + * Added the 'if' check before marking end of bi-directional + * scatterlist (which gets created only in case of request + * (RD + WR). + */ + if (task->task_sg_bidi) + sg_mark_end(&task->task_sg_bidi[task->task_sg_num - 1]); + } + + DEBUG_SC("Successfully allocated task->task_sg_num(%u)," + " task_sg_num_padded(%u)\n", task->task_sg_num, + task_sg_num_padded); + + return task->task_sg_num; +} + +static inline int transport_set_tasks_sectors_disk( + struct se_task *task, + struct se_device *dev, + unsigned long long lba, + u32 sectors, + int *max_sectors_set) +{ + if ((lba + sectors) > transport_dev_end_lba(dev)) { + task->task_sectors = ((transport_dev_end_lba(dev) - lba) + 1); + + if (task->task_sectors > DEV_ATTRIB(dev)->max_sectors) { + task->task_sectors = DEV_ATTRIB(dev)->max_sectors; + *max_sectors_set = 1; + } + } else { + if (sectors > DEV_ATTRIB(dev)->max_sectors) { + task->task_sectors = DEV_ATTRIB(dev)->max_sectors; + *max_sectors_set = 1; + } else + task->task_sectors = sectors; + } + + return 0; +} + +static inline int transport_set_tasks_sectors_non_disk( + struct se_task *task, + struct se_device *dev, + unsigned long long lba, + u32 sectors, + int *max_sectors_set) +{ + if (sectors > DEV_ATTRIB(dev)->max_sectors) { + task->task_sectors = DEV_ATTRIB(dev)->max_sectors; + *max_sectors_set = 1; + } else + task->task_sectors = sectors; + + return 0; +} + +static inline int transport_set_tasks_sectors( + struct se_task *task, + struct se_device *dev, + unsigned long long lba, + u32 sectors, + int *max_sectors_set) +{ + return (TRANSPORT(dev)->get_device_type(dev) == TYPE_DISK) ? + transport_set_tasks_sectors_disk(task, dev, lba, sectors, + max_sectors_set) : + transport_set_tasks_sectors_non_disk(task, dev, lba, sectors, + max_sectors_set); +} + +static int transport_map_sg_to_mem( + struct se_cmd *cmd, + struct list_head *se_mem_list, + void *in_mem, + u32 *se_mem_cnt) +{ + struct se_mem *se_mem; + struct scatterlist *sg; + u32 sg_count = 1, cmd_size = cmd->data_length; + + if (!in_mem) { + printk(KERN_ERR "No source scatterlist\n"); + return -1; + } + sg = (struct scatterlist *)in_mem; + + while (cmd_size) { + se_mem = kmem_cache_zalloc(se_mem_cache, GFP_KERNEL); + if (!(se_mem)) { + printk(KERN_ERR "Unable to allocate struct se_mem\n"); + return -1; + } + INIT_LIST_HEAD(&se_mem->se_list); + DEBUG_MEM("sg_to_mem: Starting loop with cmd_size: %u" + " sg_page: %p offset: %d length: %d\n", cmd_size, + sg_page(sg), sg->offset, sg->length); + + se_mem->se_page = sg_page(sg); + se_mem->se_off = sg->offset; + + if (cmd_size > sg->length) { + se_mem->se_len = sg->length; + sg = sg_next(sg); + sg_count++; + } else + se_mem->se_len = cmd_size; + + cmd_size -= se_mem->se_len; + + DEBUG_MEM("sg_to_mem: *se_mem_cnt: %u cmd_size: %u\n", + *se_mem_cnt, cmd_size); + DEBUG_MEM("sg_to_mem: Final se_page: %p se_off: %d se_len: %d\n", + se_mem->se_page, se_mem->se_off, se_mem->se_len); + + list_add_tail(&se_mem->se_list, se_mem_list); + (*se_mem_cnt)++; + } + + DEBUG_MEM("task[0] - Mapped(%u) struct scatterlist segments to(%u)" + " struct se_mem\n", sg_count, *se_mem_cnt); + + if (sg_count != *se_mem_cnt) + BUG(); + + return 0; +} + +/* transport_map_mem_to_sg(): + * + * + */ +int transport_map_mem_to_sg( + struct se_task *task, + struct list_head *se_mem_list, + void *in_mem, + struct se_mem *in_se_mem, + struct se_mem **out_se_mem, + u32 *se_mem_cnt, + u32 *task_offset) +{ + struct se_cmd *se_cmd = task->task_se_cmd; + struct se_mem *se_mem = in_se_mem; + struct scatterlist *sg = (struct scatterlist *)in_mem; + u32 task_size = task->task_size, sg_no = 0; + + if (!sg) { + printk(KERN_ERR "Unable to locate valid struct" + " scatterlist pointer\n"); + return -1; + } + + while (task_size != 0) { + /* + * Setup the contigious array of scatterlists for + * this struct se_task. + */ + sg_assign_page(sg, se_mem->se_page); + + if (*task_offset == 0) { + sg->offset = se_mem->se_off; + + if (task_size >= se_mem->se_len) { + sg->length = se_mem->se_len; + + if (!(list_is_last(&se_mem->se_list, + T_TASK(se_cmd)->t_mem_list))) { + se_mem = list_entry(se_mem->se_list.next, + struct se_mem, se_list); + (*se_mem_cnt)++; + } + } else { + sg->length = task_size; + /* + * Determine if we need to calculate an offset + * into the struct se_mem on the next go around.. + */ + task_size -= sg->length; + if (!(task_size)) + *task_offset = sg->length; + + goto next; + } + + } else { + sg->offset = (*task_offset + se_mem->se_off); + + if ((se_mem->se_len - *task_offset) > task_size) { + sg->length = task_size; + /* + * Determine if we need to calculate an offset + * into the struct se_mem on the next go around.. + */ + task_size -= sg->length; + if (!(task_size)) + *task_offset += sg->length; + + goto next; + } else { + sg->length = (se_mem->se_len - *task_offset); + + if (!(list_is_last(&se_mem->se_list, + T_TASK(se_cmd)->t_mem_list))) { + se_mem = list_entry(se_mem->se_list.next, + struct se_mem, se_list); + (*se_mem_cnt)++; + } + } + + *task_offset = 0; + } + task_size -= sg->length; +next: + DEBUG_MEM("task[%u] mem_to_sg - sg[%u](%p)(%u)(%u) - Reducing" + " task_size to(%u), task_offset: %u\n", task->task_no, sg_no, + sg_page(sg), sg->length, sg->offset, task_size, *task_offset); + + sg_no++; + if (!(task_size)) + break; + + sg = sg_next(sg); + + if (task_size > se_cmd->data_length) + BUG(); + } + *out_se_mem = se_mem; + + DEBUG_MEM("task[%u] - Mapped(%u) struct se_mem segments to total(%u)" + " SGs\n", task->task_no, *se_mem_cnt, sg_no); + + return 0; +} + +/* + * This function can be used by HW target mode drivers to create a linked + * scatterlist from all contiguously allocated struct se_task->task_sg[]. + * This is intended to be called during the completion path by TCM Core + * when struct target_core_fabric_ops->check_task_sg_chaining is enabled. + */ +void transport_do_task_sg_chain(struct se_cmd *cmd) +{ + struct scatterlist *sg_head = NULL, *sg_link = NULL, *sg_first = NULL; + struct scatterlist *sg_head_cur = NULL, *sg_link_cur = NULL; + struct scatterlist *sg, *sg_end = NULL, *sg_end_cur = NULL; + struct se_task *task; + struct target_core_fabric_ops *tfo = CMD_TFO(cmd); + u32 task_sg_num = 0, sg_count = 0; + int i; + + if (tfo->task_sg_chaining == 0) { + printk(KERN_ERR "task_sg_chaining is diabled for fabric module:" + " %s\n", tfo->get_fabric_name()); + dump_stack(); + return; + } + /* + * Walk the struct se_task list and setup scatterlist chains + * for each contiguosly allocated struct se_task->task_sg[]. + */ + list_for_each_entry(task, &T_TASK(cmd)->t_task_list, t_list) { + if (!(task->task_sg) || !(task->task_padded_sg)) + continue; + + if (sg_head && sg_link) { + sg_head_cur = &task->task_sg[0]; + sg_link_cur = &task->task_sg[task->task_sg_num]; + /* + * Either add chain or mark end of scatterlist + */ + if (!(list_is_last(&task->t_list, + &T_TASK(cmd)->t_task_list))) { + /* + * Clear existing SGL termination bit set in + * transport_calc_sg_num(), see sg_mark_end() + */ + sg_end_cur = &task->task_sg[task->task_sg_num - 1]; + sg_end_cur->page_link &= ~0x02; + + sg_chain(sg_head, task_sg_num, sg_head_cur); + sg_count += (task->task_sg_num + 1); + } else + sg_count += task->task_sg_num; + + sg_head = sg_head_cur; + sg_link = sg_link_cur; + task_sg_num = task->task_sg_num; + continue; + } + sg_head = sg_first = &task->task_sg[0]; + sg_link = &task->task_sg[task->task_sg_num]; + task_sg_num = task->task_sg_num; + /* + * Check for single task.. + */ + if (!(list_is_last(&task->t_list, &T_TASK(cmd)->t_task_list))) { + /* + * Clear existing SGL termination bit set in + * transport_calc_sg_num(), see sg_mark_end() + */ + sg_end = &task->task_sg[task->task_sg_num - 1]; + sg_end->page_link &= ~0x02; + sg_count += (task->task_sg_num + 1); + } else + sg_count += task->task_sg_num; + } + /* + * Setup the starting pointer and total t_tasks_sg_linked_no including + * padding SGs for linking and to mark the end. + */ + T_TASK(cmd)->t_tasks_sg_chained = sg_first; + T_TASK(cmd)->t_tasks_sg_chained_no = sg_count; + + DEBUG_CMD_M("Setup T_TASK(cmd)->t_tasks_sg_chained: %p and" + " t_tasks_sg_chained_no: %u\n", T_TASK(cmd)->t_tasks_sg_chained, + T_TASK(cmd)->t_tasks_sg_chained_no); + + for_each_sg(T_TASK(cmd)->t_tasks_sg_chained, sg, + T_TASK(cmd)->t_tasks_sg_chained_no, i) { + + DEBUG_CMD_M("SG: %p page: %p length: %d offset: %d\n", + sg, sg_page(sg), sg->length, sg->offset); + if (sg_is_chain(sg)) + DEBUG_CMD_M("SG: %p sg_is_chain=1\n", sg); + if (sg_is_last(sg)) + DEBUG_CMD_M("SG: %p sg_is_last=1\n", sg); + } + +} +EXPORT_SYMBOL(transport_do_task_sg_chain); + +static int transport_do_se_mem_map( + struct se_device *dev, + struct se_task *task, + struct list_head *se_mem_list, + void *in_mem, + struct se_mem *in_se_mem, + struct se_mem **out_se_mem, + u32 *se_mem_cnt, + u32 *task_offset_in) +{ + u32 task_offset = *task_offset_in; + int ret = 0; + /* + * se_subsystem_api_t->do_se_mem_map is used when internal allocation + * has been done by the transport plugin. + */ + if (TRANSPORT(dev)->do_se_mem_map) { + ret = TRANSPORT(dev)->do_se_mem_map(task, se_mem_list, + in_mem, in_se_mem, out_se_mem, se_mem_cnt, + task_offset_in); + if (ret == 0) + T_TASK(task->task_se_cmd)->t_tasks_se_num += *se_mem_cnt; + + return ret; + } + /* + * This is the normal path for all normal non BIDI and BIDI-COMMAND + * WRITE payloads.. If we need to do BIDI READ passthrough for + * TCM/pSCSI the first call to transport_do_se_mem_map -> + * transport_calc_sg_num() -> transport_map_mem_to_sg() will do the + * allocation for task->task_sg_bidi, and the subsequent call to + * transport_do_se_mem_map() from transport_generic_get_cdb_count() + */ + if (!(task->task_sg_bidi)) { + /* + * Assume default that transport plugin speaks preallocated + * scatterlists. + */ + if (!(transport_calc_sg_num(task, in_se_mem, task_offset))) + return -1; + /* + * struct se_task->task_sg now contains the struct scatterlist array. + */ + return transport_map_mem_to_sg(task, se_mem_list, task->task_sg, + in_se_mem, out_se_mem, se_mem_cnt, + task_offset_in); + } + /* + * Handle the se_mem_list -> struct task->task_sg_bidi + * memory map for the extra BIDI READ payload + */ + return transport_map_mem_to_sg(task, se_mem_list, task->task_sg_bidi, + in_se_mem, out_se_mem, se_mem_cnt, + task_offset_in); +} + +static u32 transport_generic_get_cdb_count( + struct se_cmd *cmd, + unsigned long long lba, + u32 sectors, + enum dma_data_direction data_direction, + struct list_head *mem_list, + int set_counts) +{ + unsigned char *cdb = NULL; + struct se_task *task; + struct se_mem *se_mem = NULL, *se_mem_lout = NULL; + struct se_mem *se_mem_bidi = NULL, *se_mem_bidi_lout = NULL; + struct se_device *dev = SE_DEV(cmd); + int max_sectors_set = 0, ret; + u32 task_offset_in = 0, se_mem_cnt = 0, se_mem_bidi_cnt = 0, task_cdbs = 0; + + if (!mem_list) { + printk(KERN_ERR "mem_list is NULL in transport_generic_get" + "_cdb_count()\n"); + return 0; + } + /* + * While using RAMDISK_DR backstores is the only case where + * mem_list will ever be empty at this point. + */ + if (!(list_empty(mem_list))) + se_mem = list_entry(mem_list->next, struct se_mem, se_list); + /* + * Check for extra se_mem_bidi mapping for BIDI-COMMANDs to + * struct se_task->task_sg_bidi for TCM/pSCSI passthrough operation + */ + if ((T_TASK(cmd)->t_mem_bidi_list != NULL) && + !(list_empty(T_TASK(cmd)->t_mem_bidi_list)) && + (TRANSPORT(dev)->transport_type == TRANSPORT_PLUGIN_PHBA_PDEV)) + se_mem_bidi = list_entry(T_TASK(cmd)->t_mem_bidi_list->next, + struct se_mem, se_list); + + while (sectors) { + DEBUG_VOL("ITT[0x%08x] LBA(%llu) SectorsLeft(%u) EOBJ(%llu)\n", + CMD_TFO(cmd)->get_task_tag(cmd), lba, sectors, + transport_dev_end_lba(dev)); + + task = transport_generic_get_task(cmd, data_direction); + if (!(task)) + goto out; + + transport_set_tasks_sectors(task, dev, lba, sectors, + &max_sectors_set); + + task->task_lba = lba; + lba += task->task_sectors; + sectors -= task->task_sectors; + task->task_size = (task->task_sectors * + DEV_ATTRIB(dev)->block_size); + + cdb = TRANSPORT(dev)->get_cdb(task); + if ((cdb)) { + memcpy(cdb, T_TASK(cmd)->t_task_cdb, + scsi_command_size(T_TASK(cmd)->t_task_cdb)); + cmd->transport_split_cdb(task->task_lba, + &task->task_sectors, cdb); + } + + /* + * Perform the SE OBJ plugin and/or Transport plugin specific + * mapping for T_TASK(cmd)->t_mem_list. And setup the + * task->task_sg and if necessary task->task_sg_bidi + */ + ret = transport_do_se_mem_map(dev, task, mem_list, + NULL, se_mem, &se_mem_lout, &se_mem_cnt, + &task_offset_in); + if (ret < 0) + goto out; + + se_mem = se_mem_lout; + /* + * Setup the T_TASK(cmd)->t_mem_bidi_list -> task->task_sg_bidi + * mapping for SCSI READ for BIDI-COMMAND passthrough with TCM/pSCSI + * + * Note that the first call to transport_do_se_mem_map() above will + * allocate struct se_task->task_sg_bidi in transport_do_se_mem_map() + * -> transport_calc_sg_num(), and the second here will do the + * mapping for SCSI READ for BIDI-COMMAND passthrough with TCM/pSCSI. + */ + if (task->task_sg_bidi != NULL) { + ret = transport_do_se_mem_map(dev, task, + T_TASK(cmd)->t_mem_bidi_list, NULL, + se_mem_bidi, &se_mem_bidi_lout, &se_mem_bidi_cnt, + &task_offset_in); + if (ret < 0) + goto out; + + se_mem_bidi = se_mem_bidi_lout; + } + task_cdbs++; + + DEBUG_VOL("Incremented task_cdbs(%u) task->task_sg_num(%u)\n", + task_cdbs, task->task_sg_num); + + if (max_sectors_set) { + max_sectors_set = 0; + continue; + } + + if (!sectors) + break; + } + + if (set_counts) { + atomic_inc(&T_TASK(cmd)->t_fe_count); + atomic_inc(&T_TASK(cmd)->t_se_count); + } + + DEBUG_VOL("ITT[0x%08x] total %s cdbs(%u)\n", + CMD_TFO(cmd)->get_task_tag(cmd), (data_direction == DMA_TO_DEVICE) + ? "DMA_TO_DEVICE" : "DMA_FROM_DEVICE", task_cdbs); + + return task_cdbs; +out: + return 0; +} + +static int +transport_map_control_cmd_to_task(struct se_cmd *cmd) +{ + struct se_device *dev = SE_DEV(cmd); + unsigned char *cdb; + struct se_task *task; + int ret; + + task = transport_generic_get_task(cmd, cmd->data_direction); + if (!task) + return PYX_TRANSPORT_OUT_OF_MEMORY_RESOURCES; + + cdb = TRANSPORT(dev)->get_cdb(task); + if (cdb) + memcpy(cdb, cmd->t_task->t_task_cdb, + scsi_command_size(cmd->t_task->t_task_cdb)); + + task->task_size = cmd->data_length; + task->task_sg_num = + (cmd->se_cmd_flags & SCF_SCSI_CONTROL_SG_IO_CDB) ? 1 : 0; + + atomic_inc(&cmd->t_task->t_fe_count); + atomic_inc(&cmd->t_task->t_se_count); + + if (cmd->se_cmd_flags & SCF_SCSI_CONTROL_SG_IO_CDB) { + struct se_mem *se_mem = NULL, *se_mem_lout = NULL; + u32 se_mem_cnt = 0, task_offset = 0; + + BUG_ON(list_empty(cmd->t_task->t_mem_list)); + + ret = transport_do_se_mem_map(dev, task, + cmd->t_task->t_mem_list, NULL, se_mem, + &se_mem_lout, &se_mem_cnt, &task_offset); + if (ret < 0) + return PYX_TRANSPORT_OUT_OF_MEMORY_RESOURCES; + + if (dev->transport->map_task_SG) + return dev->transport->map_task_SG(task); + return 0; + } else if (cmd->se_cmd_flags & SCF_SCSI_CONTROL_NONSG_IO_CDB) { + if (dev->transport->map_task_non_SG) + return dev->transport->map_task_non_SG(task); + return 0; + } else if (cmd->se_cmd_flags & SCF_SCSI_NON_DATA_CDB) { + if (dev->transport->cdb_none) + return dev->transport->cdb_none(task); + return 0; + } else { + BUG(); + return PYX_TRANSPORT_OUT_OF_MEMORY_RESOURCES; + } +} + +/* transport_generic_new_cmd(): Called from transport_processing_thread() + * + * Allocate storage transport resources from a set of values predefined + * by transport_generic_cmd_sequencer() from the iSCSI Target RX process. + * Any non zero return here is treated as an "out of resource' op here. + */ + /* + * Generate struct se_task(s) and/or their payloads for this CDB. + */ +static int transport_generic_new_cmd(struct se_cmd *cmd) +{ + struct se_portal_group *se_tpg; + struct se_task *task; + struct se_device *dev = SE_DEV(cmd); + int ret = 0; + + /* + * Determine is the TCM fabric module has already allocated physical + * memory, and is directly calling transport_generic_map_mem_to_cmd() + * to setup beforehand the linked list of physical memory at + * T_TASK(cmd)->t_mem_list of struct se_mem->se_page + */ + if (!(cmd->se_cmd_flags & SCF_PASSTHROUGH_SG_TO_MEM_NOALLOC)) { + ret = transport_allocate_resources(cmd); + if (ret < 0) + return ret; + } + + ret = transport_get_sectors(cmd); + if (ret < 0) + return ret; + + ret = transport_new_cmd_obj(cmd); + if (ret < 0) + return ret; + + /* + * Determine if the calling TCM fabric module is talking to + * Linux/NET via kernel sockets and needs to allocate a + * struct iovec array to complete the struct se_cmd + */ + se_tpg = SE_LUN(cmd)->lun_sep->sep_tpg; + if (TPG_TFO(se_tpg)->alloc_cmd_iovecs != NULL) { + ret = TPG_TFO(se_tpg)->alloc_cmd_iovecs(cmd); + if (ret < 0) + return PYX_TRANSPORT_OUT_OF_MEMORY_RESOURCES; + } + + if (cmd->se_cmd_flags & SCF_SCSI_DATA_SG_IO_CDB) { + list_for_each_entry(task, &T_TASK(cmd)->t_task_list, t_list) { + if (atomic_read(&task->task_sent)) + continue; + if (!dev->transport->map_task_SG) + continue; + + ret = dev->transport->map_task_SG(task); + if (ret < 0) + return ret; + } + } else { + ret = transport_map_control_cmd_to_task(cmd); + if (ret < 0) + return ret; + } + + /* + * For WRITEs, let the iSCSI Target RX Thread know its buffer is ready.. + * This WRITE struct se_cmd (and all of its associated struct se_task's) + * will be added to the struct se_device execution queue after its WRITE + * data has arrived. (ie: It gets handled by the transport processing + * thread a second time) + */ + if (cmd->data_direction == DMA_TO_DEVICE) { + transport_add_tasks_to_state_queue(cmd); + return transport_generic_write_pending(cmd); + } + /* + * Everything else but a WRITE, add the struct se_cmd's struct se_task's + * to the execution queue. + */ + transport_execute_tasks(cmd); + return 0; +} + +/* transport_generic_process_write(): + * + * + */ +void transport_generic_process_write(struct se_cmd *cmd) +{ +#if 0 + /* + * Copy SCSI Presented DTL sector(s) from received buffers allocated to + * original EDTL + */ + if (cmd->se_cmd_flags & SCF_UNDERFLOW_BIT) { + if (!T_TASK(cmd)->t_tasks_se_num) { + unsigned char *dst, *buf = + (unsigned char *)T_TASK(cmd)->t_task_buf; + + dst = kzalloc(cmd->cmd_spdtl), GFP_KERNEL); + if (!(dst)) { + printk(KERN_ERR "Unable to allocate memory for" + " WRITE underflow\n"); + transport_generic_request_failure(cmd, NULL, + PYX_TRANSPORT_REQ_TOO_MANY_SECTORS, 1); + return; + } + memcpy(dst, buf, cmd->cmd_spdtl); + + kfree(T_TASK(cmd)->t_task_buf); + T_TASK(cmd)->t_task_buf = dst; + } else { + struct scatterlist *sg = + (struct scatterlist *sg)T_TASK(cmd)->t_task_buf; + struct scatterlist *orig_sg; + + orig_sg = kzalloc(sizeof(struct scatterlist) * + T_TASK(cmd)->t_tasks_se_num, + GFP_KERNEL))) { + if (!(orig_sg)) { + printk(KERN_ERR "Unable to allocate memory" + " for WRITE underflow\n"); + transport_generic_request_failure(cmd, NULL, + PYX_TRANSPORT_REQ_TOO_MANY_SECTORS, 1); + return; + } + + memcpy(orig_sg, T_TASK(cmd)->t_task_buf, + sizeof(struct scatterlist) * + T_TASK(cmd)->t_tasks_se_num); + + cmd->data_length = cmd->cmd_spdtl; + /* + * FIXME, clear out original struct se_task and state + * information. + */ + if (transport_generic_new_cmd(cmd) < 0) { + transport_generic_request_failure(cmd, NULL, + PYX_TRANSPORT_REQ_TOO_MANY_SECTORS, 1); + kfree(orig_sg); + return; + } + + transport_memcpy_write_sg(cmd, orig_sg); + } + } +#endif + transport_execute_tasks(cmd); +} +EXPORT_SYMBOL(transport_generic_process_write); + +/* transport_generic_write_pending(): + * + * + */ +static int transport_generic_write_pending(struct se_cmd *cmd) +{ + unsigned long flags; + int ret; + + spin_lock_irqsave(&T_TASK(cmd)->t_state_lock, flags); + cmd->t_state = TRANSPORT_WRITE_PENDING; + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, flags); + /* + * For the TCM control CDBs using a contiguous buffer, do the memcpy + * from the passed Linux/SCSI struct scatterlist located at + * T_TASK(se_cmd)->t_task_pt_buf to the contiguous buffer at + * T_TASK(se_cmd)->t_task_buf. + */ + if (cmd->se_cmd_flags & SCF_PASSTHROUGH_CONTIG_TO_SG) + transport_memcpy_read_contig(cmd, + T_TASK(cmd)->t_task_buf, + T_TASK(cmd)->t_task_pt_sgl); + /* + * Clear the se_cmd for WRITE_PENDING status in order to set + * T_TASK(cmd)->t_transport_active=0 so that transport_generic_handle_data + * can be called from HW target mode interrupt code. This is safe + * to be called with transport_off=1 before the CMD_TFO(cmd)->write_pending + * because the se_cmd->se_lun pointer is not being cleared. + */ + transport_cmd_check_stop(cmd, 1, 0); + + /* + * Call the fabric write_pending function here to let the + * frontend know that WRITE buffers are ready. + */ + ret = CMD_TFO(cmd)->write_pending(cmd); + if (ret < 0) + return ret; + + return PYX_TRANSPORT_WRITE_PENDING; +} + +/* transport_release_cmd_to_pool(): + * + * + */ +void transport_release_cmd_to_pool(struct se_cmd *cmd) +{ + BUG_ON(!T_TASK(cmd)); + BUG_ON(!CMD_TFO(cmd)); + + transport_free_se_cmd(cmd); + CMD_TFO(cmd)->release_cmd_to_pool(cmd); +} +EXPORT_SYMBOL(transport_release_cmd_to_pool); + +/* transport_generic_free_cmd(): + * + * Called from processing frontend to release storage engine resources + */ +void transport_generic_free_cmd( + struct se_cmd *cmd, + int wait_for_tasks, + int release_to_pool, + int session_reinstatement) +{ + if (!(cmd->se_cmd_flags & SCF_SE_LUN_CMD) || !T_TASK(cmd)) + transport_release_cmd_to_pool(cmd); + else { + core_dec_lacl_count(cmd->se_sess->se_node_acl, cmd); + + if (SE_LUN(cmd)) { +#if 0 + printk(KERN_INFO "cmd: %p ITT: 0x%08x contains" + " SE_LUN(cmd)\n", cmd, + CMD_TFO(cmd)->get_task_tag(cmd)); +#endif + transport_lun_remove_cmd(cmd); + } + + if (wait_for_tasks && cmd->transport_wait_for_tasks) + cmd->transport_wait_for_tasks(cmd, 0, 0); + + transport_generic_remove(cmd, release_to_pool, + session_reinstatement); + } +} +EXPORT_SYMBOL(transport_generic_free_cmd); + +static void transport_nop_wait_for_tasks( + struct se_cmd *cmd, + int remove_cmd, + int session_reinstatement) +{ + return; +} + +/* transport_lun_wait_for_tasks(): + * + * Called from ConfigFS context to stop the passed struct se_cmd to allow + * an struct se_lun to be successfully shutdown. + */ +static int transport_lun_wait_for_tasks(struct se_cmd *cmd, struct se_lun *lun) +{ + unsigned long flags; + int ret; + /* + * If the frontend has already requested this struct se_cmd to + * be stopped, we can safely ignore this struct se_cmd. + */ + spin_lock_irqsave(&T_TASK(cmd)->t_state_lock, flags); + if (atomic_read(&T_TASK(cmd)->t_transport_stop)) { + atomic_set(&T_TASK(cmd)->transport_lun_stop, 0); + DEBUG_TRANSPORT_S("ConfigFS ITT[0x%08x] - t_transport_stop ==" + " TRUE, skipping\n", CMD_TFO(cmd)->get_task_tag(cmd)); + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, flags); + transport_cmd_check_stop(cmd, 1, 0); + return -1; + } + atomic_set(&T_TASK(cmd)->transport_lun_fe_stop, 1); + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, flags); + + wake_up_interruptible(&SE_DEV(cmd)->dev_queue_obj->thread_wq); + + ret = transport_stop_tasks_for_cmd(cmd); + + DEBUG_TRANSPORT_S("ConfigFS: cmd: %p t_task_cdbs: %d stop tasks ret:" + " %d\n", cmd, T_TASK(cmd)->t_task_cdbs, ret); + if (!ret) { + DEBUG_TRANSPORT_S("ConfigFS: ITT[0x%08x] - stopping cmd....\n", + CMD_TFO(cmd)->get_task_tag(cmd)); + wait_for_completion(&T_TASK(cmd)->transport_lun_stop_comp); + DEBUG_TRANSPORT_S("ConfigFS: ITT[0x%08x] - stopped cmd....\n", + CMD_TFO(cmd)->get_task_tag(cmd)); + } + transport_remove_cmd_from_queue(cmd, SE_DEV(cmd)->dev_queue_obj); + + return 0; +} + +/* #define DEBUG_CLEAR_LUN */ +#ifdef DEBUG_CLEAR_LUN +#define DEBUG_CLEAR_L(x...) printk(KERN_INFO x) +#else +#define DEBUG_CLEAR_L(x...) +#endif + +static void __transport_clear_lun_from_sessions(struct se_lun *lun) +{ + struct se_cmd *cmd = NULL; + unsigned long lun_flags, cmd_flags; + /* + * Do exception processing and return CHECK_CONDITION status to the + * Initiator Port. + */ + spin_lock_irqsave(&lun->lun_cmd_lock, lun_flags); + while (!list_empty_careful(&lun->lun_cmd_list)) { + cmd = list_entry(lun->lun_cmd_list.next, + struct se_cmd, se_lun_list); + list_del(&cmd->se_lun_list); + + if (!(T_TASK(cmd))) { + printk(KERN_ERR "ITT: 0x%08x, T_TASK(cmd) = NULL" + "[i,t]_state: %u/%u\n", + CMD_TFO(cmd)->get_task_tag(cmd), + CMD_TFO(cmd)->get_cmd_state(cmd), cmd->t_state); + BUG(); + } + atomic_set(&T_TASK(cmd)->transport_lun_active, 0); + /* + * This will notify iscsi_target_transport.c: + * transport_cmd_check_stop() that a LUN shutdown is in + * progress for the iscsi_cmd_t. + */ + spin_lock(&T_TASK(cmd)->t_state_lock); + DEBUG_CLEAR_L("SE_LUN[%d] - Setting T_TASK(cmd)->transport" + "_lun_stop for ITT: 0x%08x\n", + SE_LUN(cmd)->unpacked_lun, + CMD_TFO(cmd)->get_task_tag(cmd)); + atomic_set(&T_TASK(cmd)->transport_lun_stop, 1); + spin_unlock(&T_TASK(cmd)->t_state_lock); + + spin_unlock_irqrestore(&lun->lun_cmd_lock, lun_flags); + + if (!(SE_LUN(cmd))) { + printk(KERN_ERR "ITT: 0x%08x, [i,t]_state: %u/%u\n", + CMD_TFO(cmd)->get_task_tag(cmd), + CMD_TFO(cmd)->get_cmd_state(cmd), cmd->t_state); + BUG(); + } + /* + * If the Storage engine still owns the iscsi_cmd_t, determine + * and/or stop its context. + */ + DEBUG_CLEAR_L("SE_LUN[%d] - ITT: 0x%08x before transport" + "_lun_wait_for_tasks()\n", SE_LUN(cmd)->unpacked_lun, + CMD_TFO(cmd)->get_task_tag(cmd)); + + if (transport_lun_wait_for_tasks(cmd, SE_LUN(cmd)) < 0) { + spin_lock_irqsave(&lun->lun_cmd_lock, lun_flags); + continue; + } + + DEBUG_CLEAR_L("SE_LUN[%d] - ITT: 0x%08x after transport_lun" + "_wait_for_tasks(): SUCCESS\n", + SE_LUN(cmd)->unpacked_lun, + CMD_TFO(cmd)->get_task_tag(cmd)); + + spin_lock_irqsave(&T_TASK(cmd)->t_state_lock, cmd_flags); + if (!(atomic_read(&T_TASK(cmd)->transport_dev_active))) { + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, cmd_flags); + goto check_cond; + } + atomic_set(&T_TASK(cmd)->transport_dev_active, 0); + transport_all_task_dev_remove_state(cmd); + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, cmd_flags); + + transport_free_dev_tasks(cmd); + /* + * The Storage engine stopped this struct se_cmd before it was + * send to the fabric frontend for delivery back to the + * Initiator Node. Return this SCSI CDB back with an + * CHECK_CONDITION status. + */ +check_cond: + transport_send_check_condition_and_sense(cmd, + TCM_NON_EXISTENT_LUN, 0); + /* + * If the fabric frontend is waiting for this iscsi_cmd_t to + * be released, notify the waiting thread now that LU has + * finished accessing it. + */ + spin_lock_irqsave(&T_TASK(cmd)->t_state_lock, cmd_flags); + if (atomic_read(&T_TASK(cmd)->transport_lun_fe_stop)) { + DEBUG_CLEAR_L("SE_LUN[%d] - Detected FE stop for" + " struct se_cmd: %p ITT: 0x%08x\n", + lun->unpacked_lun, + cmd, CMD_TFO(cmd)->get_task_tag(cmd)); + + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, + cmd_flags); + transport_cmd_check_stop(cmd, 1, 0); + complete(&T_TASK(cmd)->transport_lun_fe_stop_comp); + spin_lock_irqsave(&lun->lun_cmd_lock, lun_flags); + continue; + } + DEBUG_CLEAR_L("SE_LUN[%d] - ITT: 0x%08x finished processing\n", + lun->unpacked_lun, CMD_TFO(cmd)->get_task_tag(cmd)); + + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, cmd_flags); + spin_lock_irqsave(&lun->lun_cmd_lock, lun_flags); + } + spin_unlock_irqrestore(&lun->lun_cmd_lock, lun_flags); +} + +static int transport_clear_lun_thread(void *p) +{ + struct se_lun *lun = (struct se_lun *)p; + + __transport_clear_lun_from_sessions(lun); + complete(&lun->lun_shutdown_comp); + + return 0; +} + +int transport_clear_lun_from_sessions(struct se_lun *lun) +{ + struct task_struct *kt; + + kt = kthread_run(transport_clear_lun_thread, (void *)lun, + "tcm_cl_%u", lun->unpacked_lun); + if (IS_ERR(kt)) { + printk(KERN_ERR "Unable to start clear_lun thread\n"); + return -1; + } + wait_for_completion(&lun->lun_shutdown_comp); + + return 0; +} + +/* transport_generic_wait_for_tasks(): + * + * Called from frontend or passthrough context to wait for storage engine + * to pause and/or release frontend generated struct se_cmd. + */ +static void transport_generic_wait_for_tasks( + struct se_cmd *cmd, + int remove_cmd, + int session_reinstatement) +{ + unsigned long flags; + + if (!(cmd->se_cmd_flags & SCF_SE_LUN_CMD) && !(cmd->se_tmr_req)) + return; + + spin_lock_irqsave(&T_TASK(cmd)->t_state_lock, flags); + /* + * If we are already stopped due to an external event (ie: LUN shutdown) + * sleep until the connection can have the passed struct se_cmd back. + * The T_TASK(cmd)->transport_lun_stopped_sem will be upped by + * transport_clear_lun_from_sessions() once the ConfigFS context caller + * has completed its operation on the struct se_cmd. + */ + if (atomic_read(&T_TASK(cmd)->transport_lun_stop)) { + + DEBUG_TRANSPORT_S("wait_for_tasks: Stopping" + " wait_for_completion(&T_TASK(cmd)transport_lun_fe" + "_stop_comp); for ITT: 0x%08x\n", + CMD_TFO(cmd)->get_task_tag(cmd)); + /* + * There is a special case for WRITES where a FE exception + + * LUN shutdown means ConfigFS context is still sleeping on + * transport_lun_stop_comp in transport_lun_wait_for_tasks(). + * We go ahead and up transport_lun_stop_comp just to be sure + * here. + */ + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, flags); + complete(&T_TASK(cmd)->transport_lun_stop_comp); + wait_for_completion(&T_TASK(cmd)->transport_lun_fe_stop_comp); + spin_lock_irqsave(&T_TASK(cmd)->t_state_lock, flags); + + transport_all_task_dev_remove_state(cmd); + /* + * At this point, the frontend who was the originator of this + * struct se_cmd, now owns the structure and can be released through + * normal means below. + */ + DEBUG_TRANSPORT_S("wait_for_tasks: Stopped" + " wait_for_completion(&T_TASK(cmd)transport_lun_fe_" + "stop_comp); for ITT: 0x%08x\n", + CMD_TFO(cmd)->get_task_tag(cmd)); + + atomic_set(&T_TASK(cmd)->transport_lun_stop, 0); + } + if (!atomic_read(&T_TASK(cmd)->t_transport_active)) + goto remove; + + atomic_set(&T_TASK(cmd)->t_transport_stop, 1); + + DEBUG_TRANSPORT_S("wait_for_tasks: Stopping %p ITT: 0x%08x" + " i_state: %d, t_state/def_t_state: %d/%d, t_transport_stop" + " = TRUE\n", cmd, CMD_TFO(cmd)->get_task_tag(cmd), + CMD_TFO(cmd)->get_cmd_state(cmd), cmd->t_state, + cmd->deferred_t_state); + + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, flags); + + wake_up_interruptible(&SE_DEV(cmd)->dev_queue_obj->thread_wq); + + wait_for_completion(&T_TASK(cmd)->t_transport_stop_comp); + + spin_lock_irqsave(&T_TASK(cmd)->t_state_lock, flags); + atomic_set(&T_TASK(cmd)->t_transport_active, 0); + atomic_set(&T_TASK(cmd)->t_transport_stop, 0); + + DEBUG_TRANSPORT_S("wait_for_tasks: Stopped wait_for_compltion(" + "&T_TASK(cmd)->t_transport_stop_comp) for ITT: 0x%08x\n", + CMD_TFO(cmd)->get_task_tag(cmd)); +remove: + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, flags); + if (!remove_cmd) + return; + + transport_generic_free_cmd(cmd, 0, 0, session_reinstatement); +} + +static int transport_get_sense_codes( + struct se_cmd *cmd, + u8 *asc, + u8 *ascq) +{ + *asc = cmd->scsi_asc; + *ascq = cmd->scsi_ascq; + + return 0; +} + +static int transport_set_sense_codes( + struct se_cmd *cmd, + u8 asc, + u8 ascq) +{ + cmd->scsi_asc = asc; + cmd->scsi_ascq = ascq; + + return 0; +} + +int transport_send_check_condition_and_sense( + struct se_cmd *cmd, + u8 reason, + int from_transport) +{ + unsigned char *buffer = cmd->sense_buffer; + unsigned long flags; + int offset; + u8 asc = 0, ascq = 0; + + spin_lock_irqsave(&T_TASK(cmd)->t_state_lock, flags); + if (cmd->se_cmd_flags & SCF_SENT_CHECK_CONDITION) { + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, flags); + return 0; + } + cmd->se_cmd_flags |= SCF_SENT_CHECK_CONDITION; + spin_unlock_irqrestore(&T_TASK(cmd)->t_state_lock, flags); + + if (!reason && from_transport) + goto after_reason; + + if (!from_transport) + cmd->se_cmd_flags |= SCF_EMULATED_TASK_SENSE; + /* + * Data Segment and SenseLength of the fabric response PDU. + * + * TRANSPORT_SENSE_BUFFER is now set to SCSI_SENSE_BUFFERSIZE + * from include/scsi/scsi_cmnd.h + */ + offset = CMD_TFO(cmd)->set_fabric_sense_len(cmd, + TRANSPORT_SENSE_BUFFER); + /* + * Actual SENSE DATA, see SPC-3 7.23.2 SPC_SENSE_KEY_OFFSET uses + * SENSE KEY values from include/scsi/scsi.h + */ + switch (reason) { + case TCM_NON_EXISTENT_LUN: + case TCM_UNSUPPORTED_SCSI_OPCODE: + case TCM_SECTOR_COUNT_TOO_MANY: + /* CURRENT ERROR */ + buffer[offset] = 0x70; + /* ILLEGAL REQUEST */ + buffer[offset+SPC_SENSE_KEY_OFFSET] = ILLEGAL_REQUEST; + /* INVALID COMMAND OPERATION CODE */ + buffer[offset+SPC_ASC_KEY_OFFSET] = 0x20; + break; + case TCM_UNKNOWN_MODE_PAGE: + /* CURRENT ERROR */ + buffer[offset] = 0x70; + /* ILLEGAL REQUEST */ + buffer[offset+SPC_SENSE_KEY_OFFSET] = ILLEGAL_REQUEST; + /* INVALID FIELD IN CDB */ + buffer[offset+SPC_ASC_KEY_OFFSET] = 0x24; + break; + case TCM_CHECK_CONDITION_ABORT_CMD: + /* CURRENT ERROR */ + buffer[offset] = 0x70; + /* ABORTED COMMAND */ + buffer[offset+SPC_SENSE_KEY_OFFSET] = ABORTED_COMMAND; + /* BUS DEVICE RESET FUNCTION OCCURRED */ + buffer[offset+SPC_ASC_KEY_OFFSET] = 0x29; + buffer[offset+SPC_ASCQ_KEY_OFFSET] = 0x03; + break; + case TCM_INCORRECT_AMOUNT_OF_DATA: + /* CURRENT ERROR */ + buffer[offset] = 0x70; + /* ABORTED COMMAND */ + buffer[offset+SPC_SENSE_KEY_OFFSET] = ABORTED_COMMAND; + /* WRITE ERROR */ + buffer[offset+SPC_ASC_KEY_OFFSET] = 0x0c; + /* NOT ENOUGH UNSOLICITED DATA */ + buffer[offset+SPC_ASCQ_KEY_OFFSET] = 0x0d; + break; + case TCM_INVALID_CDB_FIELD: + /* CURRENT ERROR */ + buffer[offset] = 0x70; + /* ABORTED COMMAND */ + buffer[offset+SPC_SENSE_KEY_OFFSET] = ABORTED_COMMAND; + /* INVALID FIELD IN CDB */ + buffer[offset+SPC_ASC_KEY_OFFSET] = 0x24; + break; + case TCM_INVALID_PARAMETER_LIST: + /* CURRENT ERROR */ + buffer[offset] = 0x70; + /* ABORTED COMMAND */ + buffer[offset+SPC_SENSE_KEY_OFFSET] = ABORTED_COMMAND; + /* INVALID FIELD IN PARAMETER LIST */ + buffer[offset+SPC_ASC_KEY_OFFSET] = 0x26; + break; + case TCM_UNEXPECTED_UNSOLICITED_DATA: + /* CURRENT ERROR */ + buffer[offset] = 0x70; + /* ABORTED COMMAND */ + buffer[offset+SPC_SENSE_KEY_OFFSET] = ABORTED_COMMAND; + /* WRITE ERROR */ + buffer[offset+SPC_ASC_KEY_OFFSET] = 0x0c; + /* UNEXPECTED_UNSOLICITED_DATA */ + buffer[offset+SPC_ASCQ_KEY_OFFSET] = 0x0c; + break; + case TCM_SERVICE_CRC_ERROR: + /* CURRENT ERROR */ + buffer[offset] = 0x70; + /* ABORTED COMMAND */ + buffer[offset+SPC_SENSE_KEY_OFFSET] = ABORTED_COMMAND; + /* PROTOCOL SERVICE CRC ERROR */ + buffer[offset+SPC_ASC_KEY_OFFSET] = 0x47; + /* N/A */ + buffer[offset+SPC_ASCQ_KEY_OFFSET] = 0x05; + break; + case TCM_SNACK_REJECTED: + /* CURRENT ERROR */ + buffer[offset] = 0x70; + /* ABORTED COMMAND */ + buffer[offset+SPC_SENSE_KEY_OFFSET] = ABORTED_COMMAND; + /* READ ERROR */ + buffer[offset+SPC_ASC_KEY_OFFSET] = 0x11; + /* FAILED RETRANSMISSION REQUEST */ + buffer[offset+SPC_ASCQ_KEY_OFFSET] = 0x13; + break; + case TCM_WRITE_PROTECTED: + /* CURRENT ERROR */ + buffer[offset] = 0x70; + /* DATA PROTECT */ + buffer[offset+SPC_SENSE_KEY_OFFSET] = DATA_PROTECT; + /* WRITE PROTECTED */ + buffer[offset+SPC_ASC_KEY_OFFSET] = 0x27; + break; + case TCM_CHECK_CONDITION_UNIT_ATTENTION: + /* CURRENT ERROR */ + buffer[offset] = 0x70; + /* UNIT ATTENTION */ + buffer[offset+SPC_SENSE_KEY_OFFSET] = UNIT_ATTENTION; + core_scsi3_ua_for_check_condition(cmd, &asc, &ascq); + buffer[offset+SPC_ASC_KEY_OFFSET] = asc; + buffer[offset+SPC_ASCQ_KEY_OFFSET] = ascq; + break; + case TCM_CHECK_CONDITION_NOT_READY: + /* CURRENT ERROR */ + buffer[offset] = 0x70; + /* Not Ready */ + buffer[offset+SPC_SENSE_KEY_OFFSET] = NOT_READY; + transport_get_sense_codes(cmd, &asc, &ascq); + buffer[offset+SPC_ASC_KEY_OFFSET] = asc; + buffer[offset+SPC_ASCQ_KEY_OFFSET] = ascq; + break; + case TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE: + default: + /* CURRENT ERROR */ + buffer[offset] = 0x70; + /* ILLEGAL REQUEST */ + buffer[offset+SPC_SENSE_KEY_OFFSET] = ILLEGAL_REQUEST; + /* LOGICAL UNIT COMMUNICATION FAILURE */ + buffer[offset+SPC_ASC_KEY_OFFSET] = 0x80; + break; + } + /* + * This code uses linux/include/scsi/scsi.h SAM status codes! + */ + cmd->scsi_status = SAM_STAT_CHECK_CONDITION; + /* + * Automatically padded, this value is encoded in the fabric's + * data_length response PDU containing the SCSI defined sense data. + */ + cmd->scsi_sense_length = TRANSPORT_SENSE_BUFFER + offset; + +after_reason: + CMD_TFO(cmd)->queue_status(cmd); + return 0; +} +EXPORT_SYMBOL(transport_send_check_condition_and_sense); + +int transport_check_aborted_status(struct se_cmd *cmd, int send_status) +{ + int ret = 0; + + if (atomic_read(&T_TASK(cmd)->t_transport_aborted) != 0) { + if (!(send_status) || + (cmd->se_cmd_flags & SCF_SENT_DELAYED_TAS)) + return 1; +#if 0 + printk(KERN_INFO "Sending delayed SAM_STAT_TASK_ABORTED" + " status for CDB: 0x%02x ITT: 0x%08x\n", + T_TASK(cmd)->t_task_cdb[0], + CMD_TFO(cmd)->get_task_tag(cmd)); +#endif + cmd->se_cmd_flags |= SCF_SENT_DELAYED_TAS; + CMD_TFO(cmd)->queue_status(cmd); + ret = 1; + } + return ret; +} +EXPORT_SYMBOL(transport_check_aborted_status); + +void transport_send_task_abort(struct se_cmd *cmd) +{ + /* + * If there are still expected incoming fabric WRITEs, we wait + * until until they have completed before sending a TASK_ABORTED + * response. This response with TASK_ABORTED status will be + * queued back to fabric module by transport_check_aborted_status(). + */ + if (cmd->data_direction == DMA_TO_DEVICE) { + if (CMD_TFO(cmd)->write_pending_status(cmd) != 0) { + atomic_inc(&T_TASK(cmd)->t_transport_aborted); + smp_mb__after_atomic_inc(); + cmd->scsi_status = SAM_STAT_TASK_ABORTED; + transport_new_cmd_failure(cmd); + return; + } + } + cmd->scsi_status = SAM_STAT_TASK_ABORTED; +#if 0 + printk(KERN_INFO "Setting SAM_STAT_TASK_ABORTED status for CDB: 0x%02x," + " ITT: 0x%08x\n", T_TASK(cmd)->t_task_cdb[0], + CMD_TFO(cmd)->get_task_tag(cmd)); +#endif + CMD_TFO(cmd)->queue_status(cmd); +} + +/* transport_generic_do_tmr(): + * + * + */ +int transport_generic_do_tmr(struct se_cmd *cmd) +{ + struct se_cmd *ref_cmd; + struct se_device *dev = SE_DEV(cmd); + struct se_tmr_req *tmr = cmd->se_tmr_req; + int ret; + + switch (tmr->function) { + case ABORT_TASK: + ref_cmd = tmr->ref_cmd; + tmr->response = TMR_FUNCTION_REJECTED; + break; + case ABORT_TASK_SET: + case CLEAR_ACA: + case CLEAR_TASK_SET: + tmr->response = TMR_TASK_MGMT_FUNCTION_NOT_SUPPORTED; + break; + case LUN_RESET: + ret = core_tmr_lun_reset(dev, tmr, NULL, NULL); + tmr->response = (!ret) ? TMR_FUNCTION_COMPLETE : + TMR_FUNCTION_REJECTED; + break; +#if 0 + case TARGET_WARM_RESET: + transport_generic_host_reset(dev->se_hba); + tmr->response = TMR_FUNCTION_REJECTED; + break; + case TARGET_COLD_RESET: + transport_generic_host_reset(dev->se_hba); + transport_generic_cold_reset(dev->se_hba); + tmr->response = TMR_FUNCTION_REJECTED; + break; +#endif + default: + printk(KERN_ERR "Uknown TMR function: 0x%02x.\n", + tmr->function); + tmr->response = TMR_FUNCTION_REJECTED; + break; + } + + cmd->t_state = TRANSPORT_ISTATE_PROCESSING; + CMD_TFO(cmd)->queue_tm_rsp(cmd); + + transport_cmd_check_stop(cmd, 2, 0); + return 0; +} + +/* + * Called with spin_lock_irq(&dev->execute_task_lock); held + * + */ +static struct se_task * +transport_get_task_from_state_list(struct se_device *dev) +{ + struct se_task *task; + + if (list_empty(&dev->state_task_list)) + return NULL; + + list_for_each_entry(task, &dev->state_task_list, t_state_list) + break; + + list_del(&task->t_state_list); + atomic_set(&task->task_state_active, 0); + + return task; +} + +static void transport_processing_shutdown(struct se_device *dev) +{ + struct se_cmd *cmd; + struct se_queue_req *qr; + struct se_task *task; + u8 state; + unsigned long flags; + /* + * Empty the struct se_device's struct se_task state list. + */ + spin_lock_irqsave(&dev->execute_task_lock, flags); + while ((task = transport_get_task_from_state_list(dev))) { + if (!(TASK_CMD(task))) { + printk(KERN_ERR "TASK_CMD(task) is NULL!\n"); + continue; + } + cmd = TASK_CMD(task); + + if (!T_TASK(cmd)) { + printk(KERN_ERR "T_TASK(cmd) is NULL for task: %p cmd:" + " %p ITT: 0x%08x\n", task, cmd, + CMD_TFO(cmd)->get_task_tag(cmd)); + continue; + } + spin_unlock_irqrestore(&dev->execute_task_lock, flags); + + spin_lock_irqsave(&T_TASK(cmd)->t_state_lock, flags); + + DEBUG_DO("PT: cmd: %p task: %p ITT/CmdSN: 0x%08x/0x%08x," + " i_state/def_i_state: %d/%d, t_state/def_t_state:" + " %d/%d cdb: 0x%02x\n", cmd, task, + CMD_TFO(cmd)->get_task_tag(cmd), cmd->cmd_sn, + CMD_TFO(cmd)->get_cmd_state(cmd), cmd->deferred_i_state, + cmd->t_state, cmd->deferred_t_state, + T_TASK(cmd)->t_task_cdb[0]); + DEBUG_DO("PT: ITT[0x%08x] - t_task_cdbs: %d t_task_cdbs_left:" + " %d t_task_cdbs_sent: %d -- t_transport_active: %d" + " t_transport_stop: %d t_transport_sent: %d\n", + CMD_TFO(cmd)->get_task_tag(cmd), + T_TASK(cmd)->t_task_cdbs, + atomic_read(&T_TASK(cmd)->t_task_cdbs_left), + atomic_read(&T_TASK(cmd)->t_task_cdbs_sent), + atomic_read(&T_TASK(cmd)->t_transport_active), + atomic_read(&T_TASK(cmd)->t_transport_stop), + atomic_read(&T_TASK(cmd)->t_transport_sent)); + + if (atomic_read(&task->task_active)) { + atomic_set(&task->task_stop, 1); + spin_unlock_irqrestore( + &T_TASK(cmd)->t_state_lock, flags); + + DEBUG_DO("Waiting for task: %p to shutdown for dev:" + " %p\n", task, dev); + wait_for_completion(&task->task_stop_comp); + DEBUG_DO("Completed task: %p shutdown for dev: %p\n", + task, dev); + + spin_lock_irqsave(&T_TASK(cmd)->t_state_lock, flags); + atomic_dec(&T_TASK(cmd)->t_task_cdbs_left); + + atomic_set(&task->task_active, 0); + atomic_set(&task->task_stop, 0); + } + __transport_stop_task_timer(task, &flags); + + if (!(atomic_dec_and_test(&T_TASK(cmd)->t_task_cdbs_ex_left))) { + spin_unlock_irqrestore( + &T_TASK(cmd)->t_state_lock, flags); + + DEBUG_DO("Skipping task: %p, dev: %p for" + " t_task_cdbs_ex_left: %d\n", task, dev, + atomic_read(&T_TASK(cmd)->t_task_cdbs_ex_left)); + + spin_lock_irqsave(&dev->execute_task_lock, flags); + continue; + } + + if (atomic_read(&T_TASK(cmd)->t_transport_active)) { + DEBUG_DO("got t_transport_active = 1 for task: %p, dev:" + " %p\n", task, dev); + + if (atomic_read(&T_TASK(cmd)->t_fe_count)) { + spin_unlock_irqrestore( + &T_TASK(cmd)->t_state_lock, flags); + transport_send_check_condition_and_sense( + cmd, TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE, + 0); + transport_remove_cmd_from_queue(cmd, + SE_DEV(cmd)->dev_queue_obj); + + transport_lun_remove_cmd(cmd); + transport_cmd_check_stop(cmd, 1, 0); + } else { + spin_unlock_irqrestore( + &T_TASK(cmd)->t_state_lock, flags); + + transport_remove_cmd_from_queue(cmd, + SE_DEV(cmd)->dev_queue_obj); + + transport_lun_remove_cmd(cmd); + + if (transport_cmd_check_stop(cmd, 1, 0)) + transport_generic_remove(cmd, 0, 0); + } + + spin_lock_irqsave(&dev->execute_task_lock, flags); + continue; + } + DEBUG_DO("Got t_transport_active = 0 for task: %p, dev: %p\n", + task, dev); + + if (atomic_read(&T_TASK(cmd)->t_fe_count)) { + spin_unlock_irqrestore( + &T_TASK(cmd)->t_state_lock, flags); + transport_send_check_condition_and_sense(cmd, + TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE, 0); + transport_remove_cmd_from_queue(cmd, + SE_DEV(cmd)->dev_queue_obj); + + transport_lun_remove_cmd(cmd); + transport_cmd_check_stop(cmd, 1, 0); + } else { + spin_unlock_irqrestore( + &T_TASK(cmd)->t_state_lock, flags); + + transport_remove_cmd_from_queue(cmd, + SE_DEV(cmd)->dev_queue_obj); + transport_lun_remove_cmd(cmd); + + if (transport_cmd_check_stop(cmd, 1, 0)) + transport_generic_remove(cmd, 0, 0); + } + + spin_lock_irqsave(&dev->execute_task_lock, flags); + } + spin_unlock_irqrestore(&dev->execute_task_lock, flags); + /* + * Empty the struct se_device's struct se_cmd list. + */ + spin_lock_irqsave(&dev->dev_queue_obj->cmd_queue_lock, flags); + while ((qr = __transport_get_qr_from_queue(dev->dev_queue_obj))) { + spin_unlock_irqrestore( + &dev->dev_queue_obj->cmd_queue_lock, flags); + cmd = (struct se_cmd *)qr->cmd; + state = qr->state; + kfree(qr); + + DEBUG_DO("From Device Queue: cmd: %p t_state: %d\n", + cmd, state); + + if (atomic_read(&T_TASK(cmd)->t_fe_count)) { + transport_send_check_condition_and_sense(cmd, + TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE, 0); + + transport_lun_remove_cmd(cmd); + transport_cmd_check_stop(cmd, 1, 0); + } else { + transport_lun_remove_cmd(cmd); + if (transport_cmd_check_stop(cmd, 1, 0)) + transport_generic_remove(cmd, 0, 0); + } + spin_lock_irqsave(&dev->dev_queue_obj->cmd_queue_lock, flags); + } + spin_unlock_irqrestore(&dev->dev_queue_obj->cmd_queue_lock, flags); +} + +/* transport_processing_thread(): + * + * + */ +static int transport_processing_thread(void *param) +{ + int ret, t_state; + struct se_cmd *cmd; + struct se_device *dev = (struct se_device *) param; + struct se_queue_req *qr; + + set_user_nice(current, -20); + + while (!kthread_should_stop()) { + ret = wait_event_interruptible(dev->dev_queue_obj->thread_wq, + atomic_read(&dev->dev_queue_obj->queue_cnt) || + kthread_should_stop()); + if (ret < 0) + goto out; + + spin_lock_irq(&dev->dev_status_lock); + if (dev->dev_status & TRANSPORT_DEVICE_SHUTDOWN) { + spin_unlock_irq(&dev->dev_status_lock); + transport_processing_shutdown(dev); + continue; + } + spin_unlock_irq(&dev->dev_status_lock); + +get_cmd: + __transport_execute_tasks(dev); + + qr = transport_get_qr_from_queue(dev->dev_queue_obj); + if (!(qr)) + continue; + + cmd = (struct se_cmd *)qr->cmd; + t_state = qr->state; + kfree(qr); + + switch (t_state) { + case TRANSPORT_NEW_CMD_MAP: + if (!(CMD_TFO(cmd)->new_cmd_map)) { + printk(KERN_ERR "CMD_TFO(cmd)->new_cmd_map is" + " NULL for TRANSPORT_NEW_CMD_MAP\n"); + BUG(); + } + ret = CMD_TFO(cmd)->new_cmd_map(cmd); + if (ret < 0) { + cmd->transport_error_status = ret; + transport_generic_request_failure(cmd, NULL, + 0, (cmd->data_direction != + DMA_TO_DEVICE)); + break; + } + /* Fall through */ + case TRANSPORT_NEW_CMD: + ret = transport_generic_new_cmd(cmd); + if (ret < 0) { + cmd->transport_error_status = ret; + transport_generic_request_failure(cmd, NULL, + 0, (cmd->data_direction != + DMA_TO_DEVICE)); + } + break; + case TRANSPORT_PROCESS_WRITE: + transport_generic_process_write(cmd); + break; + case TRANSPORT_COMPLETE_OK: + transport_stop_all_task_timers(cmd); + transport_generic_complete_ok(cmd); + break; + case TRANSPORT_REMOVE: + transport_generic_remove(cmd, 1, 0); + break; + case TRANSPORT_PROCESS_TMR: + transport_generic_do_tmr(cmd); + break; + case TRANSPORT_COMPLETE_FAILURE: + transport_generic_request_failure(cmd, NULL, 1, 1); + break; + case TRANSPORT_COMPLETE_TIMEOUT: + transport_stop_all_task_timers(cmd); + transport_generic_request_timeout(cmd); + break; + default: + printk(KERN_ERR "Unknown t_state: %d deferred_t_state:" + " %d for ITT: 0x%08x i_state: %d on SE LUN:" + " %u\n", t_state, cmd->deferred_t_state, + CMD_TFO(cmd)->get_task_tag(cmd), + CMD_TFO(cmd)->get_cmd_state(cmd), + SE_LUN(cmd)->unpacked_lun); + BUG(); + } + + goto get_cmd; + } + +out: + transport_release_all_cmds(dev); + dev->process_thread = NULL; + return 0; +} diff --git a/drivers/target/target_core_ua.c b/drivers/target/target_core_ua.c new file mode 100644 index 000000000000..a2ef346087e8 --- /dev/null +++ b/drivers/target/target_core_ua.c @@ -0,0 +1,332 @@ +/******************************************************************************* + * Filename: target_core_ua.c + * + * This file contains logic for SPC-3 Unit Attention emulation + * + * Copyright (c) 2009,2010 Rising Tide Systems + * Copyright (c) 2009,2010 Linux-iSCSI.org + * + * Nicholas A. Bellinger + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. + * + ******************************************************************************/ + +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include + +#include "target_core_alua.h" +#include "target_core_hba.h" +#include "target_core_pr.h" +#include "target_core_ua.h" + +int core_scsi3_ua_check( + struct se_cmd *cmd, + unsigned char *cdb) +{ + struct se_dev_entry *deve; + struct se_session *sess = cmd->se_sess; + struct se_node_acl *nacl; + + if (!(sess)) + return 0; + + nacl = sess->se_node_acl; + if (!(nacl)) + return 0; + + deve = &nacl->device_list[cmd->orig_fe_lun]; + if (!(atomic_read(&deve->ua_count))) + return 0; + /* + * From sam4r14, section 5.14 Unit attention condition: + * + * a) if an INQUIRY command enters the enabled command state, the + * device server shall process the INQUIRY command and shall neither + * report nor clear any unit attention condition; + * b) if a REPORT LUNS command enters the enabled command state, the + * device server shall process the REPORT LUNS command and shall not + * report any unit attention condition; + * e) if a REQUEST SENSE command enters the enabled command state while + * a unit attention condition exists for the SCSI initiator port + * associated with the I_T nexus on which the REQUEST SENSE command + * was received, then the device server shall process the command + * and either: + */ + switch (cdb[0]) { + case INQUIRY: + case REPORT_LUNS: + case REQUEST_SENSE: + return 0; + default: + return -1; + } + + return -1; +} + +int core_scsi3_ua_allocate( + struct se_node_acl *nacl, + u32 unpacked_lun, + u8 asc, + u8 ascq) +{ + struct se_dev_entry *deve; + struct se_ua *ua, *ua_p, *ua_tmp; + /* + * PASSTHROUGH OPS + */ + if (!(nacl)) + return -1; + + ua = kmem_cache_zalloc(se_ua_cache, GFP_ATOMIC); + if (!(ua)) { + printk(KERN_ERR "Unable to allocate struct se_ua\n"); + return -1; + } + INIT_LIST_HEAD(&ua->ua_dev_list); + INIT_LIST_HEAD(&ua->ua_nacl_list); + + ua->ua_nacl = nacl; + ua->ua_asc = asc; + ua->ua_ascq = ascq; + + spin_lock_irq(&nacl->device_list_lock); + deve = &nacl->device_list[unpacked_lun]; + + spin_lock(&deve->ua_lock); + list_for_each_entry_safe(ua_p, ua_tmp, &deve->ua_list, ua_nacl_list) { + /* + * Do not report the same UNIT ATTENTION twice.. + */ + if ((ua_p->ua_asc == asc) && (ua_p->ua_ascq == ascq)) { + spin_unlock(&deve->ua_lock); + spin_unlock_irq(&nacl->device_list_lock); + kmem_cache_free(se_ua_cache, ua); + return 0; + } + /* + * Attach the highest priority Unit Attention to + * the head of the list following sam4r14, + * Section 5.14 Unit Attention Condition: + * + * POWER ON, RESET, OR BUS DEVICE RESET OCCURRED highest + * POWER ON OCCURRED or + * DEVICE INTERNAL RESET + * SCSI BUS RESET OCCURRED or + * MICROCODE HAS BEEN CHANGED or + * protocol specific + * BUS DEVICE RESET FUNCTION OCCURRED + * I_T NEXUS LOSS OCCURRED + * COMMANDS CLEARED BY POWER LOSS NOTIFICATION + * all others Lowest + * + * Each of the ASCQ codes listed above are defined in + * the 29h ASC family, see spc4r17 Table D.1 + */ + if (ua_p->ua_asc == 0x29) { + if ((asc == 0x29) && (ascq > ua_p->ua_ascq)) + list_add(&ua->ua_nacl_list, + &deve->ua_list); + else + list_add_tail(&ua->ua_nacl_list, + &deve->ua_list); + } else if (ua_p->ua_asc == 0x2a) { + /* + * Incoming Family 29h ASCQ codes will override + * Family 2AHh ASCQ codes for Unit Attention condition. + */ + if ((asc == 0x29) || (ascq > ua_p->ua_asc)) + list_add(&ua->ua_nacl_list, + &deve->ua_list); + else + list_add_tail(&ua->ua_nacl_list, + &deve->ua_list); + } else + list_add_tail(&ua->ua_nacl_list, + &deve->ua_list); + spin_unlock(&deve->ua_lock); + spin_unlock_irq(&nacl->device_list_lock); + + atomic_inc(&deve->ua_count); + smp_mb__after_atomic_inc(); + return 0; + } + list_add_tail(&ua->ua_nacl_list, &deve->ua_list); + spin_unlock(&deve->ua_lock); + spin_unlock_irq(&nacl->device_list_lock); + + printk(KERN_INFO "[%s]: Allocated UNIT ATTENTION, mapped LUN: %u, ASC:" + " 0x%02x, ASCQ: 0x%02x\n", + TPG_TFO(nacl->se_tpg)->get_fabric_name(), unpacked_lun, + asc, ascq); + + atomic_inc(&deve->ua_count); + smp_mb__after_atomic_inc(); + return 0; +} + +void core_scsi3_ua_release_all( + struct se_dev_entry *deve) +{ + struct se_ua *ua, *ua_p; + + spin_lock(&deve->ua_lock); + list_for_each_entry_safe(ua, ua_p, &deve->ua_list, ua_nacl_list) { + list_del(&ua->ua_nacl_list); + kmem_cache_free(se_ua_cache, ua); + + atomic_dec(&deve->ua_count); + smp_mb__after_atomic_dec(); + } + spin_unlock(&deve->ua_lock); +} + +void core_scsi3_ua_for_check_condition( + struct se_cmd *cmd, + u8 *asc, + u8 *ascq) +{ + struct se_device *dev = SE_DEV(cmd); + struct se_dev_entry *deve; + struct se_session *sess = cmd->se_sess; + struct se_node_acl *nacl; + struct se_ua *ua = NULL, *ua_p; + int head = 1; + + if (!(sess)) + return; + + nacl = sess->se_node_acl; + if (!(nacl)) + return; + + spin_lock_irq(&nacl->device_list_lock); + deve = &nacl->device_list[cmd->orig_fe_lun]; + if (!(atomic_read(&deve->ua_count))) { + spin_unlock_irq(&nacl->device_list_lock); + return; + } + /* + * The highest priority Unit Attentions are placed at the head of the + * struct se_dev_entry->ua_list, and will be returned in CHECK_CONDITION + + * sense data for the received CDB. + */ + spin_lock(&deve->ua_lock); + list_for_each_entry_safe(ua, ua_p, &deve->ua_list, ua_nacl_list) { + /* + * For ua_intlck_ctrl code not equal to 00b, only report the + * highest priority UNIT_ATTENTION and ASC/ASCQ without + * clearing it. + */ + if (DEV_ATTRIB(dev)->emulate_ua_intlck_ctrl != 0) { + *asc = ua->ua_asc; + *ascq = ua->ua_ascq; + break; + } + /* + * Otherwise for the default 00b, release the UNIT ATTENTION + * condition. Return the ASC/ASCQ of the higest priority UA + * (head of the list) in the outgoing CHECK_CONDITION + sense. + */ + if (head) { + *asc = ua->ua_asc; + *ascq = ua->ua_ascq; + head = 0; + } + list_del(&ua->ua_nacl_list); + kmem_cache_free(se_ua_cache, ua); + + atomic_dec(&deve->ua_count); + smp_mb__after_atomic_dec(); + } + spin_unlock(&deve->ua_lock); + spin_unlock_irq(&nacl->device_list_lock); + + printk(KERN_INFO "[%s]: %s UNIT ATTENTION condition with" + " INTLCK_CTRL: %d, mapped LUN: %u, got CDB: 0x%02x" + " reported ASC: 0x%02x, ASCQ: 0x%02x\n", + TPG_TFO(nacl->se_tpg)->get_fabric_name(), + (DEV_ATTRIB(dev)->emulate_ua_intlck_ctrl != 0) ? "Reporting" : + "Releasing", DEV_ATTRIB(dev)->emulate_ua_intlck_ctrl, + cmd->orig_fe_lun, T_TASK(cmd)->t_task_cdb[0], *asc, *ascq); +} + +int core_scsi3_ua_clear_for_request_sense( + struct se_cmd *cmd, + u8 *asc, + u8 *ascq) +{ + struct se_dev_entry *deve; + struct se_session *sess = cmd->se_sess; + struct se_node_acl *nacl; + struct se_ua *ua = NULL, *ua_p; + int head = 1; + + if (!(sess)) + return -1; + + nacl = sess->se_node_acl; + if (!(nacl)) + return -1; + + spin_lock_irq(&nacl->device_list_lock); + deve = &nacl->device_list[cmd->orig_fe_lun]; + if (!(atomic_read(&deve->ua_count))) { + spin_unlock_irq(&nacl->device_list_lock); + return -1; + } + /* + * The highest priority Unit Attentions are placed at the head of the + * struct se_dev_entry->ua_list. The First (and hence highest priority) + * ASC/ASCQ will be returned in REQUEST_SENSE payload data for the + * matching struct se_lun. + * + * Once the returning ASC/ASCQ values are set, we go ahead and + * release all of the Unit Attention conditions for the assoicated + * struct se_lun. + */ + spin_lock(&deve->ua_lock); + list_for_each_entry_safe(ua, ua_p, &deve->ua_list, ua_nacl_list) { + if (head) { + *asc = ua->ua_asc; + *ascq = ua->ua_ascq; + head = 0; + } + list_del(&ua->ua_nacl_list); + kmem_cache_free(se_ua_cache, ua); + + atomic_dec(&deve->ua_count); + smp_mb__after_atomic_dec(); + } + spin_unlock(&deve->ua_lock); + spin_unlock_irq(&nacl->device_list_lock); + + printk(KERN_INFO "[%s]: Released UNIT ATTENTION condition, mapped" + " LUN: %u, got REQUEST_SENSE reported ASC: 0x%02x," + " ASCQ: 0x%02x\n", TPG_TFO(nacl->se_tpg)->get_fabric_name(), + cmd->orig_fe_lun, *asc, *ascq); + + return (head) ? -1 : 0; +} diff --git a/drivers/target/target_core_ua.h b/drivers/target/target_core_ua.h new file mode 100644 index 000000000000..6e6b03460a1a --- /dev/null +++ b/drivers/target/target_core_ua.h @@ -0,0 +1,36 @@ +#ifndef TARGET_CORE_UA_H + +/* + * From spc4r17, Table D.1: ASC and ASCQ Assignement + */ +#define ASCQ_29H_POWER_ON_RESET_OR_BUS_DEVICE_RESET_OCCURED 0x00 +#define ASCQ_29H_POWER_ON_OCCURRED 0x01 +#define ASCQ_29H_SCSI_BUS_RESET_OCCURED 0x02 +#define ASCQ_29H_BUS_DEVICE_RESET_FUNCTION_OCCURRED 0x03 +#define ASCQ_29H_DEVICE_INTERNAL_RESET 0x04 +#define ASCQ_29H_TRANSCEIVER_MODE_CHANGED_TO_SINGLE_ENDED 0x05 +#define ASCQ_29H_TRANSCEIVER_MODE_CHANGED_TO_LVD 0x06 +#define ASCQ_29H_NEXUS_LOSS_OCCURRED 0x07 + +#define ASCQ_2AH_PARAMETERS_CHANGED 0x00 +#define ASCQ_2AH_MODE_PARAMETERS_CHANGED 0x01 +#define ASCQ_2AH_LOG_PARAMETERS_CHANGED 0x02 +#define ASCQ_2AH_RESERVATIONS_PREEMPTED 0x03 +#define ASCQ_2AH_RESERVATIONS_RELEASED 0x04 +#define ASCQ_2AH_REGISTRATIONS_PREEMPTED 0x05 +#define ASCQ_2AH_ASYMMETRIC_ACCESS_STATE_CHANGED 0x06 +#define ASCQ_2AH_IMPLICT_ASYMMETRIC_ACCESS_STATE_TRANSITION_FAILED 0x07 +#define ASCQ_2AH_PRIORITY_CHANGED 0x08 + +#define ASCQ_2CH_PREVIOUS_RESERVATION_CONFLICT_STATUS 0x09 + +extern struct kmem_cache *se_ua_cache; + +extern int core_scsi3_ua_check(struct se_cmd *, unsigned char *); +extern int core_scsi3_ua_allocate(struct se_node_acl *, u32, u8, u8); +extern void core_scsi3_ua_release_all(struct se_dev_entry *); +extern void core_scsi3_ua_for_check_condition(struct se_cmd *, u8 *, u8 *); +extern int core_scsi3_ua_clear_for_request_sense(struct se_cmd *, + u8 *, u8 *); + +#endif /* TARGET_CORE_UA_H */ diff --git a/include/target/configfs_macros.h b/include/target/configfs_macros.h new file mode 100644 index 000000000000..7fe74608b437 --- /dev/null +++ b/include/target/configfs_macros.h @@ -0,0 +1,147 @@ +/* -*- mode: c; c-basic-offset: 8; -*- + * vim: noexpandtab sw=8 ts=8 sts=0: + * + * configfs_macros.h - extends macros for configfs + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public + * License as published by the Free Software Foundation; either + * version 2 of the License, or (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * General Public License for more details. + * + * You should have received a copy of the GNU General Public + * License along with this program; if not, write to the + * Free Software Foundation, Inc., 59 Temple Place - Suite 330, + * Boston, MA 021110-1307, USA. + * + * Based on sysfs: + * sysfs is Copyright (C) 2001, 2002, 2003 Patrick Mochel + * + * Based on kobject.h: + * Copyright (c) 2002-2003 Patrick Mochel + * Copyright (c) 2002-2003 Open Source Development Labs + * + * configfs Copyright (C) 2005 Oracle. All rights reserved. + * + * Added CONFIGFS_EATTR() macros from original configfs.h macros + * Copright (C) 2008-2009 Nicholas A. Bellinger + * + * Please read Documentation/filesystems/configfs.txt before using the + * configfs interface, ESPECIALLY the parts about reference counts and + * item destructors. + */ + +#ifndef _CONFIGFS_MACROS_H_ +#define _CONFIGFS_MACROS_H_ + +#include + +/* + * Users often need to create attribute structures for their configurable + * attributes, containing a configfs_attribute member and function pointers + * for the show() and store() operations on that attribute. If they don't + * need anything else on the extended attribute structure, they can use + * this macro to define it. The argument _name isends up as + * 'struct _name_attribute, as well as names of to CONFIGFS_ATTR_OPS() below. + * The argument _item is the name of the structure containing the + * struct config_item or struct config_group structure members + */ +#define CONFIGFS_EATTR_STRUCT(_name, _item) \ +struct _name##_attribute { \ + struct configfs_attribute attr; \ + ssize_t (*show)(struct _item *, char *); \ + ssize_t (*store)(struct _item *, const char *, size_t); \ +} + +/* + * With the extended attribute structure, users can use this macro + * (similar to sysfs' __ATTR) to make defining attributes easier. + * An example: + * #define MYITEM_EATTR(_name, _mode, _show, _store) \ + * struct myitem_attribute childless_attr_##_name = \ + * __CONFIGFS_EATTR(_name, _mode, _show, _store) + */ +#define __CONFIGFS_EATTR(_name, _mode, _show, _store) \ +{ \ + .attr = { \ + .ca_name = __stringify(_name), \ + .ca_mode = _mode, \ + .ca_owner = THIS_MODULE, \ + }, \ + .show = _show, \ + .store = _store, \ +} +/* Here is a readonly version, only requiring a show() operation */ +#define __CONFIGFS_EATTR_RO(_name, _show) \ +{ \ + .attr = { \ + .ca_name = __stringify(_name), \ + .ca_mode = 0444, \ + .ca_owner = THIS_MODULE, \ + }, \ + .show = _show, \ +} + +/* + * With these extended attributes, the simple show_attribute() and + * store_attribute() operations need to call the show() and store() of the + * attributes. This is a common pattern, so we provide a macro to define + * them. The argument _name is the name of the attribute defined by + * CONFIGFS_ATTR_STRUCT(). The argument _item is the name of the structure + * containing the struct config_item or struct config_group structure member. + * The argument _item_member is the actual name of the struct config_* struct + * in your _item structure. Meaning my_structure->some_config_group. + * ^^_item^^^^^ ^^_item_member^^^ + * This macro expects the attributes to be named "struct _attribute". + */ +#define CONFIGFS_EATTR_OPS_TO_FUNC(_name, _item, _item_member) \ +static struct _item *to_##_name(struct config_item *ci) \ +{ \ + return (ci) ? container_of(to_config_group(ci), struct _item, \ + _item_member) : NULL; \ +} + +#define CONFIGFS_EATTR_OPS_SHOW(_name, _item) \ +static ssize_t _name##_attr_show(struct config_item *item, \ + struct configfs_attribute *attr, \ + char *page) \ +{ \ + struct _item *_item = to_##_name(item); \ + struct _name##_attribute * _name##_attr = \ + container_of(attr, struct _name##_attribute, attr); \ + ssize_t ret = 0; \ + \ + if (_name##_attr->show) \ + ret = _name##_attr->show(_item, page); \ + return ret; \ +} + +#define CONFIGFS_EATTR_OPS_STORE(_name, _item) \ +static ssize_t _name##_attr_store(struct config_item *item, \ + struct configfs_attribute *attr, \ + const char *page, size_t count) \ +{ \ + struct _item *_item = to_##_name(item); \ + struct _name##_attribute * _name##_attr = \ + container_of(attr, struct _name##_attribute, attr); \ + ssize_t ret = -EINVAL; \ + \ + if (_name##_attr->store) \ + ret = _name##_attr->store(_item, page, count); \ + return ret; \ +} + +#define CONFIGFS_EATTR_OPS(_name, _item, _item_member) \ + CONFIGFS_EATTR_OPS_TO_FUNC(_name, _item, _item_member); \ + CONFIGFS_EATTR_OPS_SHOW(_name, _item); \ + CONFIGFS_EATTR_OPS_STORE(_name, _item); + +#define CONFIGFS_EATTR_OPS_RO(_name, _item, _item_member) \ + CONFIGFS_EATTR_OPS_TO_FUNC(_name, _item, _item_member); \ + CONFIGFS_EATTR_OPS_SHOW(_name, _item); + +#endif /* _CONFIGFS_MACROS_H_ */ diff --git a/include/target/target_core_base.h b/include/target/target_core_base.h new file mode 100644 index 000000000000..07fdfb6b9a9a --- /dev/null +++ b/include/target/target_core_base.h @@ -0,0 +1,937 @@ +#ifndef TARGET_CORE_BASE_H +#define TARGET_CORE_BASE_H + +#include +#include +#include +#include +#include +#include +#include +#include "target_core_mib.h" + +#define TARGET_CORE_MOD_VERSION "v4.0.0-rc6" +#define SHUTDOWN_SIGS (sigmask(SIGKILL)|sigmask(SIGINT)|sigmask(SIGABRT)) + +/* Used by transport_generic_allocate_iovecs() */ +#define TRANSPORT_IOV_DATA_BUFFER 5 +/* Maximum Number of LUNs per Target Portal Group */ +#define TRANSPORT_MAX_LUNS_PER_TPG 256 +/* + * By default we use 32-byte CDBs in TCM Core and subsystem plugin code. + * + * Note that both include/scsi/scsi_cmnd.h:MAX_COMMAND_SIZE and + * include/linux/blkdev.h:BLOCK_MAX_CDB as of v2.6.36-rc4 still use + * 16-byte CDBs by default and require an extra allocation for + * 32-byte CDBs to becasue of legacy issues. + * + * Within TCM Core there are no such legacy limitiations, so we go ahead + * use 32-byte CDBs by default and use include/scsi/scsi.h:scsi_command_size() + * within all TCM Core and subsystem plugin code. + */ +#define TCM_MAX_COMMAND_SIZE 32 +/* + * From include/scsi/scsi_cmnd.h:SCSI_SENSE_BUFFERSIZE, currently + * defined 96, but the real limit is 252 (or 260 including the header) + */ +#define TRANSPORT_SENSE_BUFFER SCSI_SENSE_BUFFERSIZE +/* Used by transport_send_check_condition_and_sense() */ +#define SPC_SENSE_KEY_OFFSET 2 +#define SPC_ASC_KEY_OFFSET 12 +#define SPC_ASCQ_KEY_OFFSET 13 +#define TRANSPORT_IQN_LEN 224 +/* Used by target_core_store_alua_lu_gp() and target_core_alua_lu_gp_show_attr_members() */ +#define LU_GROUP_NAME_BUF 256 +/* Used by core_alua_store_tg_pt_gp_info() and target_core_alua_tg_pt_gp_show_attr_members() */ +#define TG_PT_GROUP_NAME_BUF 256 +/* Used to parse VPD into struct t10_vpd */ +#define VPD_TMP_BUF_SIZE 128 +/* Used by transport_generic_cmd_sequencer() */ +#define READ_BLOCK_LEN 6 +#define READ_CAP_LEN 8 +#define READ_POSITION_LEN 20 +#define INQUIRY_LEN 36 +/* Used by transport_get_inquiry_vpd_serial() */ +#define INQUIRY_VPD_SERIAL_LEN 254 +/* Used by transport_get_inquiry_vpd_device_ident() */ +#define INQUIRY_VPD_DEVICE_IDENTIFIER_LEN 254 + +/* struct se_hba->hba_flags */ +enum hba_flags_table { + HBA_FLAGS_INTERNAL_USE = 0x01, + HBA_FLAGS_PSCSI_MODE = 0x02, +}; + +/* struct se_lun->lun_status */ +enum transport_lun_status_table { + TRANSPORT_LUN_STATUS_FREE = 0, + TRANSPORT_LUN_STATUS_ACTIVE = 1, +}; + +/* struct se_portal_group->se_tpg_type */ +enum transport_tpg_type_table { + TRANSPORT_TPG_TYPE_NORMAL = 0, + TRANSPORT_TPG_TYPE_DISCOVERY = 1, +}; + +/* Used for generate timer flags */ +enum timer_flags_table { + TF_RUNNING = 0x01, + TF_STOP = 0x02, +}; + +/* Special transport agnostic struct se_cmd->t_states */ +enum transport_state_table { + TRANSPORT_NO_STATE = 0, + TRANSPORT_NEW_CMD = 1, + TRANSPORT_DEFERRED_CMD = 2, + TRANSPORT_WRITE_PENDING = 3, + TRANSPORT_PROCESS_WRITE = 4, + TRANSPORT_PROCESSING = 5, + TRANSPORT_COMPLETE_OK = 6, + TRANSPORT_COMPLETE_FAILURE = 7, + TRANSPORT_COMPLETE_TIMEOUT = 8, + TRANSPORT_PROCESS_TMR = 9, + TRANSPORT_TMR_COMPLETE = 10, + TRANSPORT_ISTATE_PROCESSING = 11, + TRANSPORT_ISTATE_PROCESSED = 12, + TRANSPORT_KILL = 13, + TRANSPORT_REMOVE = 14, + TRANSPORT_FREE = 15, + TRANSPORT_NEW_CMD_MAP = 16, +}; + +/* Used for struct se_cmd->se_cmd_flags */ +enum se_cmd_flags_table { + SCF_SUPPORTED_SAM_OPCODE = 0x00000001, + SCF_TRANSPORT_TASK_SENSE = 0x00000002, + SCF_EMULATED_TASK_SENSE = 0x00000004, + SCF_SCSI_DATA_SG_IO_CDB = 0x00000008, + SCF_SCSI_CONTROL_SG_IO_CDB = 0x00000010, + SCF_SCSI_CONTROL_NONSG_IO_CDB = 0x00000020, + SCF_SCSI_NON_DATA_CDB = 0x00000040, + SCF_SCSI_CDB_EXCEPTION = 0x00000080, + SCF_SCSI_RESERVATION_CONFLICT = 0x00000100, + SCF_CMD_PASSTHROUGH_NOALLOC = 0x00000200, + SCF_SE_CMD_FAILED = 0x00000400, + SCF_SE_LUN_CMD = 0x00000800, + SCF_SE_ALLOW_EOO = 0x00001000, + SCF_SE_DISABLE_ONLINE_CHECK = 0x00002000, + SCF_SENT_CHECK_CONDITION = 0x00004000, + SCF_OVERFLOW_BIT = 0x00008000, + SCF_UNDERFLOW_BIT = 0x00010000, + SCF_SENT_DELAYED_TAS = 0x00020000, + SCF_ALUA_NON_OPTIMIZED = 0x00040000, + SCF_DELAYED_CMD_FROM_SAM_ATTR = 0x00080000, + SCF_PASSTHROUGH_SG_TO_MEM = 0x00100000, + SCF_PASSTHROUGH_CONTIG_TO_SG = 0x00200000, + SCF_PASSTHROUGH_SG_TO_MEM_NOALLOC = 0x00400000, + SCF_EMULATE_SYNC_CACHE = 0x00800000, + SCF_EMULATE_CDB_ASYNC = 0x01000000, + SCF_EMULATE_SYNC_UNMAP = 0x02000000 +}; + +/* struct se_dev_entry->lun_flags and struct se_lun->lun_access */ +enum transport_lunflags_table { + TRANSPORT_LUNFLAGS_NO_ACCESS = 0x00, + TRANSPORT_LUNFLAGS_INITIATOR_ACCESS = 0x01, + TRANSPORT_LUNFLAGS_READ_ONLY = 0x02, + TRANSPORT_LUNFLAGS_READ_WRITE = 0x04, +}; + +/* struct se_device->dev_status */ +enum transport_device_status_table { + TRANSPORT_DEVICE_ACTIVATED = 0x01, + TRANSPORT_DEVICE_DEACTIVATED = 0x02, + TRANSPORT_DEVICE_QUEUE_FULL = 0x04, + TRANSPORT_DEVICE_SHUTDOWN = 0x08, + TRANSPORT_DEVICE_OFFLINE_ACTIVATED = 0x10, + TRANSPORT_DEVICE_OFFLINE_DEACTIVATED = 0x20, +}; + +/* + * Used by transport_send_check_condition_and_sense() and se_cmd->scsi_sense_reason + * to signal which ASC/ASCQ sense payload should be built. + */ +enum tcm_sense_reason_table { + TCM_NON_EXISTENT_LUN = 0x01, + TCM_UNSUPPORTED_SCSI_OPCODE = 0x02, + TCM_INCORRECT_AMOUNT_OF_DATA = 0x03, + TCM_UNEXPECTED_UNSOLICITED_DATA = 0x04, + TCM_SERVICE_CRC_ERROR = 0x05, + TCM_SNACK_REJECTED = 0x06, + TCM_SECTOR_COUNT_TOO_MANY = 0x07, + TCM_INVALID_CDB_FIELD = 0x08, + TCM_INVALID_PARAMETER_LIST = 0x09, + TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE = 0x0a, + TCM_UNKNOWN_MODE_PAGE = 0x0b, + TCM_WRITE_PROTECTED = 0x0c, + TCM_CHECK_CONDITION_ABORT_CMD = 0x0d, + TCM_CHECK_CONDITION_UNIT_ATTENTION = 0x0e, + TCM_CHECK_CONDITION_NOT_READY = 0x0f, +}; + +struct se_obj { + atomic_t obj_access_count; +} ____cacheline_aligned; + +/* + * Used by TCM Core internally to signal if ALUA emulation is enabled or + * disabled, or running in with TCM/pSCSI passthrough mode + */ +typedef enum { + SPC_ALUA_PASSTHROUGH, + SPC2_ALUA_DISABLED, + SPC3_ALUA_EMULATED +} t10_alua_index_t; + +/* + * Used by TCM Core internally to signal if SAM Task Attribute emulation + * is enabled or disabled, or running in with TCM/pSCSI passthrough mode + */ +typedef enum { + SAM_TASK_ATTR_PASSTHROUGH, + SAM_TASK_ATTR_UNTAGGED, + SAM_TASK_ATTR_EMULATED +} t10_task_attr_index_t; + +struct se_cmd; + +struct t10_alua { + t10_alua_index_t alua_type; + /* ALUA Target Port Group ID */ + u16 alua_tg_pt_gps_counter; + u32 alua_tg_pt_gps_count; + spinlock_t tg_pt_gps_lock; + struct se_subsystem_dev *t10_sub_dev; + /* Used for default ALUA Target Port Group */ + struct t10_alua_tg_pt_gp *default_tg_pt_gp; + /* Used for default ALUA Target Port Group ConfigFS group */ + struct config_group alua_tg_pt_gps_group; + int (*alua_state_check)(struct se_cmd *, unsigned char *, u8 *); + struct list_head tg_pt_gps_list; +} ____cacheline_aligned; + +struct t10_alua_lu_gp { + u16 lu_gp_id; + int lu_gp_valid_id; + u32 lu_gp_members; + atomic_t lu_gp_shutdown; + atomic_t lu_gp_ref_cnt; + spinlock_t lu_gp_lock; + struct config_group lu_gp_group; + struct list_head lu_gp_list; + struct list_head lu_gp_mem_list; +} ____cacheline_aligned; + +struct t10_alua_lu_gp_member { + int lu_gp_assoc:1; + atomic_t lu_gp_mem_ref_cnt; + spinlock_t lu_gp_mem_lock; + struct t10_alua_lu_gp *lu_gp; + struct se_device *lu_gp_mem_dev; + struct list_head lu_gp_mem_list; +} ____cacheline_aligned; + +struct t10_alua_tg_pt_gp { + u16 tg_pt_gp_id; + int tg_pt_gp_valid_id; + int tg_pt_gp_alua_access_status; + int tg_pt_gp_alua_access_type; + int tg_pt_gp_nonop_delay_msecs; + int tg_pt_gp_trans_delay_msecs; + int tg_pt_gp_pref; + int tg_pt_gp_write_metadata; + /* Used by struct t10_alua_tg_pt_gp->tg_pt_gp_md_buf_len */ +#define ALUA_MD_BUF_LEN 1024 + u32 tg_pt_gp_md_buf_len; + u32 tg_pt_gp_members; + atomic_t tg_pt_gp_alua_access_state; + atomic_t tg_pt_gp_ref_cnt; + spinlock_t tg_pt_gp_lock; + struct mutex tg_pt_gp_md_mutex; + struct se_subsystem_dev *tg_pt_gp_su_dev; + struct config_group tg_pt_gp_group; + struct list_head tg_pt_gp_list; + struct list_head tg_pt_gp_mem_list; +} ____cacheline_aligned; + +struct t10_alua_tg_pt_gp_member { + int tg_pt_gp_assoc:1; + atomic_t tg_pt_gp_mem_ref_cnt; + spinlock_t tg_pt_gp_mem_lock; + struct t10_alua_tg_pt_gp *tg_pt_gp; + struct se_port *tg_pt; + struct list_head tg_pt_gp_mem_list; +} ____cacheline_aligned; + +struct t10_vpd { + unsigned char device_identifier[INQUIRY_VPD_DEVICE_IDENTIFIER_LEN]; + int protocol_identifier_set; + u32 protocol_identifier; + u32 device_identifier_code_set; + u32 association; + u32 device_identifier_type; + struct list_head vpd_list; +} ____cacheline_aligned; + +struct t10_wwn { + unsigned char vendor[8]; + unsigned char model[16]; + unsigned char revision[4]; + unsigned char unit_serial[INQUIRY_VPD_SERIAL_LEN]; + spinlock_t t10_vpd_lock; + struct se_subsystem_dev *t10_sub_dev; + struct config_group t10_wwn_group; + struct list_head t10_vpd_list; +} ____cacheline_aligned; + + +/* + * Used by TCM Core internally to signal if >= SPC-3 peristent reservations + * emulation is enabled or disabled, or running in with TCM/pSCSI passthrough + * mode + */ +typedef enum { + SPC_PASSTHROUGH, + SPC2_RESERVATIONS, + SPC3_PERSISTENT_RESERVATIONS +} t10_reservations_index_t; + +struct t10_pr_registration { + /* Used for fabrics that contain WWN+ISID */ +#define PR_REG_ISID_LEN 16 + /* PR_REG_ISID_LEN + ',i,0x' */ +#define PR_REG_ISID_ID_LEN (PR_REG_ISID_LEN + 5) + char pr_reg_isid[PR_REG_ISID_LEN]; + /* Used during APTPL metadata reading */ +#define PR_APTPL_MAX_IPORT_LEN 256 + unsigned char pr_iport[PR_APTPL_MAX_IPORT_LEN]; + /* Used during APTPL metadata reading */ +#define PR_APTPL_MAX_TPORT_LEN 256 + unsigned char pr_tport[PR_APTPL_MAX_TPORT_LEN]; + /* For writing out live meta data */ + unsigned char *pr_aptpl_buf; + u16 pr_aptpl_rpti; + u16 pr_reg_tpgt; + /* Reservation effects all target ports */ + int pr_reg_all_tg_pt; + /* Activate Persistence across Target Power Loss */ + int pr_reg_aptpl; + int pr_res_holder; + int pr_res_type; + int pr_res_scope; + /* Used for fabric initiator WWPNs using a ISID */ + int isid_present_at_reg:1; + u32 pr_res_mapped_lun; + u32 pr_aptpl_target_lun; + u32 pr_res_generation; + u64 pr_reg_bin_isid; + u64 pr_res_key; + atomic_t pr_res_holders; + struct se_node_acl *pr_reg_nacl; + struct se_dev_entry *pr_reg_deve; + struct se_lun *pr_reg_tg_pt_lun; + struct list_head pr_reg_list; + struct list_head pr_reg_abort_list; + struct list_head pr_reg_aptpl_list; + struct list_head pr_reg_atp_list; + struct list_head pr_reg_atp_mem_list; +} ____cacheline_aligned; + +/* + * This set of function pointer ops is set based upon SPC3_PERSISTENT_RESERVATIONS, + * SPC2_RESERVATIONS or SPC_PASSTHROUGH in drivers/target/target_core_pr.c: + * core_setup_reservations() + */ +struct t10_reservation_ops { + int (*t10_reservation_check)(struct se_cmd *, u32 *); + int (*t10_seq_non_holder)(struct se_cmd *, unsigned char *, u32); + int (*t10_pr_register)(struct se_cmd *); + int (*t10_pr_clear)(struct se_cmd *); +}; + +struct t10_reservation_template { + /* Reservation effects all target ports */ + int pr_all_tg_pt; + /* Activate Persistence across Target Power Loss enabled + * for SCSI device */ + int pr_aptpl_active; + /* Used by struct t10_reservation_template->pr_aptpl_buf_len */ +#define PR_APTPL_BUF_LEN 8192 + u32 pr_aptpl_buf_len; + u32 pr_generation; + t10_reservations_index_t res_type; + spinlock_t registration_lock; + spinlock_t aptpl_reg_lock; + /* + * This will always be set by one individual I_T Nexus. + * However with all_tg_pt=1, other I_T Nexus from the + * same initiator can access PR reg/res info on a different + * target port. + * + * There is also the 'All Registrants' case, where there is + * a single *pr_res_holder of the reservation, but all + * registrations are considered reservation holders. + */ + struct se_node_acl *pr_res_holder; + struct list_head registration_list; + struct list_head aptpl_reg_list; + struct t10_reservation_ops pr_ops; +} ____cacheline_aligned; + +struct se_queue_req { + int state; + void *cmd; + struct list_head qr_list; +} ____cacheline_aligned; + +struct se_queue_obj { + atomic_t queue_cnt; + spinlock_t cmd_queue_lock; + struct list_head qobj_list; + wait_queue_head_t thread_wq; +} ____cacheline_aligned; + +/* + * Used one per struct se_cmd to hold all extra struct se_task + * metadata. This structure is setup and allocated in + * drivers/target/target_core_transport.c:__transport_alloc_se_cmd() + */ +struct se_transport_task { + unsigned char *t_task_cdb; + unsigned char __t_task_cdb[TCM_MAX_COMMAND_SIZE]; + unsigned long long t_task_lba; + int t_tasks_failed; + int t_tasks_fua; + int t_tasks_bidi:1; + u32 t_task_cdbs; + u32 t_tasks_check; + u32 t_tasks_no; + u32 t_tasks_sectors; + u32 t_tasks_se_num; + u32 t_tasks_se_bidi_num; + u32 t_tasks_sg_chained_no; + atomic_t t_fe_count; + atomic_t t_se_count; + atomic_t t_task_cdbs_left; + atomic_t t_task_cdbs_ex_left; + atomic_t t_task_cdbs_timeout_left; + atomic_t t_task_cdbs_sent; + atomic_t t_transport_aborted; + atomic_t t_transport_active; + atomic_t t_transport_complete; + atomic_t t_transport_queue_active; + atomic_t t_transport_sent; + atomic_t t_transport_stop; + atomic_t t_transport_timeout; + atomic_t transport_dev_active; + atomic_t transport_lun_active; + atomic_t transport_lun_fe_stop; + atomic_t transport_lun_stop; + spinlock_t t_state_lock; + struct completion t_transport_stop_comp; + struct completion transport_lun_fe_stop_comp; + struct completion transport_lun_stop_comp; + struct scatterlist *t_tasks_sg_chained; + struct scatterlist t_tasks_sg_bounce; + void *t_task_buf; + /* + * Used for pre-registered fabric SGL passthrough WRITE and READ + * with the special SCF_PASSTHROUGH_CONTIG_TO_SG case for TCM_Loop + * and other HW target mode fabric modules. + */ + struct scatterlist *t_task_pt_sgl; + struct list_head *t_mem_list; + /* Used for BIDI READ */ + struct list_head *t_mem_bidi_list; + struct list_head t_task_list; +} ____cacheline_aligned; + +struct se_task { + unsigned char task_sense; + struct scatterlist *task_sg; + struct scatterlist *task_sg_bidi; + u8 task_scsi_status; + u8 task_flags; + int task_error_status; + int task_state_flags; + int task_padded_sg:1; + unsigned long long task_lba; + u32 task_no; + u32 task_sectors; + u32 task_size; + u32 task_sg_num; + u32 task_sg_offset; + enum dma_data_direction task_data_direction; + struct se_cmd *task_se_cmd; + struct se_device *se_dev; + struct completion task_stop_comp; + atomic_t task_active; + atomic_t task_execute_queue; + atomic_t task_timeout; + atomic_t task_sent; + atomic_t task_stop; + atomic_t task_state_active; + struct timer_list task_timer; + struct se_device *se_obj_ptr; + struct list_head t_list; + struct list_head t_execute_list; + struct list_head t_state_list; +} ____cacheline_aligned; + +#define TASK_CMD(task) ((struct se_cmd *)task->task_se_cmd) +#define TASK_DEV(task) ((struct se_device *)task->se_dev) + +struct se_cmd { + /* SAM response code being sent to initiator */ + u8 scsi_status; + u8 scsi_asc; + u8 scsi_ascq; + u8 scsi_sense_reason; + u16 scsi_sense_length; + /* Delay for ALUA Active/NonOptimized state access in milliseconds */ + int alua_nonop_delay; + /* See include/linux/dma-mapping.h */ + enum dma_data_direction data_direction; + /* For SAM Task Attribute */ + int sam_task_attr; + /* Transport protocol dependent state, see transport_state_table */ + enum transport_state_table t_state; + /* Transport protocol dependent state for out of order CmdSNs */ + int deferred_t_state; + /* Transport specific error status */ + int transport_error_status; + /* See se_cmd_flags_table */ + u32 se_cmd_flags; + u32 se_ordered_id; + /* Total size in bytes associated with command */ + u32 data_length; + /* SCSI Presented Data Transfer Length */ + u32 cmd_spdtl; + u32 residual_count; + u32 orig_fe_lun; + /* Persistent Reservation key */ + u64 pr_res_key; + atomic_t transport_sent; + /* Used for sense data */ + void *sense_buffer; + struct list_head se_delayed_list; + struct list_head se_ordered_list; + struct list_head se_lun_list; + struct se_device *se_dev; + struct se_dev_entry *se_deve; + struct se_device *se_obj_ptr; + struct se_device *se_orig_obj_ptr; + struct se_lun *se_lun; + /* Only used for internal passthrough and legacy TCM fabric modules */ + struct se_session *se_sess; + struct se_tmr_req *se_tmr_req; + /* t_task is setup to t_task_backstore in transport_init_se_cmd() */ + struct se_transport_task *t_task; + struct se_transport_task t_task_backstore; + struct target_core_fabric_ops *se_tfo; + int (*transport_emulate_cdb)(struct se_cmd *); + void (*transport_split_cdb)(unsigned long long, u32 *, unsigned char *); + void (*transport_wait_for_tasks)(struct se_cmd *, int, int); + void (*transport_complete_callback)(struct se_cmd *); +} ____cacheline_aligned; + +#define T_TASK(cmd) ((struct se_transport_task *)(cmd->t_task)) +#define CMD_TFO(cmd) ((struct target_core_fabric_ops *)cmd->se_tfo) + +struct se_tmr_req { + /* Task Management function to be preformed */ + u8 function; + /* Task Management response to send */ + u8 response; + int call_transport; + /* Reference to ITT that Task Mgmt should be preformed */ + u32 ref_task_tag; + /* 64-bit encoded SAM LUN from $FABRIC_MOD TMR header */ + u64 ref_task_lun; + void *fabric_tmr_ptr; + struct se_cmd *task_cmd; + struct se_cmd *ref_cmd; + struct se_device *tmr_dev; + struct se_lun *tmr_lun; + struct list_head tmr_list; +} ____cacheline_aligned; + +struct se_ua { + u8 ua_asc; + u8 ua_ascq; + struct se_node_acl *ua_nacl; + struct list_head ua_dev_list; + struct list_head ua_nacl_list; +} ____cacheline_aligned; + +struct se_node_acl { + char initiatorname[TRANSPORT_IQN_LEN]; + /* Used to signal demo mode created ACL, disabled by default */ + int dynamic_node_acl:1; + u32 queue_depth; + u32 acl_index; + u64 num_cmds; + u64 read_bytes; + u64 write_bytes; + spinlock_t stats_lock; + /* Used for PR SPEC_I_PT=1 and REGISTER_AND_MOVE */ + atomic_t acl_pr_ref_count; + /* Used for MIB access */ + atomic_t mib_ref_count; + struct se_dev_entry *device_list; + struct se_session *nacl_sess; + struct se_portal_group *se_tpg; + spinlock_t device_list_lock; + spinlock_t nacl_sess_lock; + struct config_group acl_group; + struct config_group acl_attrib_group; + struct config_group acl_auth_group; + struct config_group acl_param_group; + struct config_group *acl_default_groups[4]; + struct list_head acl_list; + struct list_head acl_sess_list; +} ____cacheline_aligned; + +struct se_session { + /* Used for MIB access */ + atomic_t mib_ref_count; + u64 sess_bin_isid; + struct se_node_acl *se_node_acl; + struct se_portal_group *se_tpg; + void *fabric_sess_ptr; + struct list_head sess_list; + struct list_head sess_acl_list; +} ____cacheline_aligned; + +#define SE_SESS(cmd) ((struct se_session *)(cmd)->se_sess) +#define SE_NODE_ACL(sess) ((struct se_node_acl *)(sess)->se_node_acl) + +struct se_device; +struct se_transform_info; +struct scatterlist; + +struct se_lun_acl { + char initiatorname[TRANSPORT_IQN_LEN]; + u32 mapped_lun; + struct se_node_acl *se_lun_nacl; + struct se_lun *se_lun; + struct list_head lacl_list; + struct config_group se_lun_group; +} ____cacheline_aligned; + +struct se_dev_entry { + int def_pr_registered:1; + /* See transport_lunflags_table */ + u32 lun_flags; + u32 deve_cmds; + u32 mapped_lun; + u32 average_bytes; + u32 last_byte_count; + u32 total_cmds; + u32 total_bytes; + u64 pr_res_key; + u64 creation_time; + u32 attach_count; + u64 read_bytes; + u64 write_bytes; + atomic_t ua_count; + /* Used for PR SPEC_I_PT=1 and REGISTER_AND_MOVE */ + atomic_t pr_ref_count; + struct se_lun_acl *se_lun_acl; + spinlock_t ua_lock; + struct se_lun *se_lun; + struct list_head alua_port_list; + struct list_head ua_list; +} ____cacheline_aligned; + +struct se_dev_limits { + /* Max supported HW queue depth */ + u32 hw_queue_depth; + /* Max supported virtual queue depth */ + u32 queue_depth; + /* From include/linux/blkdev.h for the other HW/SW limits. */ + struct queue_limits limits; +} ____cacheline_aligned; + +struct se_dev_attrib { + int emulate_dpo; + int emulate_fua_write; + int emulate_fua_read; + int emulate_write_cache; + int emulate_ua_intlck_ctrl; + int emulate_tas; + int emulate_tpu; + int emulate_tpws; + int emulate_reservations; + int emulate_alua; + int enforce_pr_isids; + u32 hw_block_size; + u32 block_size; + u32 hw_max_sectors; + u32 max_sectors; + u32 optimal_sectors; + u32 hw_queue_depth; + u32 queue_depth; + u32 task_timeout; + u32 max_unmap_lba_count; + u32 max_unmap_block_desc_count; + u32 unmap_granularity; + u32 unmap_granularity_alignment; + struct se_subsystem_dev *da_sub_dev; + struct config_group da_group; +} ____cacheline_aligned; + +struct se_subsystem_dev { +/* Used for struct se_subsystem_dev-->se_dev_alias, must be less than PAGE_SIZE */ +#define SE_DEV_ALIAS_LEN 512 + unsigned char se_dev_alias[SE_DEV_ALIAS_LEN]; +/* Used for struct se_subsystem_dev->se_dev_udev_path[], must be less than PAGE_SIZE */ +#define SE_UDEV_PATH_LEN 512 + unsigned char se_dev_udev_path[SE_UDEV_PATH_LEN]; + u32 su_dev_flags; + struct se_hba *se_dev_hba; + struct se_device *se_dev_ptr; + struct se_dev_attrib se_dev_attrib; + /* T10 Asymmetric Logical Unit Assignment for Target Ports */ + struct t10_alua t10_alua; + /* T10 Inquiry and VPD WWN Information */ + struct t10_wwn t10_wwn; + /* T10 SPC-2 + SPC-3 Reservations */ + struct t10_reservation_template t10_reservation; + spinlock_t se_dev_lock; + void *se_dev_su_ptr; + struct list_head g_se_dev_list; + struct config_group se_dev_group; + /* For T10 Reservations */ + struct config_group se_dev_pr_group; +} ____cacheline_aligned; + +#define T10_ALUA(su_dev) (&(su_dev)->t10_alua) +#define T10_RES(su_dev) (&(su_dev)->t10_reservation) +#define T10_PR_OPS(su_dev) (&(su_dev)->t10_reservation.pr_ops) + +struct se_device { + /* Set to 1 if thread is NOT sleeping on thread_sem */ + u8 thread_active; + u8 dev_status_timer_flags; + /* RELATIVE TARGET PORT IDENTIFER Counter */ + u16 dev_rpti_counter; + /* Used for SAM Task Attribute ordering */ + u32 dev_cur_ordered_id; + u32 dev_flags; + u32 dev_port_count; + /* See transport_device_status_table */ + u32 dev_status; + u32 dev_tcq_window_closed; + /* Physical device queue depth */ + u32 queue_depth; + /* Used for SPC-2 reservations enforce of ISIDs */ + u64 dev_res_bin_isid; + t10_task_attr_index_t dev_task_attr_type; + /* Pointer to transport specific device structure */ + void *dev_ptr; + u32 dev_index; + u64 creation_time; + u32 num_resets; + u64 num_cmds; + u64 read_bytes; + u64 write_bytes; + spinlock_t stats_lock; + /* Active commands on this virtual SE device */ + atomic_t active_cmds; + atomic_t simple_cmds; + atomic_t depth_left; + atomic_t dev_ordered_id; + atomic_t dev_tur_active; + atomic_t execute_tasks; + atomic_t dev_status_thr_count; + atomic_t dev_hoq_count; + atomic_t dev_ordered_sync; + struct se_obj dev_obj; + struct se_obj dev_access_obj; + struct se_obj dev_export_obj; + struct se_queue_obj *dev_queue_obj; + struct se_queue_obj *dev_status_queue_obj; + spinlock_t delayed_cmd_lock; + spinlock_t ordered_cmd_lock; + spinlock_t execute_task_lock; + spinlock_t state_task_lock; + spinlock_t dev_alua_lock; + spinlock_t dev_reservation_lock; + spinlock_t dev_state_lock; + spinlock_t dev_status_lock; + spinlock_t dev_status_thr_lock; + spinlock_t se_port_lock; + spinlock_t se_tmr_lock; + /* Used for legacy SPC-2 reservationsa */ + struct se_node_acl *dev_reserved_node_acl; + /* Used for ALUA Logical Unit Group membership */ + struct t10_alua_lu_gp_member *dev_alua_lu_gp_mem; + /* Used for SPC-3 Persistent Reservations */ + struct t10_pr_registration *dev_pr_res_holder; + struct list_head dev_sep_list; + struct list_head dev_tmr_list; + struct timer_list dev_status_timer; + /* Pointer to descriptor for processing thread */ + struct task_struct *process_thread; + pid_t process_thread_pid; + struct task_struct *dev_mgmt_thread; + struct list_head delayed_cmd_list; + struct list_head ordered_cmd_list; + struct list_head execute_task_list; + struct list_head state_task_list; + /* Pointer to associated SE HBA */ + struct se_hba *se_hba; + struct se_subsystem_dev *se_sub_dev; + /* Pointer to template of function pointers for transport */ + struct se_subsystem_api *transport; + /* Linked list for struct se_hba struct se_device list */ + struct list_head dev_list; + /* Linked list for struct se_global->g_se_dev_list */ + struct list_head g_se_dev_list; +} ____cacheline_aligned; + +#define SE_DEV(cmd) ((struct se_device *)(cmd)->se_lun->lun_se_dev) +#define SU_DEV(dev) ((struct se_subsystem_dev *)(dev)->se_sub_dev) +#define DEV_ATTRIB(dev) (&(dev)->se_sub_dev->se_dev_attrib) +#define DEV_T10_WWN(dev) (&(dev)->se_sub_dev->t10_wwn) + +struct se_hba { + u16 hba_tpgt; + u32 hba_id; + /* See hba_flags_table */ + u32 hba_flags; + /* Virtual iSCSI devices attached. */ + u32 dev_count; + u32 hba_index; + atomic_t dev_mib_access_count; + atomic_t load_balance_queue; + atomic_t left_queue_depth; + /* Maximum queue depth the HBA can handle. */ + atomic_t max_queue_depth; + /* Pointer to transport specific host structure. */ + void *hba_ptr; + /* Linked list for struct se_device */ + struct list_head hba_dev_list; + struct list_head hba_list; + spinlock_t device_lock; + spinlock_t hba_queue_lock; + struct config_group hba_group; + struct mutex hba_access_mutex; + struct se_subsystem_api *transport; +} ____cacheline_aligned; + +#define SE_HBA(d) ((struct se_hba *)(d)->se_hba) + +struct se_lun { + /* See transport_lun_status_table */ + enum transport_lun_status_table lun_status; + u32 lun_access; + u32 lun_flags; + u32 unpacked_lun; + atomic_t lun_acl_count; + spinlock_t lun_acl_lock; + spinlock_t lun_cmd_lock; + spinlock_t lun_sep_lock; + struct completion lun_shutdown_comp; + struct list_head lun_cmd_list; + struct list_head lun_acl_list; + struct se_device *lun_se_dev; + struct config_group lun_group; + struct se_port *lun_sep; +} ____cacheline_aligned; + +#define SE_LUN(c) ((struct se_lun *)(c)->se_lun) + +struct se_port { + /* RELATIVE TARGET PORT IDENTIFER */ + u16 sep_rtpi; + int sep_tg_pt_secondary_stat; + int sep_tg_pt_secondary_write_md; + u32 sep_index; + struct scsi_port_stats sep_stats; + /* Used for ALUA Target Port Groups membership */ + atomic_t sep_tg_pt_gp_active; + atomic_t sep_tg_pt_secondary_offline; + /* Used for PR ALL_TG_PT=1 */ + atomic_t sep_tg_pt_ref_cnt; + spinlock_t sep_alua_lock; + struct mutex sep_tg_pt_md_mutex; + struct t10_alua_tg_pt_gp_member *sep_alua_tg_pt_gp_mem; + struct se_lun *sep_lun; + struct se_portal_group *sep_tpg; + struct list_head sep_alua_list; + struct list_head sep_list; +} ____cacheline_aligned; + +struct se_tpg_np { + struct config_group tpg_np_group; +} ____cacheline_aligned; + +struct se_portal_group { + /* Type of target portal group, see transport_tpg_type_table */ + enum transport_tpg_type_table se_tpg_type; + /* Number of ACLed Initiator Nodes for this TPG */ + u32 num_node_acls; + /* Used for PR SPEC_I_PT=1 and REGISTER_AND_MOVE */ + atomic_t tpg_pr_ref_count; + /* Spinlock for adding/removing ACLed Nodes */ + spinlock_t acl_node_lock; + /* Spinlock for adding/removing sessions */ + spinlock_t session_lock; + spinlock_t tpg_lun_lock; + /* Pointer to $FABRIC_MOD portal group */ + void *se_tpg_fabric_ptr; + struct list_head se_tpg_list; + /* linked list for initiator ACL list */ + struct list_head acl_node_list; + struct se_lun *tpg_lun_list; + struct se_lun tpg_virt_lun0; + /* List of TCM sessions assoicated wth this TPG */ + struct list_head tpg_sess_list; + /* Pointer to $FABRIC_MOD dependent code */ + struct target_core_fabric_ops *se_tpg_tfo; + struct se_wwn *se_tpg_wwn; + struct config_group tpg_group; + struct config_group *tpg_default_groups[6]; + struct config_group tpg_lun_group; + struct config_group tpg_np_group; + struct config_group tpg_acl_group; + struct config_group tpg_attrib_group; + struct config_group tpg_param_group; +} ____cacheline_aligned; + +#define TPG_TFO(se_tpg) ((struct target_core_fabric_ops *)(se_tpg)->se_tpg_tfo) + +struct se_wwn { + struct target_fabric_configfs *wwn_tf; + struct config_group wwn_group; +} ____cacheline_aligned; + +struct se_global { + u16 alua_lu_gps_counter; + int g_sub_api_initialized; + u32 in_shutdown; + u32 alua_lu_gps_count; + u32 g_hba_id_counter; + struct config_group target_core_hbagroup; + struct config_group alua_group; + struct config_group alua_lu_gps_group; + struct list_head g_lu_gps_list; + struct list_head g_se_tpg_list; + struct list_head g_hba_list; + struct list_head g_se_dev_list; + struct se_hba *g_lun0_hba; + struct se_subsystem_dev *g_lun0_su_dev; + struct se_device *g_lun0_dev; + struct t10_alua_lu_gp *default_lu_gp; + spinlock_t g_device_lock; + spinlock_t hba_lock; + spinlock_t se_tpg_lock; + spinlock_t lu_gps_lock; + spinlock_t plugin_class_lock; +} ____cacheline_aligned; + +#endif /* TARGET_CORE_BASE_H */ diff --git a/include/target/target_core_configfs.h b/include/target/target_core_configfs.h new file mode 100644 index 000000000000..40e6e740527c --- /dev/null +++ b/include/target/target_core_configfs.h @@ -0,0 +1,52 @@ +#define TARGET_CORE_CONFIGFS_VERSION TARGET_CORE_MOD_VERSION + +#define TARGET_CORE_CONFIG_ROOT "/sys/kernel/config" + +#define TARGET_CORE_NAME_MAX_LEN 64 +#define TARGET_FABRIC_NAME_SIZE 32 + +extern struct target_fabric_configfs *target_fabric_configfs_init( + struct module *, const char *); +extern void target_fabric_configfs_free(struct target_fabric_configfs *); +extern int target_fabric_configfs_register(struct target_fabric_configfs *); +extern void target_fabric_configfs_deregister(struct target_fabric_configfs *); + +struct target_fabric_configfs_template { + struct config_item_type tfc_discovery_cit; + struct config_item_type tfc_wwn_cit; + struct config_item_type tfc_tpg_cit; + struct config_item_type tfc_tpg_base_cit; + struct config_item_type tfc_tpg_lun_cit; + struct config_item_type tfc_tpg_port_cit; + struct config_item_type tfc_tpg_np_cit; + struct config_item_type tfc_tpg_np_base_cit; + struct config_item_type tfc_tpg_attrib_cit; + struct config_item_type tfc_tpg_param_cit; + struct config_item_type tfc_tpg_nacl_cit; + struct config_item_type tfc_tpg_nacl_base_cit; + struct config_item_type tfc_tpg_nacl_attrib_cit; + struct config_item_type tfc_tpg_nacl_auth_cit; + struct config_item_type tfc_tpg_nacl_param_cit; + struct config_item_type tfc_tpg_mappedlun_cit; +}; + +struct target_fabric_configfs { + char tf_name[TARGET_FABRIC_NAME_SIZE]; + atomic_t tf_access_cnt; + struct list_head tf_list; + struct config_group tf_group; + struct config_group tf_disc_group; + struct config_group *tf_default_groups[2]; + /* Pointer to fabric's config_item */ + struct config_item *tf_fabric; + /* Passed from fabric modules */ + struct config_item_type *tf_fabric_cit; + /* Pointer to target core subsystem */ + struct configfs_subsystem *tf_subsys; + /* Pointer to fabric's struct module */ + struct module *tf_module; + struct target_core_fabric_ops tf_ops; + struct target_fabric_configfs_template tf_cit_tmpl; +}; + +#define TF_CIT_TMPL(tf) (&(tf)->tf_cit_tmpl) diff --git a/include/target/target_core_device.h b/include/target/target_core_device.h new file mode 100644 index 000000000000..52b18a5752c9 --- /dev/null +++ b/include/target/target_core_device.h @@ -0,0 +1,61 @@ +#ifndef TARGET_CORE_DEVICE_H +#define TARGET_CORE_DEVICE_H + +extern int transport_get_lun_for_cmd(struct se_cmd *, unsigned char *, u32); +extern int transport_get_lun_for_tmr(struct se_cmd *, u32); +extern struct se_dev_entry *core_get_se_deve_from_rtpi( + struct se_node_acl *, u16); +extern int core_free_device_list_for_node(struct se_node_acl *, + struct se_portal_group *); +extern void core_dec_lacl_count(struct se_node_acl *, struct se_cmd *); +extern void core_update_device_list_access(u32, u32, struct se_node_acl *); +extern int core_update_device_list_for_node(struct se_lun *, struct se_lun_acl *, u32, + u32, struct se_node_acl *, + struct se_portal_group *, int); +extern void core_clear_lun_from_tpg(struct se_lun *, struct se_portal_group *); +extern int core_dev_export(struct se_device *, struct se_portal_group *, + struct se_lun *); +extern void core_dev_unexport(struct se_device *, struct se_portal_group *, + struct se_lun *); +extern int transport_core_report_lun_response(struct se_cmd *); +extern void se_release_device_for_hba(struct se_device *); +extern void se_release_vpd_for_dev(struct se_device *); +extern void se_clear_dev_ports(struct se_device *); +extern int se_free_virtual_device(struct se_device *, struct se_hba *); +extern int se_dev_check_online(struct se_device *); +extern int se_dev_check_shutdown(struct se_device *); +extern void se_dev_set_default_attribs(struct se_device *, struct se_dev_limits *); +extern int se_dev_set_task_timeout(struct se_device *, u32); +extern int se_dev_set_max_unmap_lba_count(struct se_device *, u32); +extern int se_dev_set_max_unmap_block_desc_count(struct se_device *, u32); +extern int se_dev_set_unmap_granularity(struct se_device *, u32); +extern int se_dev_set_unmap_granularity_alignment(struct se_device *, u32); +extern int se_dev_set_emulate_dpo(struct se_device *, int); +extern int se_dev_set_emulate_fua_write(struct se_device *, int); +extern int se_dev_set_emulate_fua_read(struct se_device *, int); +extern int se_dev_set_emulate_write_cache(struct se_device *, int); +extern int se_dev_set_emulate_ua_intlck_ctrl(struct se_device *, int); +extern int se_dev_set_emulate_tas(struct se_device *, int); +extern int se_dev_set_emulate_tpu(struct se_device *, int); +extern int se_dev_set_emulate_tpws(struct se_device *, int); +extern int se_dev_set_enforce_pr_isids(struct se_device *, int); +extern int se_dev_set_queue_depth(struct se_device *, u32); +extern int se_dev_set_max_sectors(struct se_device *, u32); +extern int se_dev_set_optimal_sectors(struct se_device *, u32); +extern int se_dev_set_block_size(struct se_device *, u32); +extern struct se_lun *core_dev_add_lun(struct se_portal_group *, struct se_hba *, + struct se_device *, u32); +extern int core_dev_del_lun(struct se_portal_group *, u32); +extern struct se_lun *core_get_lun_from_tpg(struct se_portal_group *, u32); +extern struct se_lun_acl *core_dev_init_initiator_node_lun_acl(struct se_portal_group *, + u32, char *, int *); +extern int core_dev_add_initiator_node_lun_acl(struct se_portal_group *, + struct se_lun_acl *, u32, u32); +extern int core_dev_del_initiator_node_lun_acl(struct se_portal_group *, + struct se_lun *, struct se_lun_acl *); +extern void core_dev_free_initiator_node_lun_acl(struct se_portal_group *, + struct se_lun_acl *lacl); +extern int core_dev_setup_virtual_lun0(void); +extern void core_dev_release_virtual_lun0(void); + +#endif /* TARGET_CORE_DEVICE_H */ diff --git a/include/target/target_core_fabric_configfs.h b/include/target/target_core_fabric_configfs.h new file mode 100644 index 000000000000..a26fb7586a09 --- /dev/null +++ b/include/target/target_core_fabric_configfs.h @@ -0,0 +1,106 @@ +/* + * Used for tfc_wwn_cit attributes + */ + +#include + +CONFIGFS_EATTR_STRUCT(target_fabric_nacl_attrib, se_node_acl); +#define TF_NACL_ATTRIB_ATTR(_fabric, _name, _mode) \ +static struct target_fabric_nacl_attrib_attribute _fabric##_nacl_attrib_##_name = \ + __CONFIGFS_EATTR(_name, _mode, \ + _fabric##_nacl_attrib_show_##_name, \ + _fabric##_nacl_attrib_store_##_name); + +CONFIGFS_EATTR_STRUCT(target_fabric_nacl_auth, se_node_acl); +#define TF_NACL_AUTH_ATTR(_fabric, _name, _mode) \ +static struct target_fabric_nacl_auth_attribute _fabric##_nacl_auth_##_name = \ + __CONFIGFS_EATTR(_name, _mode, \ + _fabric##_nacl_auth_show_##_name, \ + _fabric##_nacl_auth_store_##_name); + +#define TF_NACL_AUTH_ATTR_RO(_fabric, _name) \ +static struct target_fabric_nacl_auth_attribute _fabric##_nacl_auth_##_name = \ + __CONFIGFS_EATTR_RO(_name, \ + _fabric##_nacl_auth_show_##_name); + +CONFIGFS_EATTR_STRUCT(target_fabric_nacl_param, se_node_acl); +#define TF_NACL_PARAM_ATTR(_fabric, _name, _mode) \ +static struct target_fabric_nacl_param_attribute _fabric##_nacl_param_##_name = \ + __CONFIGFS_EATTR(_name, _mode, \ + _fabric##_nacl_param_show_##_name, \ + _fabric##_nacl_param_store_##_name); + +#define TF_NACL_PARAM_ATTR_RO(_fabric, _name) \ +static struct target_fabric_nacl_param_attribute _fabric##_nacl_param_##_name = \ + __CONFIGFS_EATTR_RO(_name, \ + _fabric##_nacl_param_show_##_name); + + +CONFIGFS_EATTR_STRUCT(target_fabric_nacl_base, se_node_acl); +#define TF_NACL_BASE_ATTR(_fabric, _name, _mode) \ +static struct target_fabric_nacl_base_attribute _fabric##_nacl_##_name = \ + __CONFIGFS_EATTR(_name, _mode, \ + _fabric##_nacl_show_##_name, \ + _fabric##_nacl_store_##_name); + +#define TF_NACL_BASE_ATTR_RO(_fabric, _name) \ +static struct target_fabric_nacl_base_attribute _fabric##_nacl_##_name = \ + __CONFIGFS_EATTR_RO(_name, \ + _fabric##_nacl_show_##_name); + +CONFIGFS_EATTR_STRUCT(target_fabric_np_base, se_tpg_np); +#define TF_NP_BASE_ATTR(_fabric, _name, _mode) \ +static struct target_fabric_np_base_attribute _fabric##_np_##_name = \ + __CONFIGFS_EATTR(_name, _mode, \ + _fabric##_np_show_##_name, \ + _fabric##_np_store_##_name); + +CONFIGFS_EATTR_STRUCT(target_fabric_tpg_attrib, se_portal_group); +#define TF_TPG_ATTRIB_ATTR(_fabric, _name, _mode) \ +static struct target_fabric_tpg_attrib_attribute _fabric##_tpg_attrib_##_name = \ + __CONFIGFS_EATTR(_name, _mode, \ + _fabric##_tpg_attrib_show_##_name, \ + _fabric##_tpg_attrib_store_##_name); + + +CONFIGFS_EATTR_STRUCT(target_fabric_tpg_param, se_portal_group); +#define TF_TPG_PARAM_ATTR(_fabric, _name, _mode) \ +static struct target_fabric_tpg_param_attribute _fabric##_tpg_param_##_name = \ + __CONFIGFS_EATTR(_name, _mode, \ + _fabric##_tpg_param_show_##_name, \ + _fabric##_tpg_param_store_##_name); + + +CONFIGFS_EATTR_STRUCT(target_fabric_tpg, se_portal_group); +#define TF_TPG_BASE_ATTR(_fabric, _name, _mode) \ +static struct target_fabric_tpg_attribute _fabric##_tpg_##_name = \ + __CONFIGFS_EATTR(_name, _mode, \ + _fabric##_tpg_show_##_name, \ + _fabric##_tpg_store_##_name); + + +CONFIGFS_EATTR_STRUCT(target_fabric_wwn, target_fabric_configfs); +#define TF_WWN_ATTR(_fabric, _name, _mode) \ +static struct target_fabric_wwn_attribute _fabric##_wwn_##_name = \ + __CONFIGFS_EATTR(_name, _mode, \ + _fabric##_wwn_show_attr_##_name, \ + _fabric##_wwn_store_attr_##_name); + +#define TF_WWN_ATTR_RO(_fabric, _name) \ +static struct target_fabric_wwn_attribute _fabric##_wwn_##_name = \ + __CONFIGFS_EATTR_RO(_name, \ + _fabric##_wwn_show_attr_##_name); + +CONFIGFS_EATTR_STRUCT(target_fabric_discovery, target_fabric_configfs); +#define TF_DISC_ATTR(_fabric, _name, _mode) \ +static struct target_fabric_discovery_attribute _fabric##_disc_##_name = \ + __CONFIGFS_EATTR(_name, _mode, \ + _fabric##_disc_show_##_name, \ + _fabric##_disc_store_##_name); + +#define TF_DISC_ATTR_RO(_fabric, _name) \ +static struct target_fabric_discovery_attribute _fabric##_disc_##_name = \ + __CONFIGFS_EATTR_RO(_name, \ + _fabric##_disc_show_##_name); + +extern int target_fabric_setup_cits(struct target_fabric_configfs *); diff --git a/include/target/target_core_fabric_lib.h b/include/target/target_core_fabric_lib.h new file mode 100644 index 000000000000..c2f8d0e3a03b --- /dev/null +++ b/include/target/target_core_fabric_lib.h @@ -0,0 +1,28 @@ +#ifndef TARGET_CORE_FABRIC_LIB_H +#define TARGET_CORE_FABRIC_LIB_H + +extern u8 sas_get_fabric_proto_ident(struct se_portal_group *); +extern u32 sas_get_pr_transport_id(struct se_portal_group *, struct se_node_acl *, + struct t10_pr_registration *, int *, unsigned char *); +extern u32 sas_get_pr_transport_id_len(struct se_portal_group *, struct se_node_acl *, + struct t10_pr_registration *, int *); +extern char *sas_parse_pr_out_transport_id(struct se_portal_group *, + const char *, u32 *, char **); + +extern u8 fc_get_fabric_proto_ident(struct se_portal_group *); +extern u32 fc_get_pr_transport_id(struct se_portal_group *, struct se_node_acl *, + struct t10_pr_registration *, int *, unsigned char *); +extern u32 fc_get_pr_transport_id_len(struct se_portal_group *, struct se_node_acl *, + struct t10_pr_registration *, int *); +extern char *fc_parse_pr_out_transport_id(struct se_portal_group *, + const char *, u32 *, char **); + +extern u8 iscsi_get_fabric_proto_ident(struct se_portal_group *); +extern u32 iscsi_get_pr_transport_id(struct se_portal_group *, struct se_node_acl *, + struct t10_pr_registration *, int *, unsigned char *); +extern u32 iscsi_get_pr_transport_id_len(struct se_portal_group *, struct se_node_acl *, + struct t10_pr_registration *, int *); +extern char *iscsi_parse_pr_out_transport_id(struct se_portal_group *, + const char *, u32 *, char **); + +#endif /* TARGET_CORE_FABRIC_LIB_H */ diff --git a/include/target/target_core_fabric_ops.h b/include/target/target_core_fabric_ops.h new file mode 100644 index 000000000000..f3ac12b019c2 --- /dev/null +++ b/include/target/target_core_fabric_ops.h @@ -0,0 +1,100 @@ +/* Defined in target_core_configfs.h */ +struct target_fabric_configfs; + +struct target_core_fabric_ops { + struct configfs_subsystem *tf_subsys; + /* + * Optional to signal struct se_task->task_sg[] padding entries + * for scatterlist chaining using transport_do_task_sg_link(), + * disabled by default + */ + int task_sg_chaining:1; + char *(*get_fabric_name)(void); + u8 (*get_fabric_proto_ident)(struct se_portal_group *); + char *(*tpg_get_wwn)(struct se_portal_group *); + u16 (*tpg_get_tag)(struct se_portal_group *); + u32 (*tpg_get_default_depth)(struct se_portal_group *); + u32 (*tpg_get_pr_transport_id)(struct se_portal_group *, + struct se_node_acl *, + struct t10_pr_registration *, int *, + unsigned char *); + u32 (*tpg_get_pr_transport_id_len)(struct se_portal_group *, + struct se_node_acl *, + struct t10_pr_registration *, int *); + char *(*tpg_parse_pr_out_transport_id)(struct se_portal_group *, + const char *, u32 *, char **); + int (*tpg_check_demo_mode)(struct se_portal_group *); + int (*tpg_check_demo_mode_cache)(struct se_portal_group *); + int (*tpg_check_demo_mode_write_protect)(struct se_portal_group *); + int (*tpg_check_prod_mode_write_protect)(struct se_portal_group *); + struct se_node_acl *(*tpg_alloc_fabric_acl)( + struct se_portal_group *); + void (*tpg_release_fabric_acl)(struct se_portal_group *, + struct se_node_acl *); + u32 (*tpg_get_inst_index)(struct se_portal_group *); + /* + * Optional function pointer for TCM to perform command map + * from TCM processing thread context, for those struct se_cmd + * initally allocated in interrupt context. + */ + int (*new_cmd_map)(struct se_cmd *); + /* + * Optional function pointer for TCM fabric modules that use + * Linux/NET sockets to allocate struct iovec array to struct se_cmd + */ + int (*alloc_cmd_iovecs)(struct se_cmd *); + /* + * Optional to release struct se_cmd and fabric dependent allocated + * I/O descriptor in transport_cmd_check_stop() + */ + void (*check_stop_free)(struct se_cmd *); + void (*release_cmd_to_pool)(struct se_cmd *); + void (*release_cmd_direct)(struct se_cmd *); + /* + * Called with spin_lock_bh(struct se_portal_group->session_lock held. + */ + int (*shutdown_session)(struct se_session *); + void (*close_session)(struct se_session *); + void (*stop_session)(struct se_session *, int, int); + void (*fall_back_to_erl0)(struct se_session *); + int (*sess_logged_in)(struct se_session *); + u32 (*sess_get_index)(struct se_session *); + /* + * Used only for SCSI fabrics that contain multi-value TransportIDs + * (like iSCSI). All other SCSI fabrics should set this to NULL. + */ + u32 (*sess_get_initiator_sid)(struct se_session *, + unsigned char *, u32); + int (*write_pending)(struct se_cmd *); + int (*write_pending_status)(struct se_cmd *); + void (*set_default_node_attributes)(struct se_node_acl *); + u32 (*get_task_tag)(struct se_cmd *); + int (*get_cmd_state)(struct se_cmd *); + void (*new_cmd_failure)(struct se_cmd *); + int (*queue_data_in)(struct se_cmd *); + int (*queue_status)(struct se_cmd *); + int (*queue_tm_rsp)(struct se_cmd *); + u16 (*set_fabric_sense_len)(struct se_cmd *, u32); + u16 (*get_fabric_sense_len)(void); + int (*is_state_remove)(struct se_cmd *); + u64 (*pack_lun)(unsigned int); + /* + * fabric module calls for target_core_fabric_configfs.c + */ + struct se_wwn *(*fabric_make_wwn)(struct target_fabric_configfs *, + struct config_group *, const char *); + void (*fabric_drop_wwn)(struct se_wwn *); + struct se_portal_group *(*fabric_make_tpg)(struct se_wwn *, + struct config_group *, const char *); + void (*fabric_drop_tpg)(struct se_portal_group *); + int (*fabric_post_link)(struct se_portal_group *, + struct se_lun *); + void (*fabric_pre_unlink)(struct se_portal_group *, + struct se_lun *); + struct se_tpg_np *(*fabric_make_np)(struct se_portal_group *, + struct config_group *, const char *); + void (*fabric_drop_np)(struct se_tpg_np *); + struct se_node_acl *(*fabric_make_nodeacl)(struct se_portal_group *, + struct config_group *, const char *); + void (*fabric_drop_nodeacl)(struct se_node_acl *); +}; diff --git a/include/target/target_core_tmr.h b/include/target/target_core_tmr.h new file mode 100644 index 000000000000..6c8248bc2c66 --- /dev/null +++ b/include/target/target_core_tmr.h @@ -0,0 +1,43 @@ +#ifndef TARGET_CORE_TMR_H +#define TARGET_CORE_TMR_H + +/* task management function values */ +#ifdef ABORT_TASK +#undef ABORT_TASK +#endif /* ABORT_TASK */ +#define ABORT_TASK 1 +#ifdef ABORT_TASK_SET +#undef ABORT_TASK_SET +#endif /* ABORT_TASK_SET */ +#define ABORT_TASK_SET 2 +#ifdef CLEAR_ACA +#undef CLEAR_ACA +#endif /* CLEAR_ACA */ +#define CLEAR_ACA 3 +#ifdef CLEAR_TASK_SET +#undef CLEAR_TASK_SET +#endif /* CLEAR_TASK_SET */ +#define CLEAR_TASK_SET 4 +#define LUN_RESET 5 +#define TARGET_WARM_RESET 6 +#define TARGET_COLD_RESET 7 +#define TASK_REASSIGN 8 + +/* task management response values */ +#define TMR_FUNCTION_COMPLETE 0 +#define TMR_TASK_DOES_NOT_EXIST 1 +#define TMR_LUN_DOES_NOT_EXIST 2 +#define TMR_TASK_STILL_ALLEGIANT 3 +#define TMR_TASK_FAILOVER_NOT_SUPPORTED 4 +#define TMR_TASK_MGMT_FUNCTION_NOT_SUPPORTED 5 +#define TMR_FUNCTION_AUTHORIZATION_FAILED 6 +#define TMR_FUNCTION_REJECTED 255 + +extern struct kmem_cache *se_tmr_req_cache; + +extern struct se_tmr_req *core_tmr_alloc_req(struct se_cmd *, void *, u8); +extern void core_tmr_release_req(struct se_tmr_req *); +extern int core_tmr_lun_reset(struct se_device *, struct se_tmr_req *, + struct list_head *, struct se_cmd *); + +#endif /* TARGET_CORE_TMR_H */ diff --git a/include/target/target_core_tpg.h b/include/target/target_core_tpg.h new file mode 100644 index 000000000000..77e18729c4c1 --- /dev/null +++ b/include/target/target_core_tpg.h @@ -0,0 +1,35 @@ +#ifndef TARGET_CORE_TPG_H +#define TARGET_CORE_TPG_H + +extern struct se_node_acl *__core_tpg_get_initiator_node_acl(struct se_portal_group *tpg, + const char *); +extern struct se_node_acl *core_tpg_get_initiator_node_acl(struct se_portal_group *tpg, + unsigned char *); +extern void core_tpg_add_node_to_devs(struct se_node_acl *, + struct se_portal_group *); +extern struct se_node_acl *core_tpg_check_initiator_node_acl( + struct se_portal_group *, + unsigned char *); +extern void core_tpg_wait_for_nacl_pr_ref(struct se_node_acl *); +extern void core_tpg_wait_for_mib_ref(struct se_node_acl *); +extern void core_tpg_clear_object_luns(struct se_portal_group *); +extern struct se_node_acl *core_tpg_add_initiator_node_acl( + struct se_portal_group *, + struct se_node_acl *, + const char *, u32); +extern int core_tpg_del_initiator_node_acl(struct se_portal_group *, + struct se_node_acl *, int); +extern int core_tpg_set_initiator_node_queue_depth(struct se_portal_group *, + unsigned char *, u32, int); +extern int core_tpg_register(struct target_core_fabric_ops *, + struct se_wwn *, + struct se_portal_group *, void *, + int); +extern int core_tpg_deregister(struct se_portal_group *); +extern struct se_lun *core_tpg_pre_addlun(struct se_portal_group *, u32); +extern int core_tpg_post_addlun(struct se_portal_group *, struct se_lun *, u32, + void *); +extern struct se_lun *core_tpg_pre_dellun(struct se_portal_group *, u32, int *); +extern int core_tpg_post_dellun(struct se_portal_group *, struct se_lun *); + +#endif /* TARGET_CORE_TPG_H */ diff --git a/include/target/target_core_transport.h b/include/target/target_core_transport.h new file mode 100644 index 000000000000..66f44e56eb80 --- /dev/null +++ b/include/target/target_core_transport.h @@ -0,0 +1,351 @@ +#ifndef TARGET_CORE_TRANSPORT_H +#define TARGET_CORE_TRANSPORT_H + +#define TARGET_CORE_VERSION TARGET_CORE_MOD_VERSION + +/* Attempts before moving from SHORT to LONG */ +#define PYX_TRANSPORT_WINDOW_CLOSED_THRESHOLD 3 +#define PYX_TRANSPORT_WINDOW_CLOSED_WAIT_SHORT 3 /* In milliseconds */ +#define PYX_TRANSPORT_WINDOW_CLOSED_WAIT_LONG 10 /* In milliseconds */ + +#define PYX_TRANSPORT_STATUS_INTERVAL 5 /* In seconds */ + +#define PYX_TRANSPORT_SENT_TO_TRANSPORT 0 +#define PYX_TRANSPORT_WRITE_PENDING 1 + +#define PYX_TRANSPORT_UNKNOWN_SAM_OPCODE -1 +#define PYX_TRANSPORT_HBA_QUEUE_FULL -2 +#define PYX_TRANSPORT_REQ_TOO_MANY_SECTORS -3 +#define PYX_TRANSPORT_OUT_OF_MEMORY_RESOURCES -4 +#define PYX_TRANSPORT_INVALID_CDB_FIELD -5 +#define PYX_TRANSPORT_INVALID_PARAMETER_LIST -6 +#define PYX_TRANSPORT_LU_COMM_FAILURE -7 +#define PYX_TRANSPORT_UNKNOWN_MODE_PAGE -8 +#define PYX_TRANSPORT_WRITE_PROTECTED -9 +#define PYX_TRANSPORT_TASK_TIMEOUT -10 +#define PYX_TRANSPORT_RESERVATION_CONFLICT -11 +#define PYX_TRANSPORT_ILLEGAL_REQUEST -12 +#define PYX_TRANSPORT_USE_SENSE_REASON -13 + +#ifndef SAM_STAT_RESERVATION_CONFLICT +#define SAM_STAT_RESERVATION_CONFLICT 0x18 +#endif + +#define TRANSPORT_PLUGIN_FREE 0 +#define TRANSPORT_PLUGIN_REGISTERED 1 + +#define TRANSPORT_PLUGIN_PHBA_PDEV 1 +#define TRANSPORT_PLUGIN_VHBA_PDEV 2 +#define TRANSPORT_PLUGIN_VHBA_VDEV 3 + +/* For SE OBJ Plugins, in seconds */ +#define TRANSPORT_TIMEOUT_TUR 10 +#define TRANSPORT_TIMEOUT_TYPE_DISK 60 +#define TRANSPORT_TIMEOUT_TYPE_ROM 120 +#define TRANSPORT_TIMEOUT_TYPE_TAPE 600 +#define TRANSPORT_TIMEOUT_TYPE_OTHER 300 + +/* For se_task->task_state_flags */ +#define TSF_EXCEPTION_CLEARED 0x01 + +/* + * struct se_subsystem_dev->su_dev_flags +*/ +#define SDF_FIRMWARE_VPD_UNIT_SERIAL 0x00000001 +#define SDF_EMULATED_VPD_UNIT_SERIAL 0x00000002 +#define SDF_USING_UDEV_PATH 0x00000004 +#define SDF_USING_ALIAS 0x00000008 + +/* + * struct se_device->dev_flags + */ +#define DF_READ_ONLY 0x00000001 +#define DF_SPC2_RESERVATIONS 0x00000002 +#define DF_SPC2_RESERVATIONS_WITH_ISID 0x00000004 + +/* struct se_dev_attrib sanity values */ +/* 10 Minutes */ +#define DA_TASK_TIMEOUT_MAX 600 +/* Default max_unmap_lba_count */ +#define DA_MAX_UNMAP_LBA_COUNT 0 +/* Default max_unmap_block_desc_count */ +#define DA_MAX_UNMAP_BLOCK_DESC_COUNT 0 +/* Default unmap_granularity */ +#define DA_UNMAP_GRANULARITY_DEFAULT 0 +/* Default unmap_granularity_alignment */ +#define DA_UNMAP_GRANULARITY_ALIGNMENT_DEFAULT 0 +/* Emulation for Direct Page Out */ +#define DA_EMULATE_DPO 0 +/* Emulation for Forced Unit Access WRITEs */ +#define DA_EMULATE_FUA_WRITE 1 +/* Emulation for Forced Unit Access READs */ +#define DA_EMULATE_FUA_READ 0 +/* Emulation for WriteCache and SYNCHRONIZE_CACHE */ +#define DA_EMULATE_WRITE_CACHE 0 +/* Emulation for UNIT ATTENTION Interlock Control */ +#define DA_EMULATE_UA_INTLLCK_CTRL 0 +/* Emulation for TASK_ABORTED status (TAS) by default */ +#define DA_EMULATE_TAS 1 +/* Emulation for Thin Provisioning UNMAP using block/blk-lib.c:blkdev_issue_discard() */ +#define DA_EMULATE_TPU 0 +/* + * Emulation for Thin Provisioning WRITE_SAME w/ UNMAP=1 bit using + * block/blk-lib.c:blkdev_issue_discard() + */ +#define DA_EMULATE_TPWS 0 +/* No Emulation for PSCSI by default */ +#define DA_EMULATE_RESERVATIONS 0 +/* No Emulation for PSCSI by default */ +#define DA_EMULATE_ALUA 0 +/* Enforce SCSI Initiator Port TransportID with 'ISID' for PR */ +#define DA_ENFORCE_PR_ISIDS 1 +#define DA_STATUS_MAX_SECTORS_MIN 16 +#define DA_STATUS_MAX_SECTORS_MAX 8192 + +#define SE_MODE_PAGE_BUF 512 + +#define MOD_MAX_SECTORS(ms, bs) (ms % (PAGE_SIZE / bs)) + +struct se_mem; +struct se_subsystem_api; + +extern int init_se_global(void); +extern void release_se_global(void); +extern void transport_init_queue_obj(struct se_queue_obj *); +extern int transport_subsystem_check_init(void); +extern int transport_subsystem_register(struct se_subsystem_api *); +extern void transport_subsystem_release(struct se_subsystem_api *); +extern void transport_load_plugins(void); +extern struct se_session *transport_init_session(void); +extern void __transport_register_session(struct se_portal_group *, + struct se_node_acl *, + struct se_session *, void *); +extern void transport_register_session(struct se_portal_group *, + struct se_node_acl *, + struct se_session *, void *); +extern void transport_free_session(struct se_session *); +extern void transport_deregister_session_configfs(struct se_session *); +extern void transport_deregister_session(struct se_session *); +extern void transport_cmd_finish_abort(struct se_cmd *, int); +extern void transport_cmd_finish_abort_tmr(struct se_cmd *); +extern void transport_complete_sync_cache(struct se_cmd *, int); +extern void transport_complete_task(struct se_task *, int); +extern void transport_add_task_to_execute_queue(struct se_task *, + struct se_task *, + struct se_device *); +unsigned char *transport_dump_cmd_direction(struct se_cmd *); +extern void transport_dump_dev_state(struct se_device *, char *, int *); +extern void transport_dump_dev_info(struct se_device *, struct se_lun *, + unsigned long long, char *, int *); +extern void transport_dump_vpd_proto_id(struct t10_vpd *, + unsigned char *, int); +extern void transport_set_vpd_proto_id(struct t10_vpd *, unsigned char *); +extern int transport_dump_vpd_assoc(struct t10_vpd *, + unsigned char *, int); +extern int transport_set_vpd_assoc(struct t10_vpd *, unsigned char *); +extern int transport_dump_vpd_ident_type(struct t10_vpd *, + unsigned char *, int); +extern int transport_set_vpd_ident_type(struct t10_vpd *, unsigned char *); +extern int transport_dump_vpd_ident(struct t10_vpd *, + unsigned char *, int); +extern int transport_set_vpd_ident(struct t10_vpd *, unsigned char *); +extern struct se_device *transport_add_device_to_core_hba(struct se_hba *, + struct se_subsystem_api *, + struct se_subsystem_dev *, u32, + void *, struct se_dev_limits *, + const char *, const char *); +extern void transport_device_setup_cmd(struct se_cmd *); +extern void transport_init_se_cmd(struct se_cmd *, + struct target_core_fabric_ops *, + struct se_session *, u32, int, int, + unsigned char *); +extern void transport_free_se_cmd(struct se_cmd *); +extern int transport_generic_allocate_tasks(struct se_cmd *, unsigned char *); +extern int transport_generic_handle_cdb(struct se_cmd *); +extern int transport_generic_handle_cdb_map(struct se_cmd *); +extern int transport_generic_handle_data(struct se_cmd *); +extern void transport_new_cmd_failure(struct se_cmd *); +extern int transport_generic_handle_tmr(struct se_cmd *); +extern void __transport_stop_task_timer(struct se_task *, unsigned long *); +extern unsigned char transport_asciihex_to_binaryhex(unsigned char val[2]); +extern int transport_generic_map_mem_to_cmd(struct se_cmd *cmd, struct scatterlist *, u32, + struct scatterlist *, u32); +extern int transport_clear_lun_from_sessions(struct se_lun *); +extern int transport_check_aborted_status(struct se_cmd *, int); +extern int transport_send_check_condition_and_sense(struct se_cmd *, u8, int); +extern void transport_send_task_abort(struct se_cmd *); +extern void transport_release_cmd_to_pool(struct se_cmd *); +extern void transport_generic_free_cmd(struct se_cmd *, int, int, int); +extern void transport_generic_wait_for_cmds(struct se_cmd *, int); +extern u32 transport_calc_sg_num(struct se_task *, struct se_mem *, u32); +extern int transport_map_mem_to_sg(struct se_task *, struct list_head *, + void *, struct se_mem *, + struct se_mem **, u32 *, u32 *); +extern void transport_do_task_sg_chain(struct se_cmd *); +extern void transport_generic_process_write(struct se_cmd *); +extern int transport_generic_do_tmr(struct se_cmd *); +/* From target_core_alua.c */ +extern int core_alua_check_nonop_delay(struct se_cmd *); + +/* + * Each se_transport_task_t can have N number of possible struct se_task's + * for the storage transport(s) to possibly execute. + * Used primarily for splitting up CDBs that exceed the physical storage + * HBA's maximum sector count per task. + */ +struct se_mem { + struct page *se_page; + u32 se_len; + u32 se_off; + struct list_head se_list; +} ____cacheline_aligned; + +/* + * Each type of disk transport supported MUST have a template defined + * within its .h file. + */ +struct se_subsystem_api { + /* + * The Name. :-) + */ + char name[16]; + /* + * Transport Type. + */ + u8 transport_type; + /* + * struct module for struct se_hba references + */ + struct module *owner; + /* + * Used for global se_subsystem_api list_head + */ + struct list_head sub_api_list; + /* + * For SCF_SCSI_NON_DATA_CDB + */ + int (*cdb_none)(struct se_task *); + /* + * For SCF_SCSI_CONTROL_NONSG_IO_CDB + */ + int (*map_task_non_SG)(struct se_task *); + /* + * For SCF_SCSI_DATA_SG_IO_CDB and SCF_SCSI_CONTROL_SG_IO_CDB + */ + int (*map_task_SG)(struct se_task *); + /* + * attach_hba(): + */ + int (*attach_hba)(struct se_hba *, u32); + /* + * detach_hba(): + */ + void (*detach_hba)(struct se_hba *); + /* + * pmode_hba(): Used for TCM/pSCSI subsystem plugin HBA -> + * Linux/SCSI struct Scsi_Host passthrough + */ + int (*pmode_enable_hba)(struct se_hba *, unsigned long); + /* + * allocate_virtdevice(): + */ + void *(*allocate_virtdevice)(struct se_hba *, const char *); + /* + * create_virtdevice(): Only for Virtual HBAs + */ + struct se_device *(*create_virtdevice)(struct se_hba *, + struct se_subsystem_dev *, void *); + /* + * free_device(): + */ + void (*free_device)(void *); + + /* + * dpo_emulated(): + */ + int (*dpo_emulated)(struct se_device *); + /* + * fua_write_emulated(): + */ + int (*fua_write_emulated)(struct se_device *); + /* + * fua_read_emulated(): + */ + int (*fua_read_emulated)(struct se_device *); + /* + * write_cache_emulated(): + */ + int (*write_cache_emulated)(struct se_device *); + /* + * transport_complete(): + * + * Use transport_generic_complete() for majority of DAS transport + * drivers. Provided out of convenience. + */ + int (*transport_complete)(struct se_task *task); + struct se_task *(*alloc_task)(struct se_cmd *); + /* + * do_task(): + */ + int (*do_task)(struct se_task *); + /* + * Used by virtual subsystem plugins IBLOCK and FILEIO to emulate + * UNMAP and WRITE_SAME_* w/ UNMAP=1 <-> Linux/Block Discard + */ + int (*do_discard)(struct se_device *, sector_t, u32); + /* + * Used by virtual subsystem plugins IBLOCK and FILEIO to emulate + * SYNCHRONIZE_CACHE_* <-> Linux/Block blkdev_issue_flush() + */ + void (*do_sync_cache)(struct se_task *); + /* + * free_task(): + */ + void (*free_task)(struct se_task *); + /* + * check_configfs_dev_params(): + */ + ssize_t (*check_configfs_dev_params)(struct se_hba *, struct se_subsystem_dev *); + /* + * set_configfs_dev_params(): + */ + ssize_t (*set_configfs_dev_params)(struct se_hba *, struct se_subsystem_dev *, + const char *, ssize_t); + /* + * show_configfs_dev_params(): + */ + ssize_t (*show_configfs_dev_params)(struct se_hba *, struct se_subsystem_dev *, + char *); + /* + * get_cdb(): + */ + unsigned char *(*get_cdb)(struct se_task *); + /* + * get_device_rev(): + */ + u32 (*get_device_rev)(struct se_device *); + /* + * get_device_type(): + */ + u32 (*get_device_type)(struct se_device *); + /* + * Get the sector_t from a subsystem backstore.. + */ + sector_t (*get_blocks)(struct se_device *); + /* + * do_se_mem_map(): + */ + int (*do_se_mem_map)(struct se_task *, struct list_head *, void *, + struct se_mem *, struct se_mem **, u32 *, u32 *); + /* + * get_sense_buffer(): + */ + unsigned char *(*get_sense_buffer)(struct se_task *); +} ____cacheline_aligned; + +#define TRANSPORT(dev) ((dev)->transport) +#define HBA_TRANSPORT(hba) ((hba)->transport) + +extern struct se_global *se_global; + +#endif /* TARGET_CORE_TRANSPORT_H */ -- cgit v1.2.3 From 9875cf806403fae66b2410a3c2cc820d97731e04 Mon Sep 17 00:00:00 2001 From: David Howells Date: Fri, 14 Jan 2011 18:45:21 +0000 Subject: Add a dentry op to handle automounting rather than abusing follow_link() Add a dentry op (d_automount) to handle automounting directories rather than abusing the follow_link() inode operation. The operation is keyed off a new dentry flag (DCACHE_NEED_AUTOMOUNT). This also makes it easier to add an AT_ flag to suppress terminal segment automount during pathwalk and removes the need for the kludge code in the pathwalk algorithm to handle directories with follow_link() semantics. The ->d_automount() dentry operation: struct vfsmount *(*d_automount)(struct path *mountpoint); takes a pointer to the directory to be mounted upon, which is expected to provide sufficient data to determine what should be mounted. If successful, it should return the vfsmount struct it creates (which it should also have added to the namespace using do_add_mount() or similar). If there's a collision with another automount attempt, NULL should be returned. If the directory specified by the parameter should be used directly rather than being mounted upon, -EISDIR should be returned. In any other case, an error code should be returned. The ->d_automount() operation is called with no locks held and may sleep. At this point the pathwalk algorithm will be in ref-walk mode. Within fs/namei.c itself, a new pathwalk subroutine (follow_automount()) is added to handle mountpoints. It will return -EREMOTE if the automount flag was set, but no d_automount() op was supplied, -ELOOP if we've encountered too many symlinks or mountpoints, -EISDIR if the walk point should be used without mounting and 0 if successful. The path will be updated to point to the mounted filesystem if a successful automount took place. __follow_mount() is replaced by follow_managed() which is more generic (especially with the patch that adds ->d_manage()). This handles transits from directories during pathwalk, including automounting and skipping over mountpoints (and holding processes with the next patch). __follow_mount_rcu() will jump out of RCU-walk mode if it encounters an automount point with nothing mounted on it. follow_dotdot*() does not handle automounts as you don't want to trigger them whilst following "..". I've also extracted the mount/don't-mount logic from autofs4 and included it here. It makes the mount go ahead anyway if someone calls open() or creat(), tries to traverse the directory, tries to chdir/chroot/etc. into the directory, or sticks a '/' on the end of the pathname. If they do a stat(), however, they'll only trigger the automount if they didn't also say O_NOFOLLOW. I've also added an inode flag (S_AUTOMOUNT) so that filesystems can mark their inodes as automount points. This flag is automatically propagated to the dentry as DCACHE_NEED_AUTOMOUNT by __d_instantiate(). This saves NFS and could save AFS a private flag bit apiece, but is not strictly necessary. It would be preferable to do the propagation in d_set_d_op(), but that doesn't normally have access to the inode. [AV: fixed breakage in case if __follow_mount_rcu() fails and nameidata_drop_rcu() succeeds in RCU case of do_lookup(); we need to fall through to non-RCU case after that, rather than just returning with ungrabbed *path] Signed-off-by: David Howells Was-Acked-by: Ian Kent Signed-off-by: Al Viro --- Documentation/filesystems/Locking | 2 + Documentation/filesystems/vfs.txt | 14 +++ fs/dcache.c | 5 +- fs/namei.c | 226 ++++++++++++++++++++++++++++---------- include/linux/dcache.h | 7 +- include/linux/fs.h | 2 + 6 files changed, 198 insertions(+), 58 deletions(-) (limited to 'Documentation') diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking index 977d8919cc69..5f0c52a07386 100644 --- a/Documentation/filesystems/Locking +++ b/Documentation/filesystems/Locking @@ -19,6 +19,7 @@ prototypes: void (*d_release)(struct dentry *); void (*d_iput)(struct dentry *, struct inode *); char *(*d_dname)((struct dentry *dentry, char *buffer, int buflen); + struct vfsmount *(*d_automount)(struct path *path); locking rules: rename_lock ->d_lock may block rcu-walk @@ -29,6 +30,7 @@ d_delete: no yes no no d_release: no no yes no d_iput: no no yes no d_dname: no no no no +d_automount: no no yes no --------------------------- inode_operations --------------------------- prototypes: diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt index cae6d27c9f5b..726a4f6fa3c9 100644 --- a/Documentation/filesystems/vfs.txt +++ b/Documentation/filesystems/vfs.txt @@ -864,6 +864,7 @@ struct dentry_operations { void (*d_release)(struct dentry *); void (*d_iput)(struct dentry *, struct inode *); char *(*d_dname)(struct dentry *, char *, int); + struct vfsmount *(*d_automount)(struct path *); }; d_revalidate: called when the VFS needs to revalidate a dentry. This @@ -930,6 +931,19 @@ struct dentry_operations { at the end of the buffer, and returns a pointer to the first char. dynamic_dname() helper function is provided to take care of this. + d_automount: called when an automount dentry is to be traversed (optional). + This should create a new VFS mount record, mount it on the directory + and return the record to the caller. The caller is supplied with a + path parameter giving the automount directory to describe the automount + target and the parent VFS mount record to provide inheritable mount + parameters. NULL should be returned if someone else managed to make + the automount first. If the automount failed, then an error code + should be returned. + + This function is only used if DCACHE_NEED_AUTOMOUNT is set on the + dentry. This is set by __d_instantiate() if S_AUTOMOUNT is set on the + inode being added. + Example : static char *pipefs_dname(struct dentry *dent, char *buffer, int buflen) diff --git a/fs/dcache.c b/fs/dcache.c index 0c6d5c549d84..51f7bb6463af 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -1380,8 +1380,11 @@ EXPORT_SYMBOL(d_set_d_op); static void __d_instantiate(struct dentry *dentry, struct inode *inode) { spin_lock(&dentry->d_lock); - if (inode) + if (inode) { + if (unlikely(IS_AUTOMOUNT(inode))) + dentry->d_flags |= DCACHE_NEED_AUTOMOUNT; list_add(&dentry->d_alias, &inode->i_dentry); + } dentry->d_inode = inode; dentry_rcuwalk_barrier(dentry); spin_unlock(&dentry->d_lock); diff --git a/fs/namei.c b/fs/namei.c index 529e917ad2fc..16109da68bbf 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -896,51 +896,120 @@ int follow_up(struct path *path) } /* - * serialization is taken care of in namespace.c + * Perform an automount + * - return -EISDIR to tell follow_managed() to stop and return the path we + * were called with. */ -static void __follow_mount_rcu(struct nameidata *nd, struct path *path, - struct inode **inode) +static int follow_automount(struct path *path, unsigned flags, + bool *need_mntput) { - while (d_mountpoint(path->dentry)) { - struct vfsmount *mounted; - mounted = __lookup_mnt(path->mnt, path->dentry, 1); - if (!mounted) - return; - path->mnt = mounted; - path->dentry = mounted->mnt_root; - nd->seq = read_seqcount_begin(&path->dentry->d_seq); - *inode = path->dentry->d_inode; + struct vfsmount *mnt; + + if (!path->dentry->d_op || !path->dentry->d_op->d_automount) + return -EREMOTE; + + /* We want to mount if someone is trying to open/create a file of any + * type under the mountpoint, wants to traverse through the mountpoint + * or wants to open the mounted directory. + * + * We don't want to mount if someone's just doing a stat and they've + * set AT_SYMLINK_NOFOLLOW - unless they're stat'ing a directory and + * appended a '/' to the name. + */ + if (!(flags & LOOKUP_FOLLOW) && + !(flags & (LOOKUP_CONTINUE | LOOKUP_DIRECTORY | + LOOKUP_OPEN | LOOKUP_CREATE))) + return -EISDIR; + + current->total_link_count++; + if (current->total_link_count >= 40) + return -ELOOP; + + mnt = path->dentry->d_op->d_automount(path); + if (IS_ERR(mnt)) { + /* + * The filesystem is allowed to return -EISDIR here to indicate + * it doesn't want to automount. For instance, autofs would do + * this so that its userspace daemon can mount on this dentry. + * + * However, we can only permit this if it's a terminal point in + * the path being looked up; if it wasn't then the remainder of + * the path is inaccessible and we should say so. + */ + if (PTR_ERR(mnt) == -EISDIR && (flags & LOOKUP_CONTINUE)) + return -EREMOTE; + return PTR_ERR(mnt); } -} + if (!mnt) /* mount collision */ + return 0; -static int __follow_mount(struct path *path) -{ - int res = 0; - while (d_mountpoint(path->dentry)) { - struct vfsmount *mounted = lookup_mnt(path); - if (!mounted) - break; - dput(path->dentry); - if (res) - mntput(path->mnt); - path->mnt = mounted; - path->dentry = dget(mounted->mnt_root); - res = 1; + if (mnt->mnt_sb == path->mnt->mnt_sb && + mnt->mnt_root == path->dentry) { + mntput(mnt); + return -ELOOP; } - return res; + + dput(path->dentry); + if (*need_mntput) + mntput(path->mnt); + path->mnt = mnt; + path->dentry = dget(mnt->mnt_root); + *need_mntput = true; + return 0; } -static void follow_mount(struct path *path) +/* + * Handle a dentry that is managed in some way. + * - Flagged as mountpoint + * - Flagged as automount point + * + * This may only be called in refwalk mode. + * + * Serialization is taken care of in namespace.c + */ +static int follow_managed(struct path *path, unsigned flags) { - while (d_mountpoint(path->dentry)) { - struct vfsmount *mounted = lookup_mnt(path); - if (!mounted) - break; - dput(path->dentry); - mntput(path->mnt); - path->mnt = mounted; - path->dentry = dget(mounted->mnt_root); + unsigned managed; + bool need_mntput = false; + int ret; + + /* Given that we're not holding a lock here, we retain the value in a + * local variable for each dentry as we look at it so that we don't see + * the components of that value change under us */ + while (managed = ACCESS_ONCE(path->dentry->d_flags), + managed &= DCACHE_MANAGED_DENTRY, + unlikely(managed != 0)) { + /* Transit to a mounted filesystem. */ + if (managed & DCACHE_MOUNTED) { + struct vfsmount *mounted = lookup_mnt(path); + if (mounted) { + dput(path->dentry); + if (need_mntput) + mntput(path->mnt); + path->mnt = mounted; + path->dentry = dget(mounted->mnt_root); + need_mntput = true; + continue; + } + + /* Something is mounted on this dentry in another + * namespace and/or whatever was mounted there in this + * namespace got unmounted before we managed to get the + * vfsmount_lock */ + } + + /* Handle an automount point */ + if (managed & DCACHE_NEED_AUTOMOUNT) { + ret = follow_automount(path, flags, &need_mntput); + if (ret < 0) + return ret == -EISDIR ? 0 : ret; + continue; + } + + /* We didn't change the current path point */ + break; } + return 0; } int follow_down(struct path *path) @@ -958,13 +1027,37 @@ int follow_down(struct path *path) return 0; } +/* + * Skip to top of mountpoint pile in rcuwalk mode. We abort the rcu-walk if we + * meet an automount point and we're not walking to "..". True is returned to + * continue, false to abort. + */ +static bool __follow_mount_rcu(struct nameidata *nd, struct path *path, + struct inode **inode, bool reverse_transit) +{ + while (d_mountpoint(path->dentry)) { + struct vfsmount *mounted; + mounted = __lookup_mnt(path->mnt, path->dentry, 1); + if (!mounted) + break; + path->mnt = mounted; + path->dentry = mounted->mnt_root; + nd->seq = read_seqcount_begin(&path->dentry->d_seq); + *inode = path->dentry->d_inode; + } + + if (unlikely(path->dentry->d_flags & DCACHE_NEED_AUTOMOUNT)) + return reverse_transit; + return true; +} + static int follow_dotdot_rcu(struct nameidata *nd) { struct inode *inode = nd->inode; set_root_rcu(nd); - while(1) { + while (1) { if (nd->path.dentry == nd->root.dentry && nd->path.mnt == nd->root.mnt) { break; @@ -987,12 +1080,28 @@ static int follow_dotdot_rcu(struct nameidata *nd) nd->seq = read_seqcount_begin(&nd->path.dentry->d_seq); inode = nd->path.dentry->d_inode; } - __follow_mount_rcu(nd, &nd->path, &inode); + __follow_mount_rcu(nd, &nd->path, &inode, true); nd->inode = inode; return 0; } +/* + * Skip to top of mountpoint pile in refwalk mode for follow_dotdot() + */ +static void follow_mount(struct path *path) +{ + while (d_mountpoint(path->dentry)) { + struct vfsmount *mounted = lookup_mnt(path); + if (!mounted) + break; + dput(path->dentry); + mntput(path->mnt); + path->mnt = mounted; + path->dentry = dget(mounted->mnt_root); + } +} + static void follow_dotdot(struct nameidata *nd) { set_root(nd); @@ -1057,12 +1166,14 @@ static int do_lookup(struct nameidata *nd, struct qstr *name, struct vfsmount *mnt = nd->path.mnt; struct dentry *dentry, *parent = nd->path.dentry; struct inode *dir; + int err; + /* * See if the low-level filesystem might want * to use its own hash.. */ if (unlikely(parent->d_flags & DCACHE_OP_HASH)) { - int err = parent->d_op->d_hash(parent, nd->inode, name); + err = parent->d_op->d_hash(parent, nd->inode, name); if (err < 0) return err; } @@ -1092,20 +1203,25 @@ static int do_lookup(struct nameidata *nd, struct qstr *name, done2: path->mnt = mnt; path->dentry = dentry; - __follow_mount_rcu(nd, path, inode); - } else { - dentry = __d_lookup(parent, name); - if (!dentry) - goto need_lookup; + if (likely(__follow_mount_rcu(nd, path, inode, false))) + return 0; + if (nameidata_drop_rcu(nd)) + return -ECHILD; + /* fallthru */ + } + dentry = __d_lookup(parent, name); + if (!dentry) + goto need_lookup; found: - if (dentry->d_flags & DCACHE_OP_REVALIDATE) - goto need_revalidate; + if (dentry->d_flags & DCACHE_OP_REVALIDATE) + goto need_revalidate; done: - path->mnt = mnt; - path->dentry = dentry; - __follow_mount(path); - *inode = path->dentry->d_inode; - } + path->mnt = mnt; + path->dentry = dentry; + err = follow_managed(path, nd->flags); + if (unlikely(err < 0)) + return err; + *inode = path->dentry->d_inode; return 0; need_lookup: @@ -2203,11 +2319,9 @@ static struct file *do_last(struct nameidata *nd, struct path *path, if (open_flag & O_EXCL) goto exit_dput; - if (__follow_mount(path)) { - error = -ELOOP; - if (open_flag & O_NOFOLLOW) - goto exit_dput; - } + error = follow_managed(path, nd->flags); + if (error < 0) + goto exit_dput; error = -ENOENT; if (!path->dentry->d_inode) diff --git a/include/linux/dcache.h b/include/linux/dcache.h index 59fcd24b1468..ee6c26d142c3 100644 --- a/include/linux/dcache.h +++ b/include/linux/dcache.h @@ -167,6 +167,7 @@ struct dentry_operations { void (*d_release)(struct dentry *); void (*d_iput)(struct dentry *, struct inode *); char *(*d_dname)(struct dentry *, char *, int); + struct vfsmount *(*d_automount)(struct path *); } ____cacheline_aligned; /* @@ -205,13 +206,17 @@ struct dentry_operations { #define DCACHE_CANT_MOUNT 0x0100 #define DCACHE_GENOCIDE 0x0200 -#define DCACHE_MOUNTED 0x0400 /* is a mountpoint */ #define DCACHE_OP_HASH 0x1000 #define DCACHE_OP_COMPARE 0x2000 #define DCACHE_OP_REVALIDATE 0x4000 #define DCACHE_OP_DELETE 0x8000 +#define DCACHE_MOUNTED 0x10000 /* is a mountpoint */ +#define DCACHE_NEED_AUTOMOUNT 0x20000 /* handle automount on this dir */ +#define DCACHE_MANAGED_DENTRY \ + (DCACHE_MOUNTED|DCACHE_NEED_AUTOMOUNT) + extern seqlock_t rename_lock; static inline int dname_external(struct dentry *dentry) diff --git a/include/linux/fs.h b/include/linux/fs.h index 3984f2358d1f..4c00ed53be8f 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -242,6 +242,7 @@ struct inodes_stat_t { #define S_SWAPFILE 256 /* Do not truncate: swapon got its bmaps */ #define S_PRIVATE 512 /* Inode is fs-internal */ #define S_IMA 1024 /* Inode has an associated IMA struct */ +#define S_AUTOMOUNT 2048 /* Automount/referral quasi-directory */ /* * Note that nosuid etc flags are inode-specific: setting some file-system @@ -277,6 +278,7 @@ struct inodes_stat_t { #define IS_SWAPFILE(inode) ((inode)->i_flags & S_SWAPFILE) #define IS_PRIVATE(inode) ((inode)->i_flags & S_PRIVATE) #define IS_IMA(inode) ((inode)->i_flags & S_IMA) +#define IS_AUTOMOUNT(inode) ((inode)->i_flags & S_AUTOMOUNT) /* the read-only stuff doesn't really belong here, but any other place is probably as bad and I don't want to create yet another include file. */ -- cgit v1.2.3 From cc53ce53c86924bfe98a12ea20b7465038a08792 Mon Sep 17 00:00:00 2001 From: David Howells Date: Fri, 14 Jan 2011 18:45:26 +0000 Subject: Add a dentry op to allow processes to be held during pathwalk transit Add a dentry op (d_manage) to permit a filesystem to hold a process and make it sleep when it tries to transit away from one of that filesystem's directories during a pathwalk. The operation is keyed off a new dentry flag (DCACHE_MANAGE_TRANSIT). The filesystem is allowed to be selective about which processes it holds and which it permits to continue on or prohibits from transiting from each flagged directory. This will allow autofs to hold up client processes whilst letting its userspace daemon through to maintain the directory or the stuff behind it or mounted upon it. The ->d_manage() dentry operation: int (*d_manage)(struct path *path, bool mounting_here); takes a pointer to the directory about to be transited away from and a flag indicating whether the transit is undertaken by do_add_mount() or do_move_mount() skipping through a pile of filesystems mounted on a mountpoint. It should return 0 if successful and to let the process continue on its way; -EISDIR to prohibit the caller from skipping to overmounted filesystems or automounting, and to use this directory; or some other error code to return to the user. ->d_manage() is called with namespace_sem writelocked if mounting_here is true and no other locks held, so it may sleep. However, if mounting_here is true, it may not initiate or wait for a mount or unmount upon the parameter directory, even if the act is actually performed by userspace. Within fs/namei.c, follow_managed() is extended to check with d_manage() first on each managed directory, before transiting away from it or attempting to automount upon it. follow_down() is renamed follow_down_one() and should only be used where the filesystem deliberately intends to avoid management steps (e.g. autofs). A new follow_down() is added that incorporates the loop done by all other callers of follow_down() (do_add/move_mount(), autofs and NFSD; whilst AFS, NFS and CIFS do use it, their use is removed by converting them to use d_automount()). The new follow_down() calls d_manage() as appropriate. It also takes an extra parameter to indicate if it is being called from mount code (with namespace_sem writelocked) which it passes to d_manage(). follow_down() ignores automount points so that it can be used to mount on them. __follow_mount_rcu() is made to abort rcu-walk mode if it hits a directory with DCACHE_MANAGE_TRANSIT set on the basis that we're probably going to have to sleep. It would be possible to enter d_manage() in rcu-walk mode too, and have that determine whether to abort or not itself. That would allow the autofs daemon to continue on in rcu-walk mode. Note that DCACHE_MANAGE_TRANSIT on a directory should be cleared when it isn't required as every tranist from that directory will cause d_manage() to be invoked. It can always be set again when necessary. ========================== WHAT THIS MEANS FOR AUTOFS ========================== Autofs currently uses the lookup() inode op and the d_revalidate() dentry op to trigger the automounting of indirect mounts, and both of these can be called with i_mutex held. autofs knows that the i_mutex will be held by the caller in lookup(), and so can drop it before invoking the daemon - but this isn't so for d_revalidate(), since the lock is only held on _some_ of the code paths that call it. This means that autofs can't risk dropping i_mutex from its d_revalidate() function before it calls the daemon. The bug could manifest itself as, for example, a process that's trying to validate an automount dentry that gets made to wait because that dentry is expired and needs cleaning up: mkdir S ffffffff8014e05a 0 32580 24956 Call Trace: [] :autofs4:autofs4_wait+0x674/0x897 [] avc_has_perm+0x46/0x58 [] autoremove_wake_function+0x0/0x2e [] :autofs4:autofs4_expire_wait+0x41/0x6b [] :autofs4:autofs4_revalidate+0x91/0x149 [] __lookup_hash+0xa0/0x12f [] lookup_create+0x46/0x80 [] sys_mkdirat+0x56/0xe4 versus the automount daemon which wants to remove that dentry, but can't because the normal process is holding the i_mutex lock: automount D ffffffff8014e05a 0 32581 1 32561 Call Trace: [] __mutex_lock_slowpath+0x60/0x9b [] do_path_lookup+0x2ca/0x2f1 [] .text.lock.mutex+0xf/0x14 [] do_rmdir+0x77/0xde [] tracesys+0x71/0xe0 [] tracesys+0xd5/0xe0 which means that the system is deadlocked. This patch allows autofs to hold up normal processes whilst the daemon goes ahead and does things to the dentry tree behind the automouter point without risking a deadlock as almost no locks are held in d_manage() and none in d_automount(). Signed-off-by: David Howells Was-Acked-by: Ian Kent Signed-off-by: Al Viro --- Documentation/filesystems/Locking | 2 ++ Documentation/filesystems/vfs.txt | 21 +++++++++++- drivers/staging/autofs/dirhash.c | 5 ++- fs/afs/mntpt.c | 5 +-- fs/autofs4/autofs_i.h | 13 ------- fs/autofs4/dev-ioctl.c | 2 +- fs/autofs4/expire.c | 2 +- fs/autofs4/root.c | 11 +++--- fs/cifs/cifs_dfs_ref.c | 5 +-- fs/namei.c | 72 +++++++++++++++++++++++++++++++++++++-- fs/namespace.c | 14 ++++---- fs/nfs/namespace.c | 5 +-- fs/nfsd/vfs.c | 5 +-- include/linux/dcache.h | 11 ++++-- include/linux/namei.h | 3 +- 15 files changed, 126 insertions(+), 50 deletions(-) (limited to 'Documentation') diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking index 5f0c52a07386..cbf98b989b11 100644 --- a/Documentation/filesystems/Locking +++ b/Documentation/filesystems/Locking @@ -20,6 +20,7 @@ prototypes: void (*d_iput)(struct dentry *, struct inode *); char *(*d_dname)((struct dentry *dentry, char *buffer, int buflen); struct vfsmount *(*d_automount)(struct path *path); + int (*d_manage)(struct dentry *, bool); locking rules: rename_lock ->d_lock may block rcu-walk @@ -31,6 +32,7 @@ d_release: no no yes no d_iput: no no yes no d_dname: no no no no d_automount: no no yes no +d_manage: no no yes no --------------------------- inode_operations --------------------------- prototypes: diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt index 726a4f6fa3c9..4682586b147a 100644 --- a/Documentation/filesystems/vfs.txt +++ b/Documentation/filesystems/vfs.txt @@ -865,6 +865,7 @@ struct dentry_operations { void (*d_iput)(struct dentry *, struct inode *); char *(*d_dname)(struct dentry *, char *, int); struct vfsmount *(*d_automount)(struct path *); + int (*d_manage)(struct dentry *, bool); }; d_revalidate: called when the VFS needs to revalidate a dentry. This @@ -938,12 +939,30 @@ struct dentry_operations { target and the parent VFS mount record to provide inheritable mount parameters. NULL should be returned if someone else managed to make the automount first. If the automount failed, then an error code - should be returned. + should be returned. If -EISDIR is returned, then the directory will + be treated as an ordinary directory and returned to pathwalk to + continue walking. This function is only used if DCACHE_NEED_AUTOMOUNT is set on the dentry. This is set by __d_instantiate() if S_AUTOMOUNT is set on the inode being added. + d_manage: called to allow the filesystem to manage the transition from a + dentry (optional). This allows autofs, for example, to hold up clients + waiting to explore behind a 'mountpoint' whilst letting the daemon go + past and construct the subtree there. 0 should be returned to let the + calling process continue. -EISDIR can be returned to tell pathwalk to + use this directory as an ordinary directory and to ignore anything + mounted on it and not to check the automount flag. Any other error + code will abort pathwalk completely. + + If the 'mounting_here' parameter is true, then namespace_sem is being + held by the caller and the function should not initiate any mounts or + unmounts that it will then wait for. + + This function is only used if DCACHE_MANAGE_TRANSIT is set on the + dentry being transited from. + Example : static char *pipefs_dname(struct dentry *dent, char *buffer, int buflen) diff --git a/drivers/staging/autofs/dirhash.c b/drivers/staging/autofs/dirhash.c index d3f42c8325f7..a08bd7355035 100644 --- a/drivers/staging/autofs/dirhash.c +++ b/drivers/staging/autofs/dirhash.c @@ -88,14 +88,13 @@ struct autofs_dir_ent *autofs_expire(struct super_block *sb, } path.mnt = mnt; path_get(&path); - if (!follow_down(&path)) { + if (!follow_down_one(&path)) { path_put(&path); DPRINTK(("autofs: not expirable\ (not a mounted directory): %s\n", ent->name)); continue; } - while (d_mountpoint(path.dentry) && follow_down(&path)) - ; + follow_down(&path, false); // TODO: need to check error umount_ok = may_umount(path.mnt); path_put(&path); diff --git a/fs/afs/mntpt.c b/fs/afs/mntpt.c index e83c0336e7b5..f3e891d57a2c 100644 --- a/fs/afs/mntpt.c +++ b/fs/afs/mntpt.c @@ -273,10 +273,7 @@ static void *afs_mntpt_follow_link(struct dentry *dentry, struct nameidata *nd) break; case -EBUSY: /* someone else made a mount here whilst we were busy */ - while (d_mountpoint(nd->path.dentry) && - follow_down(&nd->path)) - ; - err = 0; + err = follow_down(&nd->path, false); default: mntput(newmnt); break; diff --git a/fs/autofs4/autofs_i.h b/fs/autofs4/autofs_i.h index 0fffe1c24cec..eb67953452bb 100644 --- a/fs/autofs4/autofs_i.h +++ b/fs/autofs4/autofs_i.h @@ -229,19 +229,6 @@ int autofs4_wait(struct autofs_sb_info *,struct dentry *, enum autofs_notify); int autofs4_wait_release(struct autofs_sb_info *,autofs_wqt_t,int); void autofs4_catatonic_mode(struct autofs_sb_info *); -static inline int autofs4_follow_mount(struct path *path) -{ - int res = 0; - - while (d_mountpoint(path->dentry)) { - int followed = follow_down(path); - if (!followed) - break; - res = 1; - } - return res; -} - static inline u32 autofs4_get_dev(struct autofs_sb_info *sbi) { return new_encode_dev(sbi->sb->s_dev); diff --git a/fs/autofs4/dev-ioctl.c b/fs/autofs4/dev-ioctl.c index eff9a419469a..1442da4860e5 100644 --- a/fs/autofs4/dev-ioctl.c +++ b/fs/autofs4/dev-ioctl.c @@ -551,7 +551,7 @@ static int autofs_dev_ioctl_ismountpoint(struct file *fp, err = have_submounts(path.dentry); - if (follow_down(&path)) + if (follow_down_one(&path)) magic = path.mnt->mnt_sb->s_magic; } diff --git a/fs/autofs4/expire.c b/fs/autofs4/expire.c index cc1d01365905..6a930b90d389 100644 --- a/fs/autofs4/expire.c +++ b/fs/autofs4/expire.c @@ -56,7 +56,7 @@ static int autofs4_mount_busy(struct vfsmount *mnt, struct dentry *dentry) path_get(&path); - if (!follow_down(&path)) + if (!follow_down_one(&path)) goto done; if (is_autofs4_dentry(path.dentry)) { diff --git a/fs/autofs4/root.c b/fs/autofs4/root.c index 651e4ef563b1..20225636a4e9 100644 --- a/fs/autofs4/root.c +++ b/fs/autofs4/root.c @@ -234,7 +234,7 @@ static void *autofs4_follow_link(struct dentry *dentry, struct nameidata *nd) nd->flags); /* * For an expire of a covered direct or offset mount we need - * to break out of follow_down() at the autofs mount trigger + * to break out of follow_down_one() at the autofs mount trigger * (d_mounted--), so we can see the expiring flag, and manage * the blocking and following here until the expire is completed. */ @@ -243,7 +243,7 @@ static void *autofs4_follow_link(struct dentry *dentry, struct nameidata *nd) if (ino->flags & AUTOFS_INF_EXPIRING) { spin_unlock(&sbi->fs_lock); /* Follow down to our covering mount. */ - if (!follow_down(&nd->path)) + if (!follow_down_one(&nd->path)) goto done; goto follow; } @@ -292,11 +292,10 @@ follow: * multi-mount with no root offset so we don't need * to follow it. */ - if (d_mountpoint(dentry)) { - if (!autofs4_follow_mount(&nd->path)) { - status = -ENOENT; + if (d_managed(dentry)) { + status = follow_down(&nd->path, false); + if (status < 0) goto out_error; - } } done: diff --git a/fs/cifs/cifs_dfs_ref.c b/fs/cifs/cifs_dfs_ref.c index c68a056f27fd..83479cf63f96 100644 --- a/fs/cifs/cifs_dfs_ref.c +++ b/fs/cifs/cifs_dfs_ref.c @@ -273,10 +273,7 @@ static int add_mount_helper(struct vfsmount *newmnt, struct nameidata *nd, break; case -EBUSY: /* someone else made a mount here whilst we were busy */ - while (d_mountpoint(nd->path.dentry) && - follow_down(&nd->path)) - ; - err = 0; + err = follow_down(&nd->path, false); default: mntput(newmnt); break; diff --git a/fs/namei.c b/fs/namei.c index 16109da68bbf..9d3033dc22e9 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -960,6 +960,7 @@ static int follow_automount(struct path *path, unsigned flags, /* * Handle a dentry that is managed in some way. + * - Flagged for transit management (autofs) * - Flagged as mountpoint * - Flagged as automount point * @@ -979,6 +980,16 @@ static int follow_managed(struct path *path, unsigned flags) while (managed = ACCESS_ONCE(path->dentry->d_flags), managed &= DCACHE_MANAGED_DENTRY, unlikely(managed != 0)) { + /* Allow the filesystem to manage the transit without i_mutex + * being held. */ + if (managed & DCACHE_MANAGE_TRANSIT) { + BUG_ON(!path->dentry->d_op); + BUG_ON(!path->dentry->d_op->d_manage); + ret = path->dentry->d_op->d_manage(path->dentry, false); + if (ret < 0) + return ret == -EISDIR ? 0 : ret; + } + /* Transit to a mounted filesystem. */ if (managed & DCACHE_MOUNTED) { struct vfsmount *mounted = lookup_mnt(path); @@ -1012,7 +1023,7 @@ static int follow_managed(struct path *path, unsigned flags) return 0; } -int follow_down(struct path *path) +int follow_down_one(struct path *path) { struct vfsmount *mounted; @@ -1029,14 +1040,19 @@ int follow_down(struct path *path) /* * Skip to top of mountpoint pile in rcuwalk mode. We abort the rcu-walk if we - * meet an automount point and we're not walking to "..". True is returned to + * meet a managed dentry and we're not walking to "..". True is returned to * continue, false to abort. */ static bool __follow_mount_rcu(struct nameidata *nd, struct path *path, struct inode **inode, bool reverse_transit) { + unsigned abort_mask = + reverse_transit ? 0 : DCACHE_MANAGE_TRANSIT; + while (d_mountpoint(path->dentry)) { struct vfsmount *mounted; + if (path->dentry->d_flags & abort_mask) + return true; mounted = __lookup_mnt(path->mnt, path->dentry, 1); if (!mounted) break; @@ -1086,6 +1102,57 @@ static int follow_dotdot_rcu(struct nameidata *nd) return 0; } +/* + * Follow down to the covering mount currently visible to userspace. At each + * point, the filesystem owning that dentry may be queried as to whether the + * caller is permitted to proceed or not. + * + * Care must be taken as namespace_sem may be held (indicated by mounting_here + * being true). + */ +int follow_down(struct path *path, bool mounting_here) +{ + unsigned managed; + int ret; + + while (managed = ACCESS_ONCE(path->dentry->d_flags), + unlikely(managed & DCACHE_MANAGED_DENTRY)) { + /* Allow the filesystem to manage the transit without i_mutex + * being held. + * + * We indicate to the filesystem if someone is trying to mount + * something here. This gives autofs the chance to deny anyone + * other than its daemon the right to mount on its + * superstructure. + * + * The filesystem may sleep at this point. + */ + if (managed & DCACHE_MANAGE_TRANSIT) { + BUG_ON(!path->dentry->d_op); + BUG_ON(!path->dentry->d_op->d_manage); + ret = path->dentry->d_op->d_manage(path->dentry, mounting_here); + if (ret < 0) + return ret == -EISDIR ? 0 : ret; + } + + /* Transit to a mounted filesystem. */ + if (managed & DCACHE_MOUNTED) { + struct vfsmount *mounted = lookup_mnt(path); + if (!mounted) + break; + dput(path->dentry); + mntput(path->mnt); + path->mnt = mounted; + path->dentry = dget(mounted->mnt_root); + continue; + } + + /* Don't handle automount points here */ + break; + } + return 0; +} + /* * Skip to top of mountpoint pile in refwalk mode for follow_dotdot() */ @@ -3530,6 +3597,7 @@ const struct inode_operations page_symlink_inode_operations = { }; EXPORT_SYMBOL(user_path_at); +EXPORT_SYMBOL(follow_down_one); EXPORT_SYMBOL(follow_down); EXPORT_SYMBOL(follow_up); EXPORT_SYMBOL(get_write_access); /* binfmt_aout */ diff --git a/fs/namespace.c b/fs/namespace.c index 3ddfd9046c44..d94ccd6ddafd 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -1844,9 +1844,10 @@ static int do_move_mount(struct path *path, char *old_name) return err; down_write(&namespace_sem); - while (d_mountpoint(path->dentry) && - follow_down(path)) - ; + err = follow_down(path, true); + if (err < 0) + goto out; + err = -EINVAL; if (!check_mnt(path->mnt) || !check_mnt(old_path.mnt)) goto out; @@ -1940,9 +1941,10 @@ int do_add_mount(struct vfsmount *newmnt, struct path *path, down_write(&namespace_sem); /* Something was mounted here while we slept */ - while (d_mountpoint(path->dentry) && - follow_down(path)) - ; + err = follow_down(path, true); + if (err < 0) + goto unlock; + err = -EINVAL; if (!(mnt_flags & MNT_SHRINKABLE) && !check_mnt(path->mnt)) goto unlock; diff --git a/fs/nfs/namespace.c b/fs/nfs/namespace.c index 74aaf3963c10..bfcb933e5755 100644 --- a/fs/nfs/namespace.c +++ b/fs/nfs/namespace.c @@ -176,10 +176,7 @@ out_err: path_put(&nd->path); goto out; out_follow: - while (d_mountpoint(nd->path.dentry) && - follow_down(&nd->path)) - ; - err = 0; + err = follow_down(&nd->path, false); goto out; } diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c index 230b79fbf005..0f79e33a65d7 100644 --- a/fs/nfsd/vfs.c +++ b/fs/nfsd/vfs.c @@ -88,8 +88,9 @@ nfsd_cross_mnt(struct svc_rqst *rqstp, struct dentry **dpp, .dentry = dget(dentry)}; int err = 0; - while (d_mountpoint(path.dentry) && follow_down(&path)) - ; + err = follow_down(&path, false); + if (err < 0) + goto out; exp2 = rqst_exp_get_by_name(rqstp, &path); if (IS_ERR(exp2)) { diff --git a/include/linux/dcache.h b/include/linux/dcache.h index ee6c26d142c3..1a87760d6532 100644 --- a/include/linux/dcache.h +++ b/include/linux/dcache.h @@ -168,6 +168,7 @@ struct dentry_operations { void (*d_iput)(struct dentry *, struct inode *); char *(*d_dname)(struct dentry *, char *, int); struct vfsmount *(*d_automount)(struct path *); + int (*d_manage)(struct dentry *, bool); } ____cacheline_aligned; /* @@ -214,8 +215,9 @@ struct dentry_operations { #define DCACHE_MOUNTED 0x10000 /* is a mountpoint */ #define DCACHE_NEED_AUTOMOUNT 0x20000 /* handle automount on this dir */ +#define DCACHE_MANAGE_TRANSIT 0x40000 /* manage transit from this dirent */ #define DCACHE_MANAGED_DENTRY \ - (DCACHE_MOUNTED|DCACHE_NEED_AUTOMOUNT) + (DCACHE_MOUNTED|DCACHE_NEED_AUTOMOUNT|DCACHE_MANAGE_TRANSIT) extern seqlock_t rename_lock; @@ -404,7 +406,12 @@ static inline void dont_mount(struct dentry *dentry) extern void dput(struct dentry *); -static inline int d_mountpoint(struct dentry *dentry) +static inline bool d_managed(struct dentry *dentry) +{ + return dentry->d_flags & DCACHE_MANAGED_DENTRY; +} + +static inline bool d_mountpoint(struct dentry *dentry) { return dentry->d_flags & DCACHE_MOUNTED; } diff --git a/include/linux/namei.h b/include/linux/namei.h index 18d06add0a40..8ef2c789c2a8 100644 --- a/include/linux/namei.h +++ b/include/linux/namei.h @@ -79,7 +79,8 @@ extern struct file *lookup_instantiate_filp(struct nameidata *nd, struct dentry extern struct dentry *lookup_one_len(const char *, struct dentry *, int); -extern int follow_down(struct path *); +extern int follow_down_one(struct path *); +extern int follow_down(struct path *, bool); extern int follow_up(struct path *); extern struct dentry *lock_rename(struct dentry *, struct dentry *); -- cgit v1.2.3 From ab90911ff90cdab59b31c045c3f0ae480d14f29d Mon Sep 17 00:00:00 2001 From: David Howells Date: Fri, 14 Jan 2011 18:46:51 +0000 Subject: Allow d_manage() to be used in RCU-walk mode Allow d_manage() to be called from pathwalk when it is in RCU-walk mode as well as when it is in Ref-walk mode. This permits __follow_mount_rcu() to call d_manage() directly. d_manage() needs a parameter to indicate that it is in RCU-walk mode as it isn't allowed to sleep if in that mode (but should return -ECHILD instead). autofs4_d_manage() can then be set to retain RCU-walk mode if the daemon accesses it and otherwise request dropping back to ref-walk mode. Signed-off-by: David Howells Signed-off-by: Al Viro --- Documentation/filesystems/Locking | 2 +- Documentation/filesystems/vfs.txt | 7 ++++++- fs/autofs4/root.c | 8 ++++++-- fs/namei.c | 15 ++++++++------- include/linux/dcache.h | 2 +- 5 files changed, 22 insertions(+), 12 deletions(-) (limited to 'Documentation') diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking index cbf98b989b11..39707748ed2d 100644 --- a/Documentation/filesystems/Locking +++ b/Documentation/filesystems/Locking @@ -32,7 +32,7 @@ d_release: no no yes no d_iput: no no yes no d_dname: no no no no d_automount: no no yes no -d_manage: no no yes no +d_manage: no no yes (ref-walk) maybe --------------------------- inode_operations --------------------------- prototypes: diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt index 4682586b147a..3c4b2f1b64d0 100644 --- a/Documentation/filesystems/vfs.txt +++ b/Documentation/filesystems/vfs.txt @@ -865,7 +865,7 @@ struct dentry_operations { void (*d_iput)(struct dentry *, struct inode *); char *(*d_dname)(struct dentry *, char *, int); struct vfsmount *(*d_automount)(struct path *); - int (*d_manage)(struct dentry *, bool); + int (*d_manage)(struct dentry *, bool, bool); }; d_revalidate: called when the VFS needs to revalidate a dentry. This @@ -960,6 +960,11 @@ struct dentry_operations { held by the caller and the function should not initiate any mounts or unmounts that it will then wait for. + If the 'rcu_walk' parameter is true, then the caller is doing a + pathwalk in RCU-walk mode. Sleeping is not permitted in this mode, + and the caller can be asked to leave it and call again by returing + -ECHILD. + This function is only used if DCACHE_MANAGE_TRANSIT is set on the dentry being transited from. diff --git a/fs/autofs4/root.c b/fs/autofs4/root.c index 9194e274f849..dbd95512808c 100644 --- a/fs/autofs4/root.c +++ b/fs/autofs4/root.c @@ -36,7 +36,7 @@ static long autofs4_root_compat_ioctl(struct file *,unsigned int,unsigned long); static int autofs4_dir_open(struct inode *inode, struct file *file); static struct dentry *autofs4_lookup(struct inode *,struct dentry *, struct nameidata *); static struct vfsmount *autofs4_d_automount(struct path *); -static int autofs4_d_manage(struct dentry *, bool); +static int autofs4_d_manage(struct dentry *, bool, bool); const struct file_operations autofs4_root_operations = { .open = dcache_dir_open, @@ -450,7 +450,7 @@ done: return NULL; } -int autofs4_d_manage(struct dentry *dentry, bool mounting_here) +int autofs4_d_manage(struct dentry *dentry, bool mounting_here, bool rcu_walk) { struct autofs_sb_info *sbi = autofs4_sbi(dentry->d_sb); @@ -464,6 +464,10 @@ int autofs4_d_manage(struct dentry *dentry, bool mounting_here) return 0; } + /* We need to sleep, so we need pathwalk to be in ref-mode */ + if (rcu_walk) + return -ECHILD; + /* Wait for pending expires */ do_expire_wait(dentry); diff --git a/fs/namei.c b/fs/namei.c index 373852012713..5c89695ae1e4 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -987,7 +987,8 @@ static int follow_managed(struct path *path, unsigned flags) if (managed & DCACHE_MANAGE_TRANSIT) { BUG_ON(!path->dentry->d_op); BUG_ON(!path->dentry->d_op->d_manage); - ret = path->dentry->d_op->d_manage(path->dentry, false); + ret = path->dentry->d_op->d_manage(path->dentry, + false, false); if (ret < 0) return ret == -EISDIR ? 0 : ret; } @@ -1048,13 +1049,12 @@ int follow_down_one(struct path *path) static bool __follow_mount_rcu(struct nameidata *nd, struct path *path, struct inode **inode, bool reverse_transit) { - unsigned abort_mask = - reverse_transit ? 0 : DCACHE_MANAGE_TRANSIT; - while (d_mountpoint(path->dentry)) { struct vfsmount *mounted; - if (path->dentry->d_flags & abort_mask) - return true; + if (unlikely(path->dentry->d_flags & DCACHE_MANAGE_TRANSIT) && + !reverse_transit && + path->dentry->d_op->d_manage(path->dentry, false, true) < 0) + return false; mounted = __lookup_mnt(path->mnt, path->dentry, 1); if (!mounted) break; @@ -1132,7 +1132,8 @@ int follow_down(struct path *path, bool mounting_here) if (managed & DCACHE_MANAGE_TRANSIT) { BUG_ON(!path->dentry->d_op); BUG_ON(!path->dentry->d_op->d_manage); - ret = path->dentry->d_op->d_manage(path->dentry, mounting_here); + ret = path->dentry->d_op->d_manage( + path->dentry, mounting_here, false); if (ret < 0) return ret == -EISDIR ? 0 : ret; } diff --git a/include/linux/dcache.h b/include/linux/dcache.h index 1a87760d6532..f958c19e3ca5 100644 --- a/include/linux/dcache.h +++ b/include/linux/dcache.h @@ -168,7 +168,7 @@ struct dentry_operations { void (*d_iput)(struct dentry *, struct inode *); char *(*d_dname)(struct dentry *, char *, int); struct vfsmount *(*d_automount)(struct path *); - int (*d_manage)(struct dentry *, bool); + int (*d_manage)(struct dentry *, bool, bool); } ____cacheline_aligned; /* -- cgit v1.2.3 From ea5b778a8b98c85a87d66bf844904f9c3802b869 Mon Sep 17 00:00:00 2001 From: David Howells Date: Fri, 14 Jan 2011 19:10:03 +0000 Subject: Unexport do_add_mount() and add in follow_automount(), not ->d_automount() Unexport do_add_mount() and make ->d_automount() return the vfsmount to be added rather than calling do_add_mount() itself. follow_automount() will then do the addition. This slightly complicates things as ->d_automount() normally wants to add the new vfsmount to an expiration list and start an expiration timer. The problem with that is that the vfsmount will be deleted if it has a refcount of 1 and the timer will not repeat if the expiration list is empty. To this end, we require the vfsmount to be returned from d_automount() with a refcount of (at least) 2. One of these refs will be dropped unconditionally. In addition, follow_automount() must get a 3rd ref around the call to do_add_mount() lest it eat a ref and return an error, leaving the mount we have open to being expired as we would otherwise have only 1 ref on it. d_automount() should also add the the vfsmount to the expiration list (by calling mnt_set_expiry()) and start the expiration timer before returning, if this mechanism is to be used. The vfsmount will be unlinked from the expiration list by follow_automount() if do_add_mount() fails. This patch also fixes the call to do_add_mount() for AFS to propagate the mount flags from the parent vfsmount. Signed-off-by: David Howells Signed-off-by: Al Viro --- Documentation/filesystems/vfs.txt | 23 ++++++++++++--------- fs/afs/mntpt.c | 25 ++++++----------------- fs/cifs/cifs_dfs_ref.c | 26 ++++++------------------ fs/internal.h | 2 ++ fs/namei.c | 42 ++++++++++++++++++++++++++++++++------- fs/namespace.c | 41 ++++++++++++++++++++++++++++++-------- fs/nfs/namespace.c | 24 ++++------------------ include/linux/mount.h | 7 +------ 8 files changed, 101 insertions(+), 89 deletions(-) (limited to 'Documentation') diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt index 3c4b2f1b64d0..94cf97b901d7 100644 --- a/Documentation/filesystems/vfs.txt +++ b/Documentation/filesystems/vfs.txt @@ -933,15 +933,20 @@ struct dentry_operations { dynamic_dname() helper function is provided to take care of this. d_automount: called when an automount dentry is to be traversed (optional). - This should create a new VFS mount record, mount it on the directory - and return the record to the caller. The caller is supplied with a - path parameter giving the automount directory to describe the automount - target and the parent VFS mount record to provide inheritable mount - parameters. NULL should be returned if someone else managed to make - the automount first. If the automount failed, then an error code - should be returned. If -EISDIR is returned, then the directory will - be treated as an ordinary directory and returned to pathwalk to - continue walking. + This should create a new VFS mount record and return the record to the + caller. The caller is supplied with a path parameter giving the + automount directory to describe the automount target and the parent + VFS mount record to provide inheritable mount parameters. NULL should + be returned if someone else managed to make the automount first. If + the vfsmount creation failed, then an error code should be returned. + If -EISDIR is returned, then the directory will be treated as an + ordinary directory and returned to pathwalk to continue walking. + + If a vfsmount is returned, the caller will attempt to mount it on the + mountpoint and will remove the vfsmount from its expiration list in + the case of failure. The vfsmount should be returned with 2 refs on + it to prevent automatic expiration - the caller will clean up the + additional ref. This function is only used if DCACHE_NEED_AUTOMOUNT is set on the dentry. This is set by __d_instantiate() if S_AUTOMOUNT is set on the diff --git a/fs/afs/mntpt.c b/fs/afs/mntpt.c index d23b2e344a78..aa59184151d0 100644 --- a/fs/afs/mntpt.c +++ b/fs/afs/mntpt.c @@ -241,7 +241,6 @@ error_no_devname: struct vfsmount *afs_d_automount(struct path *path) { struct vfsmount *newmnt; - int err; _enter("{%s,%s}", path->mnt->mnt_devname, path->dentry->d_name.name); @@ -249,24 +248,12 @@ struct vfsmount *afs_d_automount(struct path *path) if (IS_ERR(newmnt)) return newmnt; - mntget(newmnt); - err = do_add_mount(newmnt, path, MNT_SHRINKABLE, &afs_vfsmounts); - switch (err) { - case 0: - queue_delayed_work(afs_wq, &afs_mntpt_expiry_timer, - afs_mntpt_expiry_timeout * HZ); - _leave(" = %p {%s}", newmnt, newmnt->mnt_devname); - return newmnt; - case -EBUSY: - /* someone else made a mount here whilst we were busy */ - mntput(newmnt); - _leave(" = NULL [EBUSY]"); - return NULL; - default: - mntput(newmnt); - _leave(" = %d", err); - return ERR_PTR(err); - } + mntget(newmnt); /* prevent immediate expiration */ + mnt_set_expiry(newmnt, &afs_vfsmounts); + queue_delayed_work(afs_wq, &afs_mntpt_expiry_timer, + afs_mntpt_expiry_timeout * HZ); + _leave(" = %p {%s}", newmnt, newmnt->mnt_devname); + return newmnt; } /* diff --git a/fs/cifs/cifs_dfs_ref.c b/fs/cifs/cifs_dfs_ref.c index 0fc163808de3..7ed36536e754 100644 --- a/fs/cifs/cifs_dfs_ref.c +++ b/fs/cifs/cifs_dfs_ref.c @@ -351,7 +351,6 @@ free_xid: struct vfsmount *cifs_dfs_d_automount(struct path *path) { struct vfsmount *newmnt; - int err; cFYI(1, "in %s", __func__); @@ -361,25 +360,12 @@ struct vfsmount *cifs_dfs_d_automount(struct path *path) return newmnt; } - mntget(newmnt); - err = do_add_mount(newmnt, path, path->mnt->mnt_flags | MNT_SHRINKABLE, - &cifs_dfs_automount_list); - switch (err) { - case 0: - schedule_delayed_work(&cifs_dfs_automount_task, - cifs_dfs_mountpoint_expiry_timeout); - cFYI(1, "leaving %s [ok]" , __func__); - return newmnt; - case -EBUSY: - /* someone else made a mount here whilst we were busy */ - mntput(newmnt); - cFYI(1, "leaving %s [EBUSY]" , __func__); - return NULL; - default: - mntput(newmnt); - cFYI(1, "leaving %s [error %d]" , __func__, err); - return ERR_PTR(err); - } + mntget(newmnt); /* prevent immediate expiration */ + mnt_set_expiry(newmnt, &cifs_dfs_automount_list); + schedule_delayed_work(&cifs_dfs_automount_task, + cifs_dfs_mountpoint_expiry_timeout); + cFYI(1, "leaving %s [ok]" , __func__); + return newmnt; } const struct inode_operations cifs_dfs_referral_inode_operations = { diff --git a/fs/internal.h b/fs/internal.h index 9687c2ee2735..4931060fd089 100644 --- a/fs/internal.h +++ b/fs/internal.h @@ -70,6 +70,8 @@ extern void mnt_set_mountpoint(struct vfsmount *, struct dentry *, extern void release_mounts(struct list_head *); extern void umount_tree(struct vfsmount *, int, struct list_head *); extern struct vfsmount *copy_tree(struct vfsmount *, struct dentry *, int); +extern int do_add_mount(struct vfsmount *, struct path *, int); +extern void mnt_clear_expiry(struct vfsmount *); extern void __init mnt_init(void); diff --git a/fs/namei.c b/fs/namei.c index 5c89695ae1e4..c2e37727e3ab 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -900,6 +900,7 @@ static int follow_automount(struct path *path, unsigned flags, bool *need_mntput) { struct vfsmount *mnt; + int err; if (!path->dentry->d_op || !path->dentry->d_op->d_automount) return -EREMOTE; @@ -942,22 +943,49 @@ static int follow_automount(struct path *path, unsigned flags, return -EREMOTE; return PTR_ERR(mnt); } + if (!mnt) /* mount collision */ return 0; + /* The new mount record should have at least 2 refs to prevent it being + * expired before we get a chance to add it + */ + BUG_ON(mnt_get_count(mnt) < 2); + if (mnt->mnt_sb == path->mnt->mnt_sb && mnt->mnt_root == path->dentry) { + mnt_clear_expiry(mnt); + mntput(mnt); mntput(mnt); return -ELOOP; } - dput(path->dentry); - if (*need_mntput) - mntput(path->mnt); - path->mnt = mnt; - path->dentry = dget(mnt->mnt_root); - *need_mntput = true; - return 0; + /* We need to add the mountpoint to the parent. The filesystem may + * have placed it on an expiry list, and so we need to make sure it + * won't be expired under us if do_add_mount() fails (do_add_mount() + * will eat a reference unconditionally). + */ + mntget(mnt); + err = do_add_mount(mnt, path, path->mnt->mnt_flags | MNT_SHRINKABLE); + switch (err) { + case -EBUSY: + /* Someone else made a mount here whilst we were busy */ + err = 0; + default: + mnt_clear_expiry(mnt); + mntput(mnt); + mntput(mnt); + return err; + case 0: + mntput(mnt); + dput(path->dentry); + if (*need_mntput) + mntput(path->mnt); + path->mnt = mnt; + path->dentry = dget(mnt->mnt_root); + *need_mntput = true; + return 0; + } } /* diff --git a/fs/namespace.c b/fs/namespace.c index d94ccd6ddafd..bfcb701f9490 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -1925,15 +1925,14 @@ static int do_new_mount(struct path *path, char *type, int flags, if (IS_ERR(mnt)) return PTR_ERR(mnt); - return do_add_mount(mnt, path, mnt_flags, NULL); + return do_add_mount(mnt, path, mnt_flags); } /* * add a mount into a namespace's mount tree - * - provide the option of adding the new mount to an expiration list + * - this unconditionally eats one of the caller's references to newmnt. */ -int do_add_mount(struct vfsmount *newmnt, struct path *path, - int mnt_flags, struct list_head *fslist) +int do_add_mount(struct vfsmount *newmnt, struct path *path, int mnt_flags) { int err; @@ -1963,9 +1962,6 @@ int do_add_mount(struct vfsmount *newmnt, struct path *path, if ((err = graft_tree(newmnt, path))) goto unlock; - if (fslist) /* add to the specified expiration list */ - list_add_tail(&newmnt->mnt_expire, fslist); - up_write(&namespace_sem); return 0; @@ -1975,7 +1971,36 @@ unlock: return err; } -EXPORT_SYMBOL_GPL(do_add_mount); +/** + * mnt_set_expiry - Put a mount on an expiration list + * @mnt: The mount to list. + * @expiry_list: The list to add the mount to. + */ +void mnt_set_expiry(struct vfsmount *mnt, struct list_head *expiry_list) +{ + down_write(&namespace_sem); + br_write_lock(vfsmount_lock); + + list_add_tail(&mnt->mnt_expire, expiry_list); + + br_write_unlock(vfsmount_lock); + up_write(&namespace_sem); +} +EXPORT_SYMBOL(mnt_set_expiry); + +/* + * Remove a vfsmount from any expiration list it may be on + */ +void mnt_clear_expiry(struct vfsmount *mnt) +{ + if (!list_empty(&mnt->mnt_expire)) { + down_write(&namespace_sem); + br_write_lock(vfsmount_lock); + list_del_init(&mnt->mnt_expire); + br_write_unlock(vfsmount_lock); + up_write(&namespace_sem); + } +} /* * process a list of expirable mountpoints with the intent of discarding any diff --git a/fs/nfs/namespace.c b/fs/nfs/namespace.c index f3fbb1bf3f18..f32b8603dca8 100644 --- a/fs/nfs/namespace.c +++ b/fs/nfs/namespace.c @@ -149,26 +149,10 @@ struct vfsmount *nfs_d_automount(struct path *path) if (IS_ERR(mnt)) goto out; - mntget(mnt); - err = do_add_mount(mnt, path, path->mnt->mnt_flags | MNT_SHRINKABLE, - &nfs_automount_list); - switch (err) { - case 0: - dprintk("%s: done, success\n", __func__); - schedule_delayed_work(&nfs_automount_task, nfs_mountpoint_expiry_timeout); - break; - case -EBUSY: - /* someone else made a mount here whilst we were busy */ - mntput(mnt); - dprintk("%s: done, collision\n", __func__); - mnt = NULL; - break; - default: - mntput(mnt); - dprintk("%s: done, error %d\n", __func__, err); - mnt = ERR_PTR(err); - break; - } + dprintk("%s: done, success\n", __func__); + mntget(mnt); /* prevent immediate expiration */ + mnt_set_expiry(mnt, &nfs_automount_list); + schedule_delayed_work(&nfs_automount_task, nfs_mountpoint_expiry_timeout); out: nfs_free_fattr(fattr); diff --git a/include/linux/mount.h b/include/linux/mount.h index 1869ea24a739..af4765ea8fde 100644 --- a/include/linux/mount.h +++ b/include/linux/mount.h @@ -110,12 +110,7 @@ extern struct vfsmount *vfs_kern_mount(struct file_system_type *type, int flags, const char *name, void *data); -struct nameidata; - -struct path; -extern int do_add_mount(struct vfsmount *newmnt, struct path *path, - int mnt_flags, struct list_head *fslist); - +extern void mnt_set_expiry(struct vfsmount *mnt, struct list_head *expiry_list); extern void mark_mounts_for_expiry(struct list_head *mounts); extern dev_t name_to_dev_t(char *name); -- cgit v1.2.3 From 2fe17c1075836b66678ed2a305fd09b6773883aa Mon Sep 17 00:00:00 2001 From: Christoph Hellwig Date: Fri, 14 Jan 2011 13:07:43 +0100 Subject: fallocate should be a file operation Currently all filesystems except XFS implement fallocate asynchronously, while XFS forced a commit. Both of these are suboptimal - in case of O_SYNC I/O we really want our allocation on disk, especially for the !KEEP_SIZE case where we actually grow the file with user-visible zeroes. On the other hand always commiting the transaction is a bad idea for fast-path uses of fallocate like for example in recent Samba versions. Given that block allocation is a data plane operation anyway change it from an inode operation to a file operation so that we have the file structure available that lets us check for O_SYNC. This also includes moving the code around for a few of the filesystems, and remove the already unnedded S_ISDIR checks given that we only wire up fallocate for regular files. Signed-off-by: Christoph Hellwig Signed-off-by: Al Viro --- Documentation/filesystems/Locking | 3 +- fs/btrfs/file.c | 113 +++++++++++++++++ fs/btrfs/inode.c | 111 ---------------- fs/ext4/ext4.h | 2 +- fs/ext4/extents.c | 9 +- fs/ext4/file.c | 2 +- fs/gfs2/file.c | 258 ++++++++++++++++++++++++++++++++++++++ fs/gfs2/ops_inode.c | 258 -------------------------------------- fs/ocfs2/file.c | 8 +- fs/open.c | 4 +- fs/xfs/linux-2.6/xfs_file.c | 56 +++++++++ fs/xfs/linux-2.6/xfs_iops.c | 60 --------- include/linux/fs.h | 4 +- 13 files changed, 440 insertions(+), 448 deletions(-) (limited to 'Documentation') diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking index 651d5237c155..4471a416c274 100644 --- a/Documentation/filesystems/Locking +++ b/Documentation/filesystems/Locking @@ -60,7 +60,6 @@ ata *); ssize_t (*listxattr) (struct dentry *, char *, size_t); int (*removexattr) (struct dentry *, const char *); void (*truncate_range)(struct inode *, loff_t, loff_t); - long (*fallocate)(struct inode *inode, int mode, loff_t offset, loff_t len); int (*fiemap)(struct inode *, struct fiemap_extent_info *, u64 start, u64 len); locking rules: @@ -88,7 +87,6 @@ getxattr: no listxattr: no removexattr: yes truncate_range: yes -fallocate: no fiemap: no Additionally, ->rmdir(), ->unlink() and ->rename() have ->i_mutex on victim. @@ -437,6 +435,7 @@ prototypes: ssize_t (*splice_read)(struct file *, loff_t *, struct pipe_inode_info *, size_t, unsigned int); int (*setlease)(struct file *, long, struct file_lock **); + long (*fallocate)(struct file *, int, loff_t, loff_t); }; locking rules: diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index 66836d85763b..a9e0a4eaf3d9 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -24,6 +24,7 @@ #include #include #include +#include #include #include #include @@ -1237,6 +1238,117 @@ static int btrfs_file_mmap(struct file *filp, struct vm_area_struct *vma) return 0; } +static long btrfs_fallocate(struct file *file, int mode, + loff_t offset, loff_t len) +{ + struct inode *inode = file->f_path.dentry->d_inode; + struct extent_state *cached_state = NULL; + u64 cur_offset; + u64 last_byte; + u64 alloc_start; + u64 alloc_end; + u64 alloc_hint = 0; + u64 locked_end; + u64 mask = BTRFS_I(inode)->root->sectorsize - 1; + struct extent_map *em; + int ret; + + alloc_start = offset & ~mask; + alloc_end = (offset + len + mask) & ~mask; + + /* We only support the FALLOC_FL_KEEP_SIZE mode */ + if (mode & ~FALLOC_FL_KEEP_SIZE) + return -EOPNOTSUPP; + + /* + * wait for ordered IO before we have any locks. We'll loop again + * below with the locks held. + */ + btrfs_wait_ordered_range(inode, alloc_start, alloc_end - alloc_start); + + mutex_lock(&inode->i_mutex); + ret = inode_newsize_ok(inode, alloc_end); + if (ret) + goto out; + + if (alloc_start > inode->i_size) { + ret = btrfs_cont_expand(inode, alloc_start); + if (ret) + goto out; + } + + ret = btrfs_check_data_free_space(inode, alloc_end - alloc_start); + if (ret) + goto out; + + locked_end = alloc_end - 1; + while (1) { + struct btrfs_ordered_extent *ordered; + + /* the extent lock is ordered inside the running + * transaction + */ + lock_extent_bits(&BTRFS_I(inode)->io_tree, alloc_start, + locked_end, 0, &cached_state, GFP_NOFS); + ordered = btrfs_lookup_first_ordered_extent(inode, + alloc_end - 1); + if (ordered && + ordered->file_offset + ordered->len > alloc_start && + ordered->file_offset < alloc_end) { + btrfs_put_ordered_extent(ordered); + unlock_extent_cached(&BTRFS_I(inode)->io_tree, + alloc_start, locked_end, + &cached_state, GFP_NOFS); + /* + * we can't wait on the range with the transaction + * running or with the extent lock held + */ + btrfs_wait_ordered_range(inode, alloc_start, + alloc_end - alloc_start); + } else { + if (ordered) + btrfs_put_ordered_extent(ordered); + break; + } + } + + cur_offset = alloc_start; + while (1) { + em = btrfs_get_extent(inode, NULL, 0, cur_offset, + alloc_end - cur_offset, 0); + BUG_ON(IS_ERR(em) || !em); + last_byte = min(extent_map_end(em), alloc_end); + last_byte = (last_byte + mask) & ~mask; + if (em->block_start == EXTENT_MAP_HOLE || + (cur_offset >= inode->i_size && + !test_bit(EXTENT_FLAG_PREALLOC, &em->flags))) { + ret = btrfs_prealloc_file_range(inode, mode, cur_offset, + last_byte - cur_offset, + 1 << inode->i_blkbits, + offset + len, + &alloc_hint); + if (ret < 0) { + free_extent_map(em); + break; + } + } + free_extent_map(em); + + cur_offset = last_byte; + if (cur_offset >= alloc_end) { + ret = 0; + break; + } + } + unlock_extent_cached(&BTRFS_I(inode)->io_tree, alloc_start, locked_end, + &cached_state, GFP_NOFS); + + btrfs_free_reserved_data_space(inode, alloc_end - alloc_start); +out: + mutex_unlock(&inode->i_mutex); + return ret; +} + const struct file_operations btrfs_file_operations = { .llseek = generic_file_llseek, .read = do_sync_read, @@ -1248,6 +1360,7 @@ const struct file_operations btrfs_file_operations = { .open = generic_file_open, .release = btrfs_release_file, .fsync = btrfs_sync_file, + .fallocate = btrfs_fallocate, .unlocked_ioctl = btrfs_ioctl, #ifdef CONFIG_COMPAT .compat_ioctl = btrfs_ioctl, diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 64daf2acd0d5..902afbf50811 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -7098,116 +7098,6 @@ int btrfs_prealloc_file_range_trans(struct inode *inode, min_size, actual_len, alloc_hint, trans); } -static long btrfs_fallocate(struct inode *inode, int mode, - loff_t offset, loff_t len) -{ - struct extent_state *cached_state = NULL; - u64 cur_offset; - u64 last_byte; - u64 alloc_start; - u64 alloc_end; - u64 alloc_hint = 0; - u64 locked_end; - u64 mask = BTRFS_I(inode)->root->sectorsize - 1; - struct extent_map *em; - int ret; - - alloc_start = offset & ~mask; - alloc_end = (offset + len + mask) & ~mask; - - /* We only support the FALLOC_FL_KEEP_SIZE mode */ - if (mode & ~FALLOC_FL_KEEP_SIZE) - return -EOPNOTSUPP; - - /* - * wait for ordered IO before we have any locks. We'll loop again - * below with the locks held. - */ - btrfs_wait_ordered_range(inode, alloc_start, alloc_end - alloc_start); - - mutex_lock(&inode->i_mutex); - ret = inode_newsize_ok(inode, alloc_end); - if (ret) - goto out; - - if (alloc_start > inode->i_size) { - ret = btrfs_cont_expand(inode, alloc_start); - if (ret) - goto out; - } - - ret = btrfs_check_data_free_space(inode, alloc_end - alloc_start); - if (ret) - goto out; - - locked_end = alloc_end - 1; - while (1) { - struct btrfs_ordered_extent *ordered; - - /* the extent lock is ordered inside the running - * transaction - */ - lock_extent_bits(&BTRFS_I(inode)->io_tree, alloc_start, - locked_end, 0, &cached_state, GFP_NOFS); - ordered = btrfs_lookup_first_ordered_extent(inode, - alloc_end - 1); - if (ordered && - ordered->file_offset + ordered->len > alloc_start && - ordered->file_offset < alloc_end) { - btrfs_put_ordered_extent(ordered); - unlock_extent_cached(&BTRFS_I(inode)->io_tree, - alloc_start, locked_end, - &cached_state, GFP_NOFS); - /* - * we can't wait on the range with the transaction - * running or with the extent lock held - */ - btrfs_wait_ordered_range(inode, alloc_start, - alloc_end - alloc_start); - } else { - if (ordered) - btrfs_put_ordered_extent(ordered); - break; - } - } - - cur_offset = alloc_start; - while (1) { - em = btrfs_get_extent(inode, NULL, 0, cur_offset, - alloc_end - cur_offset, 0); - BUG_ON(IS_ERR(em) || !em); - last_byte = min(extent_map_end(em), alloc_end); - last_byte = (last_byte + mask) & ~mask; - if (em->block_start == EXTENT_MAP_HOLE || - (cur_offset >= inode->i_size && - !test_bit(EXTENT_FLAG_PREALLOC, &em->flags))) { - ret = btrfs_prealloc_file_range(inode, mode, cur_offset, - last_byte - cur_offset, - 1 << inode->i_blkbits, - offset + len, - &alloc_hint); - if (ret < 0) { - free_extent_map(em); - break; - } - } - free_extent_map(em); - - cur_offset = last_byte; - if (cur_offset >= alloc_end) { - ret = 0; - break; - } - } - unlock_extent_cached(&BTRFS_I(inode)->io_tree, alloc_start, locked_end, - &cached_state, GFP_NOFS); - - btrfs_free_reserved_data_space(inode, alloc_end - alloc_start); -out: - mutex_unlock(&inode->i_mutex); - return ret; -} - static int btrfs_set_page_dirty(struct page *page) { return __set_page_dirty_nobuffers(page); @@ -7310,7 +7200,6 @@ static const struct inode_operations btrfs_file_inode_operations = { .listxattr = btrfs_listxattr, .removexattr = btrfs_removexattr, .permission = btrfs_permission, - .fallocate = btrfs_fallocate, .fiemap = btrfs_fiemap, }; static const struct inode_operations btrfs_special_inode_operations = { diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h index 1de65f572033..0c8d97b56f34 100644 --- a/fs/ext4/ext4.h +++ b/fs/ext4/ext4.h @@ -2065,7 +2065,7 @@ extern int ext4_ext_map_blocks(handle_t *handle, struct inode *inode, extern void ext4_ext_truncate(struct inode *); extern void ext4_ext_init(struct super_block *); extern void ext4_ext_release(struct super_block *); -extern long ext4_fallocate(struct inode *inode, int mode, loff_t offset, +extern long ext4_fallocate(struct file *file, int mode, loff_t offset, loff_t len); extern int ext4_convert_unwritten_extents(struct inode *inode, loff_t offset, ssize_t len); diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c index 4bdd160854eb..63a75810b7c3 100644 --- a/fs/ext4/extents.c +++ b/fs/ext4/extents.c @@ -3627,14 +3627,15 @@ static void ext4_falloc_update_inode(struct inode *inode, } /* - * preallocate space for a file. This implements ext4's fallocate inode + * preallocate space for a file. This implements ext4's fallocate file * operation, which gets called from sys_fallocate system call. * For block-mapped files, posix_fallocate should fall back to the method * of writing zeroes to the required new blocks (the same behavior which is * expected for file systems which do not support fallocate() system call). */ -long ext4_fallocate(struct inode *inode, int mode, loff_t offset, loff_t len) +long ext4_fallocate(struct file *file, int mode, loff_t offset, loff_t len) { + struct inode *inode = file->f_path.dentry->d_inode; handle_t *handle; loff_t new_size; unsigned int max_blocks; @@ -3655,10 +3656,6 @@ long ext4_fallocate(struct inode *inode, int mode, loff_t offset, loff_t len) if (!(ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS))) return -EOPNOTSUPP; - /* preallocation to directories is currently not supported */ - if (S_ISDIR(inode->i_mode)) - return -ENODEV; - map.m_lblk = offset >> blkbits; /* * We can't just convert len to max_blocks because diff --git a/fs/ext4/file.c b/fs/ext4/file.c index bb003dc9ffff..2e8322c8aa88 100644 --- a/fs/ext4/file.c +++ b/fs/ext4/file.c @@ -210,6 +210,7 @@ const struct file_operations ext4_file_operations = { .fsync = ext4_sync_file, .splice_read = generic_file_splice_read, .splice_write = generic_file_splice_write, + .fallocate = ext4_fallocate, }; const struct inode_operations ext4_file_inode_operations = { @@ -223,7 +224,6 @@ const struct inode_operations ext4_file_inode_operations = { .removexattr = generic_removexattr, #endif .check_acl = ext4_check_acl, - .fallocate = ext4_fallocate, .fiemap = ext4_fiemap, }; diff --git a/fs/gfs2/file.c b/fs/gfs2/file.c index fca6689e12e6..7cfdcb913363 100644 --- a/fs/gfs2/file.c +++ b/fs/gfs2/file.c @@ -19,6 +19,8 @@ #include #include #include +#include +#include #include #include #include @@ -610,6 +612,260 @@ static ssize_t gfs2_file_aio_write(struct kiocb *iocb, const struct iovec *iov, return generic_file_aio_write(iocb, iov, nr_segs, pos); } +static void empty_write_end(struct page *page, unsigned from, + unsigned to) +{ + struct gfs2_inode *ip = GFS2_I(page->mapping->host); + + page_zero_new_buffers(page, from, to); + flush_dcache_page(page); + mark_page_accessed(page); + + if (!gfs2_is_writeback(ip)) + gfs2_page_add_databufs(ip, page, from, to); + + block_commit_write(page, from, to); +} + +static int write_empty_blocks(struct page *page, unsigned from, unsigned to) +{ + unsigned start, end, next; + struct buffer_head *bh, *head; + int error; + + if (!page_has_buffers(page)) { + error = __block_write_begin(page, from, to - from, gfs2_block_map); + if (unlikely(error)) + return error; + + empty_write_end(page, from, to); + return 0; + } + + bh = head = page_buffers(page); + next = end = 0; + while (next < from) { + next += bh->b_size; + bh = bh->b_this_page; + } + start = next; + do { + next += bh->b_size; + if (buffer_mapped(bh)) { + if (end) { + error = __block_write_begin(page, start, end - start, + gfs2_block_map); + if (unlikely(error)) + return error; + empty_write_end(page, start, end); + end = 0; + } + start = next; + } + else + end = next; + bh = bh->b_this_page; + } while (next < to); + + if (end) { + error = __block_write_begin(page, start, end - start, gfs2_block_map); + if (unlikely(error)) + return error; + empty_write_end(page, start, end); + } + + return 0; +} + +static int fallocate_chunk(struct inode *inode, loff_t offset, loff_t len, + int mode) +{ + struct gfs2_inode *ip = GFS2_I(inode); + struct buffer_head *dibh; + int error; + u64 start = offset >> PAGE_CACHE_SHIFT; + unsigned int start_offset = offset & ~PAGE_CACHE_MASK; + u64 end = (offset + len - 1) >> PAGE_CACHE_SHIFT; + pgoff_t curr; + struct page *page; + unsigned int end_offset = (offset + len) & ~PAGE_CACHE_MASK; + unsigned int from, to; + + if (!end_offset) + end_offset = PAGE_CACHE_SIZE; + + error = gfs2_meta_inode_buffer(ip, &dibh); + if (unlikely(error)) + goto out; + + gfs2_trans_add_bh(ip->i_gl, dibh, 1); + + if (gfs2_is_stuffed(ip)) { + error = gfs2_unstuff_dinode(ip, NULL); + if (unlikely(error)) + goto out; + } + + curr = start; + offset = start << PAGE_CACHE_SHIFT; + from = start_offset; + to = PAGE_CACHE_SIZE; + while (curr <= end) { + page = grab_cache_page_write_begin(inode->i_mapping, curr, + AOP_FLAG_NOFS); + if (unlikely(!page)) { + error = -ENOMEM; + goto out; + } + + if (curr == end) + to = end_offset; + error = write_empty_blocks(page, from, to); + if (!error && offset + to > inode->i_size && + !(mode & FALLOC_FL_KEEP_SIZE)) { + i_size_write(inode, offset + to); + } + unlock_page(page); + page_cache_release(page); + if (error) + goto out; + curr++; + offset += PAGE_CACHE_SIZE; + from = 0; + } + + gfs2_dinode_out(ip, dibh->b_data); + mark_inode_dirty(inode); + + brelse(dibh); + +out: + return error; +} + +static void calc_max_reserv(struct gfs2_inode *ip, loff_t max, loff_t *len, + unsigned int *data_blocks, unsigned int *ind_blocks) +{ + const struct gfs2_sbd *sdp = GFS2_SB(&ip->i_inode); + unsigned int max_blocks = ip->i_alloc->al_rgd->rd_free_clone; + unsigned int tmp, max_data = max_blocks - 3 * (sdp->sd_max_height - 1); + + for (tmp = max_data; tmp > sdp->sd_diptrs;) { + tmp = DIV_ROUND_UP(tmp, sdp->sd_inptrs); + max_data -= tmp; + } + /* This calculation isn't the exact reverse of gfs2_write_calc_reserve, + so it might end up with fewer data blocks */ + if (max_data <= *data_blocks) + return; + *data_blocks = max_data; + *ind_blocks = max_blocks - max_data; + *len = ((loff_t)max_data - 3) << sdp->sd_sb.sb_bsize_shift; + if (*len > max) { + *len = max; + gfs2_write_calc_reserv(ip, max, data_blocks, ind_blocks); + } +} + +static long gfs2_fallocate(struct file *file, int mode, loff_t offset, + loff_t len) +{ + struct inode *inode = file->f_path.dentry->d_inode; + struct gfs2_sbd *sdp = GFS2_SB(inode); + struct gfs2_inode *ip = GFS2_I(inode); + unsigned int data_blocks = 0, ind_blocks = 0, rblocks; + loff_t bytes, max_bytes; + struct gfs2_alloc *al; + int error; + loff_t next = (offset + len - 1) >> sdp->sd_sb.sb_bsize_shift; + next = (next + 1) << sdp->sd_sb.sb_bsize_shift; + + /* We only support the FALLOC_FL_KEEP_SIZE mode */ + if (mode & ~FALLOC_FL_KEEP_SIZE) + return -EOPNOTSUPP; + + offset = (offset >> sdp->sd_sb.sb_bsize_shift) << + sdp->sd_sb.sb_bsize_shift; + + len = next - offset; + bytes = sdp->sd_max_rg_data * sdp->sd_sb.sb_bsize / 2; + if (!bytes) + bytes = UINT_MAX; + + gfs2_holder_init(ip->i_gl, LM_ST_EXCLUSIVE, 0, &ip->i_gh); + error = gfs2_glock_nq(&ip->i_gh); + if (unlikely(error)) + goto out_uninit; + + if (!gfs2_write_alloc_required(ip, offset, len)) + goto out_unlock; + + while (len > 0) { + if (len < bytes) + bytes = len; + al = gfs2_alloc_get(ip); + if (!al) { + error = -ENOMEM; + goto out_unlock; + } + + error = gfs2_quota_lock_check(ip); + if (error) + goto out_alloc_put; + +retry: + gfs2_write_calc_reserv(ip, bytes, &data_blocks, &ind_blocks); + + al->al_requested = data_blocks + ind_blocks; + error = gfs2_inplace_reserve(ip); + if (error) { + if (error == -ENOSPC && bytes > sdp->sd_sb.sb_bsize) { + bytes >>= 1; + goto retry; + } + goto out_qunlock; + } + max_bytes = bytes; + calc_max_reserv(ip, len, &max_bytes, &data_blocks, &ind_blocks); + al->al_requested = data_blocks + ind_blocks; + + rblocks = RES_DINODE + ind_blocks + RES_STATFS + RES_QUOTA + + RES_RG_HDR + gfs2_rg_blocks(al); + if (gfs2_is_jdata(ip)) + rblocks += data_blocks ? data_blocks : 1; + + error = gfs2_trans_begin(sdp, rblocks, + PAGE_CACHE_SIZE/sdp->sd_sb.sb_bsize); + if (error) + goto out_trans_fail; + + error = fallocate_chunk(inode, offset, max_bytes, mode); + gfs2_trans_end(sdp); + + if (error) + goto out_trans_fail; + + len -= max_bytes; + offset += max_bytes; + gfs2_inplace_release(ip); + gfs2_quota_unlock(ip); + gfs2_alloc_put(ip); + } + goto out_unlock; + +out_trans_fail: + gfs2_inplace_release(ip); +out_qunlock: + gfs2_quota_unlock(ip); +out_alloc_put: + gfs2_alloc_put(ip); +out_unlock: + gfs2_glock_dq(&ip->i_gh); +out_uninit: + gfs2_holder_uninit(&ip->i_gh); + return error; +} + #ifdef CONFIG_GFS2_FS_LOCKING_DLM /** @@ -765,6 +1021,7 @@ const struct file_operations gfs2_file_fops = { .splice_read = generic_file_splice_read, .splice_write = generic_file_splice_write, .setlease = gfs2_setlease, + .fallocate = gfs2_fallocate, }; const struct file_operations gfs2_dir_fops = { @@ -794,6 +1051,7 @@ const struct file_operations gfs2_file_fops_nolock = { .splice_read = generic_file_splice_read, .splice_write = generic_file_splice_write, .setlease = generic_setlease, + .fallocate = gfs2_fallocate, }; const struct file_operations gfs2_dir_fops_nolock = { diff --git a/fs/gfs2/ops_inode.c b/fs/gfs2/ops_inode.c index c09528c07f3d..d8b26ac2e20b 100644 --- a/fs/gfs2/ops_inode.c +++ b/fs/gfs2/ops_inode.c @@ -18,8 +18,6 @@ #include #include #include -#include -#include #include #include "gfs2.h" @@ -1257,261 +1255,6 @@ static int gfs2_removexattr(struct dentry *dentry, const char *name) return ret; } -static void empty_write_end(struct page *page, unsigned from, - unsigned to) -{ - struct gfs2_inode *ip = GFS2_I(page->mapping->host); - - page_zero_new_buffers(page, from, to); - flush_dcache_page(page); - mark_page_accessed(page); - - if (!gfs2_is_writeback(ip)) - gfs2_page_add_databufs(ip, page, from, to); - - block_commit_write(page, from, to); -} - - -static int write_empty_blocks(struct page *page, unsigned from, unsigned to) -{ - unsigned start, end, next; - struct buffer_head *bh, *head; - int error; - - if (!page_has_buffers(page)) { - error = __block_write_begin(page, from, to - from, gfs2_block_map); - if (unlikely(error)) - return error; - - empty_write_end(page, from, to); - return 0; - } - - bh = head = page_buffers(page); - next = end = 0; - while (next < from) { - next += bh->b_size; - bh = bh->b_this_page; - } - start = next; - do { - next += bh->b_size; - if (buffer_mapped(bh)) { - if (end) { - error = __block_write_begin(page, start, end - start, - gfs2_block_map); - if (unlikely(error)) - return error; - empty_write_end(page, start, end); - end = 0; - } - start = next; - } - else - end = next; - bh = bh->b_this_page; - } while (next < to); - - if (end) { - error = __block_write_begin(page, start, end - start, gfs2_block_map); - if (unlikely(error)) - return error; - empty_write_end(page, start, end); - } - - return 0; -} - -static int fallocate_chunk(struct inode *inode, loff_t offset, loff_t len, - int mode) -{ - struct gfs2_inode *ip = GFS2_I(inode); - struct buffer_head *dibh; - int error; - u64 start = offset >> PAGE_CACHE_SHIFT; - unsigned int start_offset = offset & ~PAGE_CACHE_MASK; - u64 end = (offset + len - 1) >> PAGE_CACHE_SHIFT; - pgoff_t curr; - struct page *page; - unsigned int end_offset = (offset + len) & ~PAGE_CACHE_MASK; - unsigned int from, to; - - if (!end_offset) - end_offset = PAGE_CACHE_SIZE; - - error = gfs2_meta_inode_buffer(ip, &dibh); - if (unlikely(error)) - goto out; - - gfs2_trans_add_bh(ip->i_gl, dibh, 1); - - if (gfs2_is_stuffed(ip)) { - error = gfs2_unstuff_dinode(ip, NULL); - if (unlikely(error)) - goto out; - } - - curr = start; - offset = start << PAGE_CACHE_SHIFT; - from = start_offset; - to = PAGE_CACHE_SIZE; - while (curr <= end) { - page = grab_cache_page_write_begin(inode->i_mapping, curr, - AOP_FLAG_NOFS); - if (unlikely(!page)) { - error = -ENOMEM; - goto out; - } - - if (curr == end) - to = end_offset; - error = write_empty_blocks(page, from, to); - if (!error && offset + to > inode->i_size && - !(mode & FALLOC_FL_KEEP_SIZE)) { - i_size_write(inode, offset + to); - } - unlock_page(page); - page_cache_release(page); - if (error) - goto out; - curr++; - offset += PAGE_CACHE_SIZE; - from = 0; - } - - gfs2_dinode_out(ip, dibh->b_data); - mark_inode_dirty(inode); - - brelse(dibh); - -out: - return error; -} - -static void calc_max_reserv(struct gfs2_inode *ip, loff_t max, loff_t *len, - unsigned int *data_blocks, unsigned int *ind_blocks) -{ - const struct gfs2_sbd *sdp = GFS2_SB(&ip->i_inode); - unsigned int max_blocks = ip->i_alloc->al_rgd->rd_free_clone; - unsigned int tmp, max_data = max_blocks - 3 * (sdp->sd_max_height - 1); - - for (tmp = max_data; tmp > sdp->sd_diptrs;) { - tmp = DIV_ROUND_UP(tmp, sdp->sd_inptrs); - max_data -= tmp; - } - /* This calculation isn't the exact reverse of gfs2_write_calc_reserve, - so it might end up with fewer data blocks */ - if (max_data <= *data_blocks) - return; - *data_blocks = max_data; - *ind_blocks = max_blocks - max_data; - *len = ((loff_t)max_data - 3) << sdp->sd_sb.sb_bsize_shift; - if (*len > max) { - *len = max; - gfs2_write_calc_reserv(ip, max, data_blocks, ind_blocks); - } -} - -static long gfs2_fallocate(struct inode *inode, int mode, loff_t offset, - loff_t len) -{ - struct gfs2_sbd *sdp = GFS2_SB(inode); - struct gfs2_inode *ip = GFS2_I(inode); - unsigned int data_blocks = 0, ind_blocks = 0, rblocks; - loff_t bytes, max_bytes; - struct gfs2_alloc *al; - int error; - loff_t next = (offset + len - 1) >> sdp->sd_sb.sb_bsize_shift; - next = (next + 1) << sdp->sd_sb.sb_bsize_shift; - - /* We only support the FALLOC_FL_KEEP_SIZE mode */ - if (mode & ~FALLOC_FL_KEEP_SIZE) - return -EOPNOTSUPP; - - offset = (offset >> sdp->sd_sb.sb_bsize_shift) << - sdp->sd_sb.sb_bsize_shift; - - len = next - offset; - bytes = sdp->sd_max_rg_data * sdp->sd_sb.sb_bsize / 2; - if (!bytes) - bytes = UINT_MAX; - - gfs2_holder_init(ip->i_gl, LM_ST_EXCLUSIVE, 0, &ip->i_gh); - error = gfs2_glock_nq(&ip->i_gh); - if (unlikely(error)) - goto out_uninit; - - if (!gfs2_write_alloc_required(ip, offset, len)) - goto out_unlock; - - while (len > 0) { - if (len < bytes) - bytes = len; - al = gfs2_alloc_get(ip); - if (!al) { - error = -ENOMEM; - goto out_unlock; - } - - error = gfs2_quota_lock_check(ip); - if (error) - goto out_alloc_put; - -retry: - gfs2_write_calc_reserv(ip, bytes, &data_blocks, &ind_blocks); - - al->al_requested = data_blocks + ind_blocks; - error = gfs2_inplace_reserve(ip); - if (error) { - if (error == -ENOSPC && bytes > sdp->sd_sb.sb_bsize) { - bytes >>= 1; - goto retry; - } - goto out_qunlock; - } - max_bytes = bytes; - calc_max_reserv(ip, len, &max_bytes, &data_blocks, &ind_blocks); - al->al_requested = data_blocks + ind_blocks; - - rblocks = RES_DINODE + ind_blocks + RES_STATFS + RES_QUOTA + - RES_RG_HDR + gfs2_rg_blocks(al); - if (gfs2_is_jdata(ip)) - rblocks += data_blocks ? data_blocks : 1; - - error = gfs2_trans_begin(sdp, rblocks, - PAGE_CACHE_SIZE/sdp->sd_sb.sb_bsize); - if (error) - goto out_trans_fail; - - error = fallocate_chunk(inode, offset, max_bytes, mode); - gfs2_trans_end(sdp); - - if (error) - goto out_trans_fail; - - len -= max_bytes; - offset += max_bytes; - gfs2_inplace_release(ip); - gfs2_quota_unlock(ip); - gfs2_alloc_put(ip); - } - goto out_unlock; - -out_trans_fail: - gfs2_inplace_release(ip); -out_qunlock: - gfs2_quota_unlock(ip); -out_alloc_put: - gfs2_alloc_put(ip); -out_unlock: - gfs2_glock_dq(&ip->i_gh); -out_uninit: - gfs2_holder_uninit(&ip->i_gh); - return error; -} - - static int gfs2_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo, u64 start, u64 len) { @@ -1562,7 +1305,6 @@ const struct inode_operations gfs2_file_iops = { .getxattr = gfs2_getxattr, .listxattr = gfs2_listxattr, .removexattr = gfs2_removexattr, - .fallocate = gfs2_fallocate, .fiemap = gfs2_fiemap, }; diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c index cf254ce8c941..a6651956482e 100644 --- a/fs/ocfs2/file.c +++ b/fs/ocfs2/file.c @@ -1989,9 +1989,10 @@ int ocfs2_change_file_space(struct file *file, unsigned int cmd, return __ocfs2_change_file_space(file, inode, file->f_pos, cmd, sr, 0); } -static long ocfs2_fallocate(struct inode *inode, int mode, loff_t offset, +static long ocfs2_fallocate(struct file *file, int mode, loff_t offset, loff_t len) { + struct inode *inode = file->f_path.dentry->d_inode; struct ocfs2_super *osb = OCFS2_SB(inode->i_sb); struct ocfs2_space_resv sr; int change_size = 1; @@ -2002,9 +2003,6 @@ static long ocfs2_fallocate(struct inode *inode, int mode, loff_t offset, if (!ocfs2_writes_unwritten_extents(osb)) return -EOPNOTSUPP; - if (S_ISDIR(inode->i_mode)) - return -ENODEV; - if (mode & FALLOC_FL_KEEP_SIZE) change_size = 0; @@ -2612,7 +2610,6 @@ const struct inode_operations ocfs2_file_iops = { .getxattr = generic_getxattr, .listxattr = ocfs2_listxattr, .removexattr = generic_removexattr, - .fallocate = ocfs2_fallocate, .fiemap = ocfs2_fiemap, }; @@ -2644,6 +2641,7 @@ const struct file_operations ocfs2_fops = { .flock = ocfs2_flock, .splice_read = ocfs2_file_splice_read, .splice_write = ocfs2_file_splice_write, + .fallocate = ocfs2_fallocate, }; const struct file_operations ocfs2_dops = { diff --git a/fs/open.c b/fs/open.c index 5b6ef7e2859e..e52389e1f05b 100644 --- a/fs/open.c +++ b/fs/open.c @@ -255,10 +255,10 @@ int do_fallocate(struct file *file, int mode, loff_t offset, loff_t len) if (((offset + len) > inode->i_sb->s_maxbytes) || ((offset + len) < 0)) return -EFBIG; - if (!inode->i_op->fallocate) + if (!file->f_op->fallocate) return -EOPNOTSUPP; - return inode->i_op->fallocate(inode, mode, offset, len); + return file->f_op->fallocate(file, mode, offset, len); } SYSCALL_DEFINE(fallocate)(int fd, int mode, loff_t offset, loff_t len) diff --git a/fs/xfs/linux-2.6/xfs_file.c b/fs/xfs/linux-2.6/xfs_file.c index ef51eb43e137..a55c1b46b219 100644 --- a/fs/xfs/linux-2.6/xfs_file.c +++ b/fs/xfs/linux-2.6/xfs_file.c @@ -37,6 +37,7 @@ #include "xfs_trace.h" #include +#include static const struct vm_operations_struct xfs_file_vm_ops; @@ -882,6 +883,60 @@ out_unlock: return ret; } +STATIC long +xfs_file_fallocate( + struct file *file, + int mode, + loff_t offset, + loff_t len) +{ + struct inode *inode = file->f_path.dentry->d_inode; + long error; + loff_t new_size = 0; + xfs_flock64_t bf; + xfs_inode_t *ip = XFS_I(inode); + int cmd = XFS_IOC_RESVSP; + + if (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE)) + return -EOPNOTSUPP; + + bf.l_whence = 0; + bf.l_start = offset; + bf.l_len = len; + + xfs_ilock(ip, XFS_IOLOCK_EXCL); + + if (mode & FALLOC_FL_PUNCH_HOLE) + cmd = XFS_IOC_UNRESVSP; + + /* check the new inode size is valid before allocating */ + if (!(mode & FALLOC_FL_KEEP_SIZE) && + offset + len > i_size_read(inode)) { + new_size = offset + len; + error = inode_newsize_ok(inode, new_size); + if (error) + goto out_unlock; + } + + error = -xfs_change_file_space(ip, cmd, &bf, 0, XFS_ATTR_NOLOCK); + if (error) + goto out_unlock; + + /* Change file size if needed */ + if (new_size) { + struct iattr iattr; + + iattr.ia_valid = ATTR_SIZE; + iattr.ia_size = new_size; + error = -xfs_setattr(ip, &iattr, XFS_ATTR_NOLOCK); + } + +out_unlock: + xfs_iunlock(ip, XFS_IOLOCK_EXCL); + return error; +} + + STATIC int xfs_file_open( struct inode *inode, @@ -1000,6 +1055,7 @@ const struct file_operations xfs_file_operations = { .open = xfs_file_open, .release = xfs_file_release, .fsync = xfs_file_fsync, + .fallocate = xfs_file_fallocate, }; const struct file_operations xfs_dir_file_operations = { diff --git a/fs/xfs/linux-2.6/xfs_iops.c b/fs/xfs/linux-2.6/xfs_iops.c index a4ecc2188a09..bd5727852fd6 100644 --- a/fs/xfs/linux-2.6/xfs_iops.c +++ b/fs/xfs/linux-2.6/xfs_iops.c @@ -46,7 +46,6 @@ #include #include #include -#include #include #include @@ -505,64 +504,6 @@ xfs_vn_setattr( return -xfs_setattr(XFS_I(dentry->d_inode), iattr, 0); } -STATIC long -xfs_vn_fallocate( - struct inode *inode, - int mode, - loff_t offset, - loff_t len) -{ - long error; - loff_t new_size = 0; - xfs_flock64_t bf; - xfs_inode_t *ip = XFS_I(inode); - int cmd = XFS_IOC_RESVSP; - - if (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE)) - return -EOPNOTSUPP; - - /* preallocation on directories not yet supported */ - error = -ENODEV; - if (S_ISDIR(inode->i_mode)) - goto out_error; - - bf.l_whence = 0; - bf.l_start = offset; - bf.l_len = len; - - xfs_ilock(ip, XFS_IOLOCK_EXCL); - - if (mode & FALLOC_FL_PUNCH_HOLE) - cmd = XFS_IOC_UNRESVSP; - - /* check the new inode size is valid before allocating */ - if (!(mode & FALLOC_FL_KEEP_SIZE) && - offset + len > i_size_read(inode)) { - new_size = offset + len; - error = inode_newsize_ok(inode, new_size); - if (error) - goto out_unlock; - } - - error = -xfs_change_file_space(ip, cmd, &bf, 0, XFS_ATTR_NOLOCK); - if (error) - goto out_unlock; - - /* Change file size if needed */ - if (new_size) { - struct iattr iattr; - - iattr.ia_valid = ATTR_SIZE; - iattr.ia_size = new_size; - error = -xfs_setattr(ip, &iattr, XFS_ATTR_NOLOCK); - } - -out_unlock: - xfs_iunlock(ip, XFS_IOLOCK_EXCL); -out_error: - return error; -} - #define XFS_FIEMAP_FLAGS (FIEMAP_FLAG_SYNC|FIEMAP_FLAG_XATTR) /* @@ -656,7 +597,6 @@ static const struct inode_operations xfs_inode_operations = { .getxattr = generic_getxattr, .removexattr = generic_removexattr, .listxattr = xfs_vn_listxattr, - .fallocate = xfs_vn_fallocate, .fiemap = xfs_vn_fiemap, }; diff --git a/include/linux/fs.h b/include/linux/fs.h index 177b4ddea418..09b5bd6a7c6b 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -1552,6 +1552,8 @@ struct file_operations { ssize_t (*splice_write)(struct pipe_inode_info *, struct file *, loff_t *, size_t, unsigned int); ssize_t (*splice_read)(struct file *, loff_t *, struct pipe_inode_info *, size_t, unsigned int); int (*setlease)(struct file *, long, struct file_lock **); + long (*fallocate)(struct file *file, int mode, loff_t offset, + loff_t len); }; #define IPERM_FLAG_RCU 0x0001 @@ -1582,8 +1584,6 @@ struct inode_operations { ssize_t (*listxattr) (struct dentry *, char *, size_t); int (*removexattr) (struct dentry *, const char *); void (*truncate_range)(struct inode *, loff_t, loff_t); - long (*fallocate)(struct inode *inode, int mode, loff_t offset, - loff_t len); int (*fiemap)(struct inode *, struct fiemap_extent_info *, u64 start, u64 len); } ____cacheline_aligned; -- cgit v1.2.3 From 379c4bf1d6e184ecb8ff72f83b7c81588cfa18f8 Mon Sep 17 00:00:00 2001 From: Seungwhan Youn Date: Thu, 13 Jan 2011 11:08:21 +0900 Subject: ASoC: documentation updates This patch is only for RFC purpose of ASoC documentation updates which match with current ASoC codes with documents. Mostly modify features are modified to be sync with changes after multi-component patches. Signed-off-by: Seungwhan Youn Acked-by: Liam Girdwood Signed-off-by: Mark Brown --- Documentation/sound/alsa/soc/codec.txt | 45 +++++++++++++++---------------- Documentation/sound/alsa/soc/machine.txt | 38 +++++++------------------- Documentation/sound/alsa/soc/platform.txt | 12 +++++++-- 3 files changed, 40 insertions(+), 55 deletions(-) (limited to 'Documentation') diff --git a/Documentation/sound/alsa/soc/codec.txt b/Documentation/sound/alsa/soc/codec.txt index 37ba3a72cb76..bce23a4a7875 100644 --- a/Documentation/sound/alsa/soc/codec.txt +++ b/Documentation/sound/alsa/soc/codec.txt @@ -27,42 +27,38 @@ ASoC Codec driver breakdown 1 - Codec DAI and PCM configuration ----------------------------------- -Each codec driver must have a struct snd_soc_codec_dai to define its DAI and +Each codec driver must have a struct snd_soc_dai_driver to define its DAI and PCM capabilities and operations. This struct is exported so that it can be registered with the core by your machine driver. e.g. -struct snd_soc_codec_dai wm8731_dai = { - .name = "WM8731", - /* playback capabilities */ +static struct snd_soc_dai_ops wm8731_dai_ops = { + .prepare = wm8731_pcm_prepare, + .hw_params = wm8731_hw_params, + .shutdown = wm8731_shutdown, + .digital_mute = wm8731_mute, + .set_sysclk = wm8731_set_dai_sysclk, + .set_fmt = wm8731_set_dai_fmt, +}; + +struct snd_soc_dai_driver wm8731_dai = { + .name = "wm8731-hifi", .playback = { .stream_name = "Playback", .channels_min = 1, .channels_max = 2, .rates = WM8731_RATES, .formats = WM8731_FORMATS,}, - /* capture capabilities */ .capture = { .stream_name = "Capture", .channels_min = 1, .channels_max = 2, .rates = WM8731_RATES, .formats = WM8731_FORMATS,}, - /* pcm operations - see section 4 below */ - .ops = { - .prepare = wm8731_pcm_prepare, - .hw_params = wm8731_hw_params, - .shutdown = wm8731_shutdown, - }, - /* DAI operations - see DAI.txt */ - .dai_ops = { - .digital_mute = wm8731_mute, - .set_sysclk = wm8731_set_dai_sysclk, - .set_fmt = wm8731_set_dai_fmt, - } + .ops = &wm8731_dai_ops, + .symmetric_rates = 1, }; -EXPORT_SYMBOL_GPL(wm8731_dai); 2 - Codec control IO @@ -186,13 +182,14 @@ when the mute is applied or freed. i.e. -static int wm8974_mute(struct snd_soc_codec *codec, - struct snd_soc_codec_dai *dai, int mute) +static int wm8974_mute(struct snd_soc_dai *dai, int mute) { - u16 mute_reg = wm8974_read_reg_cache(codec, WM8974_DAC) & 0xffbf; - if(mute) - wm8974_write(codec, WM8974_DAC, mute_reg | 0x40); + struct snd_soc_codec *codec = dai->codec; + u16 mute_reg = snd_soc_read(codec, WM8974_DAC) & 0xffbf; + + if (mute) + snd_soc_write(codec, WM8974_DAC, mute_reg | 0x40); else - wm8974_write(codec, WM8974_DAC, mute_reg); + snd_soc_write(codec, WM8974_DAC, mute_reg); return 0; } diff --git a/Documentation/sound/alsa/soc/machine.txt b/Documentation/sound/alsa/soc/machine.txt index 2524c75557df..3e2ec9cbf397 100644 --- a/Documentation/sound/alsa/soc/machine.txt +++ b/Documentation/sound/alsa/soc/machine.txt @@ -12,6 +12,8 @@ the following struct:- struct snd_soc_card { char *name; + ... + int (*probe)(struct platform_device *pdev); int (*remove)(struct platform_device *pdev); @@ -22,12 +24,13 @@ struct snd_soc_card { int (*resume_pre)(struct platform_device *pdev); int (*resume_post)(struct platform_device *pdev); - /* machine stream operations */ - struct snd_soc_ops *ops; + ... /* CPU <--> Codec DAI links */ struct snd_soc_dai_link *dai_link; int num_links; + + ... }; probe()/remove() @@ -42,11 +45,6 @@ of any machine audio tasks that have to be done before or after the codec, DAIs and DMA is suspended and resumed. Optional. -Machine operations ------------------- -The machine specific audio operations can be set here. Again this is optional. - - Machine DAI Configuration ------------------------- The machine DAI configuration glues all the codec and CPU DAIs together. It can @@ -61,8 +59,10 @@ struct snd_soc_dai_link is used to set up each DAI in your machine. e.g. static struct snd_soc_dai_link corgi_dai = { .name = "WM8731", .stream_name = "WM8731", - .cpu_dai = &pxa_i2s_dai, - .codec_dai = &wm8731_dai, + .cpu_dai_name = "pxa-is2-dai", + .codec_dai_name = "wm8731-hifi", + .platform_name = "pxa-pcm-audio", + .codec_name = "wm8713-codec.0-001a", .init = corgi_wm8731_init, .ops = &corgi_ops, }; @@ -77,26 +77,6 @@ static struct snd_soc_card snd_soc_corgi = { }; -Machine Audio Subsystem ------------------------ - -The machine soc device glues the platform, machine and codec driver together. -Private data can also be set here. e.g. - -/* corgi audio private data */ -static struct wm8731_setup_data corgi_wm8731_setup = { - .i2c_address = 0x1b, -}; - -/* corgi audio subsystem */ -static struct snd_soc_device corgi_snd_devdata = { - .machine = &snd_soc_corgi, - .platform = &pxa2xx_soc_platform, - .codec_dev = &soc_codec_dev_wm8731, - .codec_data = &corgi_wm8731_setup, -}; - - Machine Power Map ----------------- diff --git a/Documentation/sound/alsa/soc/platform.txt b/Documentation/sound/alsa/soc/platform.txt index 06d835987c6a..d57efad37e0a 100644 --- a/Documentation/sound/alsa/soc/platform.txt +++ b/Documentation/sound/alsa/soc/platform.txt @@ -20,9 +20,10 @@ struct snd_soc_ops { int (*trigger)(struct snd_pcm_substream *, int); }; -The platform driver exports its DMA functionality via struct snd_soc_platform:- +The platform driver exports its DMA functionality via struct +snd_soc_platform_driver:- -struct snd_soc_platform { +struct snd_soc_platform_driver { char *name; int (*probe)(struct platform_device *pdev); @@ -34,6 +35,13 @@ struct snd_soc_platform { int (*pcm_new)(struct snd_card *, struct snd_soc_codec_dai *, struct snd_pcm *); void (*pcm_free)(struct snd_pcm *); + /* + * For platform caused delay reporting. + * Optional. + */ + snd_pcm_sframes_t (*delay)(struct snd_pcm_substream *, + struct snd_soc_dai *); + /* platform stream ops */ struct snd_pcm_ops *pcm_ops; }; -- cgit v1.2.3 From c7bf71c517abfc3b15970d67910e0f62e0522939 Mon Sep 17 00:00:00 2001 From: Guenter Roeck Date: Mon, 17 Jan 2011 12:48:20 -0800 Subject: hwmon: (lm93) Add support for LM94 This patch adds basic support for LM94 to the LM93 driver. LM94 specific sensors and features are not supported. Signed-off-by: Guenter Roeck Acked-by: Jean Delvare --- Documentation/hwmon/lm93 | 7 +++++++ drivers/hwmon/Kconfig | 4 ++-- drivers/hwmon/lm93.c | 21 +++++++++++++++++++-- 3 files changed, 28 insertions(+), 4 deletions(-) (limited to 'Documentation') diff --git a/Documentation/hwmon/lm93 b/Documentation/hwmon/lm93 index 7a10616d0b44..f3b2ad2ceb01 100644 --- a/Documentation/hwmon/lm93 +++ b/Documentation/hwmon/lm93 @@ -6,6 +6,10 @@ Supported chips: Prefix 'lm93' Addresses scanned: I2C 0x2c-0x2e Datasheet: http://www.national.com/ds.cgi/LM/LM93.pdf + * National Semiconductor LM94 + Prefix 'lm94' + Addresses scanned: I2C 0x2c-0x2e + Datasheet: http://www.national.com/ds.cgi/LM/LM94.pdf Authors: Mark M. Hoffman @@ -56,6 +60,9 @@ previous motherboard management ASICs and uses some of the LM85's features for dynamic Vccp monitoring and PROCHOT. It is designed to monitor a dual processor Xeon class motherboard with a minimum of external components. +LM94 is also supported in LM93 compatible mode. Extra sensors and features of +LM94 are not supported. + User Interface -------------- diff --git a/drivers/hwmon/Kconfig b/drivers/hwmon/Kconfig index 35f00dae3676..773e484f1646 100644 --- a/drivers/hwmon/Kconfig +++ b/drivers/hwmon/Kconfig @@ -618,8 +618,8 @@ config SENSORS_LM93 depends on I2C select HWMON_VID help - If you say yes here you get support for National Semiconductor LM93 - sensor chips. + If you say yes here you get support for National Semiconductor LM93, + LM94, and compatible sensor chips. This driver can also be built as a module. If so, the module will be called lm93. diff --git a/drivers/hwmon/lm93.c b/drivers/hwmon/lm93.c index c9ed14eba5a6..3b43df418613 100644 --- a/drivers/hwmon/lm93.c +++ b/drivers/hwmon/lm93.c @@ -135,6 +135,11 @@ #define LM93_MFR_ID 0x73 #define LM93_MFR_ID_PROTOTYPE 0x72 +/* LM94 REGISTER VALUES */ +#define LM94_MFR_ID_2 0x7a +#define LM94_MFR_ID 0x79 +#define LM94_MFR_ID_PROTOTYPE 0x78 + /* SMBus capabilities */ #define LM93_SMBUS_FUNC_FULL (I2C_FUNC_SMBUS_BYTE_DATA | \ I2C_FUNC_SMBUS_WORD_DATA | I2C_FUNC_SMBUS_BLOCK_DATA) @@ -2504,6 +2509,7 @@ static int lm93_detect(struct i2c_client *client, struct i2c_board_info *info) { struct i2c_adapter *adapter = client->adapter; int mfr, ver; + const char *name; if (!i2c_check_functionality(adapter, LM93_SMBUS_FUNC_MIN)) return -ENODEV; @@ -2517,13 +2523,23 @@ static int lm93_detect(struct i2c_client *client, struct i2c_board_info *info) } ver = lm93_read_byte(client, LM93_REG_VER); - if (ver != LM93_MFR_ID && ver != LM93_MFR_ID_PROTOTYPE) { + switch (ver) { + case LM93_MFR_ID: + case LM93_MFR_ID_PROTOTYPE: + name = "lm93"; + break; + case LM94_MFR_ID_2: + case LM94_MFR_ID: + case LM94_MFR_ID_PROTOTYPE: + name = "lm94"; + break; + default: dev_dbg(&adapter->dev, "detect failed, bad version id 0x%02x!\n", ver); return -ENODEV; } - strlcpy(info->type, "lm93", I2C_NAME_SIZE); + strlcpy(info->type, name, I2C_NAME_SIZE); dev_dbg(&adapter->dev,"loading %s at %d,0x%02x\n", client->name, i2c_adapter_id(client->adapter), client->addr); @@ -2602,6 +2618,7 @@ static int lm93_remove(struct i2c_client *client) static const struct i2c_device_id lm93_id[] = { { "lm93", 0 }, + { "lm94", 0 }, { } }; MODULE_DEVICE_TABLE(i2c, lm93_id); -- cgit v1.2.3 From 2a863793beaa0fc9ee7aeb87efe85544a6b129c0 Mon Sep 17 00:00:00 2001 From: Hans Verkuil Date: Tue, 11 Jan 2011 14:45:03 -0300 Subject: [media] v4l2-ctrls: v4l2_ctrl_handler_setup must set is_new to 1 Renamed has_new to is_new. Drivers can use the is_new field to determine if a new value was specified for a control. The v4l2_ctrl_handler_setup() must always set this to 1 since the setup has to force a full update of all controls. Signed-off-by: Hans Verkuil Acked-by: Laurent Pinchart Signed-off-by: Mauro Carvalho Chehab --- Documentation/video4linux/v4l2-controls.txt | 12 ++++++++++++ drivers/media/video/v4l2-ctrls.c | 24 ++++++++++++++---------- include/media/v4l2-ctrls.h | 6 ++++-- 3 files changed, 30 insertions(+), 12 deletions(-) (limited to 'Documentation') diff --git a/Documentation/video4linux/v4l2-controls.txt b/Documentation/video4linux/v4l2-controls.txt index 8773778d23fc..881e7f44491b 100644 --- a/Documentation/video4linux/v4l2-controls.txt +++ b/Documentation/video4linux/v4l2-controls.txt @@ -285,6 +285,9 @@ implement g_volatile_ctrl like this: The 'new value' union is not used in g_volatile_ctrl. In general controls that need to implement g_volatile_ctrl are read-only controls. +Note that if one or more controls in a control cluster are marked as volatile, +then all the controls in the cluster are seen as volatile. + To mark a control as volatile you have to set the is_volatile flag: ctrl = v4l2_ctrl_new_std(&sd->ctrl_handler, ...); @@ -462,6 +465,15 @@ pointer to the v4l2_ctrl_ops struct that is used for that cluster. Obviously, all controls in the cluster array must be initialized to either a valid control or to NULL. +In rare cases you might want to know which controls of a cluster actually +were set explicitly by the user. For this you can check the 'is_new' flag of +each control. For example, in the case of a volume/mute cluster the 'is_new' +flag of the mute control would be set if the user called VIDIOC_S_CTRL for +mute only. If the user would call VIDIOC_S_EXT_CTRLS for both mute and volume +controls, then the 'is_new' flag would be 1 for both controls. + +The 'is_new' flag is always 1 when called from v4l2_ctrl_handler_setup(). + VIDIOC_LOG_STATUS Support ========================= diff --git a/drivers/media/video/v4l2-ctrls.c b/drivers/media/video/v4l2-ctrls.c index 8f81efcfcf56..0d1a3d831374 100644 --- a/drivers/media/video/v4l2-ctrls.c +++ b/drivers/media/video/v4l2-ctrls.c @@ -569,7 +569,7 @@ static int user_to_new(struct v4l2_ext_control *c, int ret; u32 size; - ctrl->has_new = 1; + ctrl->is_new = 1; switch (ctrl->type) { case V4L2_CTRL_TYPE_INTEGER64: ctrl->val64 = c->value64; @@ -1280,8 +1280,12 @@ int v4l2_ctrl_handler_setup(struct v4l2_ctrl_handler *hdl) if (ctrl->done) continue; - for (i = 0; i < master->ncontrols; i++) - cur_to_new(master->cluster[i]); + for (i = 0; i < master->ncontrols; i++) { + if (master->cluster[i]) { + cur_to_new(master->cluster[i]); + master->cluster[i]->is_new = 1; + } + } /* Skip button controls and read-only controls. */ if (ctrl->type == V4L2_CTRL_TYPE_BUTTON || @@ -1645,7 +1649,7 @@ static int try_or_set_control_cluster(struct v4l2_ctrl *master, bool set) if (ctrl == NULL) continue; - if (ctrl->has_new) { + if (ctrl->is_new) { /* Double check this: it may have changed since the last check in try_or_set_ext_ctrls(). */ if (set && (ctrl->flags & V4L2_CTRL_FLAG_GRABBED)) @@ -1719,13 +1723,13 @@ static int try_or_set_ext_ctrls(struct v4l2_ctrl_handler *hdl, v4l2_ctrl_lock(ctrl); - /* Reset the 'has_new' flags of the cluster */ + /* Reset the 'is_new' flags of the cluster */ for (j = 0; j < master->ncontrols; j++) if (master->cluster[j]) - master->cluster[j]->has_new = 0; + master->cluster[j]->is_new = 0; /* Copy the new caller-supplied control values. - user_to_new() sets 'has_new' to 1. */ + user_to_new() sets 'is_new' to 1. */ ret = cluster_walk(i, cs, helpers, user_to_new); if (!ret) @@ -1822,13 +1826,13 @@ static int set_ctrl(struct v4l2_ctrl *ctrl, s32 *val) v4l2_ctrl_lock(ctrl); - /* Reset the 'has_new' flags of the cluster */ + /* Reset the 'is_new' flags of the cluster */ for (i = 0; i < master->ncontrols; i++) if (master->cluster[i]) - master->cluster[i]->has_new = 0; + master->cluster[i]->is_new = 0; ctrl->val = *val; - ctrl->has_new = 1; + ctrl->is_new = 1; ret = try_or_set_control_cluster(master, false); if (!ret) ret = try_or_set_control_cluster(master, true); diff --git a/include/media/v4l2-ctrls.h b/include/media/v4l2-ctrls.h index d69ab4aae032..fcc9a0cf8ff1 100644 --- a/include/media/v4l2-ctrls.h +++ b/include/media/v4l2-ctrls.h @@ -53,8 +53,10 @@ struct v4l2_ctrl_ops { * @handler: The handler that owns the control. * @cluster: Point to start of cluster array. * @ncontrols: Number of controls in cluster array. - * @has_new: Internal flag: set when there is a valid new value. * @done: Internal flag: set for each processed control. + * @is_new: Set when the user specified a new value for this control. It + * is also set when called from v4l2_ctrl_handler_setup. Drivers + * should never set this flag. * @is_private: If set, then this control is private to its handler and it * will not be added to any other handlers. Drivers can set * this flag. @@ -97,9 +99,9 @@ struct v4l2_ctrl { struct v4l2_ctrl_handler *handler; struct v4l2_ctrl **cluster; unsigned ncontrols; - unsigned int has_new:1; unsigned int done:1; + unsigned int is_new:1; unsigned int is_private:1; unsigned int is_volatile:1; -- cgit v1.2.3 From 3a6be8d83c82253692f788a65f2d65aa33e37db1 Mon Sep 17 00:00:00 2001 From: Hans Verkuil Date: Sun, 16 Jan 2011 17:09:54 -0300 Subject: [media] DocBook/v4l: fix validation error in dev-rds.xml Signed-off-by: Hans Verkuil Signed-off-by: Mauro Carvalho Chehab --- Documentation/DocBook/v4l/dev-rds.xml | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) (limited to 'Documentation') diff --git a/Documentation/DocBook/v4l/dev-rds.xml b/Documentation/DocBook/v4l/dev-rds.xml index 360d2737e649..2427f54397e7 100644 --- a/Documentation/DocBook/v4l/dev-rds.xml +++ b/Documentation/DocBook/v4l/dev-rds.xml @@ -75,6 +75,7 @@ as follows:
+ RDS datastructures struct <structname>v4l2_rds_data</structname> @@ -129,10 +130,11 @@ as follows:
Block defines - + - + + V4L2_RDS_BLOCK_MSK -- cgit v1.2.3 From bda50bcd0cc21d9d6dd8c82628f763ab108260a6 Mon Sep 17 00:00:00 2001 From: Hans Verkuil Date: Sun, 16 Jan 2011 17:44:17 -0300 Subject: [media] DocBook/v4l: update V4L2 revision and update copyright years Signed-off-by: Hans Verkuil Signed-off-by: Mauro Carvalho Chehab --- Documentation/DocBook/dvb/dvbapi.xml | 2 +- Documentation/DocBook/media.tmpl | 4 ++-- Documentation/DocBook/v4l/v4l2.xml | 3 ++- 3 files changed, 5 insertions(+), 4 deletions(-) (limited to 'Documentation') diff --git a/Documentation/DocBook/dvb/dvbapi.xml b/Documentation/DocBook/dvb/dvbapi.xml index e3a97fdd62a6..ad8678d48916 100644 --- a/Documentation/DocBook/dvb/dvbapi.xml +++ b/Documentation/DocBook/dvb/dvbapi.xml @@ -28,7 +28,7 @@ Convergence GmbH - 2009-2010 + 2009-2011 Mauro Carvalho Chehab diff --git a/Documentation/DocBook/media.tmpl b/Documentation/DocBook/media.tmpl index f11048d4053f..a99088aae1aa 100644 --- a/Documentation/DocBook/media.tmpl +++ b/Documentation/DocBook/media.tmpl @@ -28,7 +28,7 @@ LINUX MEDIA INFRASTRUCTURE API - 2009-2010 + 2009-2011 LinuxTV Developers @@ -86,7 +86,7 @@ Foundation. A copy of the license is included in the chapter entitled - 2009-2010 + 2009-2011 Mauro Carvalho Chehab diff --git a/Documentation/DocBook/v4l/v4l2.xml b/Documentation/DocBook/v4l/v4l2.xml index 839e93e875ae..9288af96de34 100644 --- a/Documentation/DocBook/v4l/v4l2.xml +++ b/Documentation/DocBook/v4l/v4l2.xml @@ -100,6 +100,7 @@ Remote Controller chapter. 2008 2009 2010 + 2011 Bill Dirks, Michael H. Schimek, Hans Verkuil, Martin Rubli, Andy Walls, Muralidharan Karicheri, Mauro Carvalho Chehab @@ -381,7 +382,7 @@ and discussions on the V4L mailing list. Video for Linux Two API Specification - Revision 2.6.33 + Revision 2.6.38 &sub-common; -- cgit v1.2.3 From 8aeb36e8f6d7eaa9cafc970b700414205743b258 Mon Sep 17 00:00:00 2001 From: Philip Sanderson Date: Thu, 20 Jan 2011 21:37:28 -0600 Subject: lguest: --username and --chroot options I've attached a patch which implements dropping to privileges and chrooting to a directory. Signed-off-by: Rusty Russell --- Documentation/lguest/lguest.c | 50 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 50 insertions(+) (limited to 'Documentation') diff --git a/Documentation/lguest/lguest.c b/Documentation/lguest/lguest.c index dc73bc54cc4e..f64b85bcd6d4 100644 --- a/Documentation/lguest/lguest.c +++ b/Documentation/lguest/lguest.c @@ -39,6 +39,9 @@ #include #include #include +#include +#include + #include #include #include @@ -1872,6 +1875,8 @@ static struct option opts[] = { { "block", 1, NULL, 'b' }, { "rng", 0, NULL, 'r' }, { "initrd", 1, NULL, 'i' }, + { "username", 1, NULL, 'u' }, + { "chroot", 1, NULL, 'c' }, { NULL }, }; static void usage(void) @@ -1894,6 +1899,12 @@ int main(int argc, char *argv[]) /* If they specify an initrd file to load. */ const char *initrd_name = NULL; + /* Password structure for initgroups/setres[gu]id */ + struct passwd *user_details = NULL; + + /* Directory to chroot to */ + char *chroot_path = NULL; + /* Save the args: we "reboot" by execing ourselves again. */ main_args = argv; @@ -1950,6 +1961,14 @@ int main(int argc, char *argv[]) case 'i': initrd_name = optarg; break; + case 'u': + user_details = getpwnam(optarg); + if (!user_details) + err(1, "getpwnam failed, incorrect username?"); + break; + case 'c': + chroot_path = optarg; + break; default: warnx("Unknown argument %s", argv[optind]); usage(); @@ -2021,6 +2040,37 @@ int main(int argc, char *argv[]) /* If we exit via err(), this kills all the threads, restores tty. */ atexit(cleanup_devices); + /* If requested, chroot to a directory */ + if (chroot_path) { + if (chroot(chroot_path) != 0) + err(1, "chroot(\"%s\") failed", chroot_path); + + if (chdir("/") != 0) + err(1, "chdir(\"/\") failed"); + + verbose("chroot done\n"); + } + + /* If requested, drop privileges */ + if (user_details) { + uid_t u; + gid_t g; + + u = user_details->pw_uid; + g = user_details->pw_gid; + + if (initgroups(user_details->pw_name, g) != 0) + err(1, "initgroups failed"); + + if (setresgid(g, g, g) != 0) + err(1, "setresgid failed"); + + if (setresuid(u, u, u) != 0) + err(1, "setresuid failed"); + + verbose("Dropping privileges completed\n"); + } + /* Finally, run the Guest. This doesn't return. */ run_guest(); } -- cgit v1.2.3 From 5230ff0cccb0611830bb02b097535868df02752a Mon Sep 17 00:00:00 2001 From: Philip Sanderson Date: Thu, 20 Jan 2011 21:37:28 -0600 Subject: lguest: example launcher to use guard pages, drop PROT_EXEC, fix limit logic PROT_EXEC seems to be completely unnecessary (as the lguest binary never executes there), and will allow it to work with SELinux (and more importantly, PaX :-) as they can/do forbid writable and executable mappings. Also, map PROT_NONE guard pages at start and end of guest memory for extra paranoia. I changed the length check to addr + size > guest_limit because >= is wrong (addr of 0, size of getpagesize() with a guest_limit of getpagesize() would false positive). Signed-off-by: Rusty Russell --- Documentation/lguest/lguest.c | 23 +++++++++++++++-------- 1 file changed, 15 insertions(+), 8 deletions(-) (limited to 'Documentation') diff --git a/Documentation/lguest/lguest.c b/Documentation/lguest/lguest.c index f64b85bcd6d4..d9da7e148538 100644 --- a/Documentation/lguest/lguest.c +++ b/Documentation/lguest/lguest.c @@ -301,20 +301,27 @@ static void *map_zeroed_pages(unsigned int num) /* * We use a private mapping (ie. if we write to the page, it will be - * copied). + * copied). We allocate an extra two pages PROT_NONE to act as guard + * pages against read/write attempts that exceed allocated space. */ - addr = mmap(NULL, getpagesize() * num, - PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE, fd, 0); + addr = mmap(NULL, getpagesize() * (num+2), + PROT_NONE, MAP_PRIVATE, fd, 0); + if (addr == MAP_FAILED) err(1, "Mmapping %u pages of /dev/zero", num); + if (mprotect(addr + getpagesize(), getpagesize() * num, + PROT_READ|PROT_WRITE) == -1) + err(1, "mprotect rw %u pages failed", num); + /* * One neat mmap feature is that you can close the fd, and it * stays mapped. */ close(fd); - return addr; + /* Return address after PROT_NONE page */ + return addr + getpagesize(); } /* Get some more pages for a device. */ @@ -346,7 +353,7 @@ static void map_at(int fd, void *addr, unsigned long offset, unsigned long len) * done to it. This allows us to share untouched memory between * Guests. */ - if (mmap(addr, len, PROT_READ|PROT_WRITE|PROT_EXEC, + if (mmap(addr, len, PROT_READ|PROT_WRITE, MAP_FIXED|MAP_PRIVATE, fd, offset) != MAP_FAILED) return; @@ -576,10 +583,10 @@ static void *_check_pointer(unsigned long addr, unsigned int size, unsigned int line) { /* - * We have to separately check addr and addr+size, because size could - * be huge and addr + size might wrap around. + * Check if the requested address and size exceeds the allocated memory, + * or addr + size wraps around. */ - if (addr >= guest_limit || addr + size >= guest_limit) + if ((addr + size) > guest_limit || (addr + size) < addr) errx(1, "%s:%i: Invalid address %#lx", __FILE__, line, addr); /* * We return a pointer for the caller's convenience, now we know it's -- cgit v1.2.3 From 85c0647275b60380e19542d43420184e86418d86 Mon Sep 17 00:00:00 2001 From: Philip Sanderson Date: Thu, 20 Jan 2011 21:37:29 -0600 Subject: lguest: document --rng in example Launcher Rusty Russell wrote: > Ah, it will appear as /dev/hwrng. It's a weirdness of Linux that our actual > hardware number generators are not wired up to /dev/random... Reflected this in the documentation, thanks :-) Signed-off-by: Rusty Russell --- Documentation/lguest/lguest.txt | 5 +++++ 1 file changed, 5 insertions(+) (limited to 'Documentation') diff --git a/Documentation/lguest/lguest.txt b/Documentation/lguest/lguest.txt index 6ccaf8e1a00e..dad99978a6a8 100644 --- a/Documentation/lguest/lguest.txt +++ b/Documentation/lguest/lguest.txt @@ -117,6 +117,11 @@ Running Lguest: for general information on how to get bridging to work. +- Random number generation. Using the --rng option will provide a + /dev/hwrng in the guest that will read from the host's /dev/random. + Use this option in conjunction with rng-tools (see ../hw_random.txt) + to provide entropy to the guest kernel's /dev/random. + There is a helpful mailing list at http://ozlabs.org/mailman/listinfo/lguest Good luck! -- cgit v1.2.3 From 1c77ff22f539ceaa64ea43d6a26d867e84602cb7 Mon Sep 17 00:00:00 2001 From: Thomas Gleixner Date: Wed, 19 Jan 2011 19:41:35 +0100 Subject: genirq: Remove __do_IRQ All architectures are finally converted. Remove the cruft. Signed-off-by: Thomas Gleixner Cc: Richard Henderson Cc: Mike Frysinger Cc: David Howells Cc: Tony Luck Cc: Greg Ungerer Cc: Michal Simek Acked-by: David Howells Cc: Kyle McMartin Acked-by: Benjamin Herrenschmidt Cc: Chen Liqin Cc: "David S. Miller" Cc: Chris Metcalf Cc: Jeff Dike --- Documentation/feature-removal-schedule.txt | 8 --- arch/alpha/Kconfig | 3 - arch/blackfin/Kconfig | 3 - arch/frv/Kconfig | 4 -- arch/ia64/Kconfig | 3 - arch/m68knommu/Kconfig | 4 -- arch/microblaze/Kconfig | 3 - arch/mips/Kconfig | 3 - arch/mn10300/Kconfig | 3 - arch/parisc/Kconfig | 4 -- arch/powerpc/Kconfig | 4 -- arch/score/Kconfig | 3 - arch/sparc/Kconfig | 4 -- arch/tile/Kconfig | 3 - arch/um/Kconfig.um | 3 - include/linux/irqdesc.h | 14 ---- kernel/irq/Kconfig | 3 - kernel/irq/handle.c | 111 ----------------------------- 18 files changed, 183 deletions(-) (limited to 'Documentation') diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt index 8c594c45b6a1..b959659c5df4 100644 --- a/Documentation/feature-removal-schedule.txt +++ b/Documentation/feature-removal-schedule.txt @@ -357,14 +357,6 @@ Who: Dave Jones , Matthew Garrett ----------------------------- -What: __do_IRQ all in one fits nothing interrupt handler -When: 2.6.32 -Why: __do_IRQ was kept for easy migration to the type flow handlers. - More than two years of migration time is enough. -Who: Thomas Gleixner - ------------------------------ - What: fakephp and associated sysfs files in /sys/bus/pci/slots/ When: 2011 Why: In 2.6.27, the semantics of /sys/bus/pci/slots was redefined to diff --git a/arch/alpha/Kconfig b/arch/alpha/Kconfig index fc95ee1bcf6f..943fe6930f77 100644 --- a/arch/alpha/Kconfig +++ b/arch/alpha/Kconfig @@ -68,9 +68,6 @@ config GENERIC_IOMAP bool default n -config GENERIC_HARDIRQS_NO__DO_IRQ - def_bool y - config GENERIC_HARDIRQS bool default y diff --git a/arch/blackfin/Kconfig b/arch/blackfin/Kconfig index 0a221d48152d..a37b2be23f18 100644 --- a/arch/blackfin/Kconfig +++ b/arch/blackfin/Kconfig @@ -50,9 +50,6 @@ config GENERIC_HARDIRQS config GENERIC_IRQ_PROBE def_bool y -config GENERIC_HARDIRQS_NO__DO_IRQ - def_bool y - config GENERIC_GPIO def_bool y diff --git a/arch/frv/Kconfig b/arch/frv/Kconfig index f6bcb039cd6d..e504edeb3d84 100644 --- a/arch/frv/Kconfig +++ b/arch/frv/Kconfig @@ -33,10 +33,6 @@ config GENERIC_HARDIRQS bool default y -config GENERIC_HARDIRQS_NO__DO_IRQ - bool - default y - config TIME_LOW_RES bool default y diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig index e0f5b6d7f849..be1faf991813 100644 --- a/arch/ia64/Kconfig +++ b/arch/ia64/Kconfig @@ -684,9 +684,6 @@ source "lib/Kconfig" config GENERIC_HARDIRQS def_bool y -config GENERIC_HARDIRQS_NO__DO_IRQ - def_bool y - config GENERIC_IRQ_PROBE bool default y diff --git a/arch/m68knommu/Kconfig b/arch/m68knommu/Kconfig index 704e7b92334c..7379cb0ce1af 100644 --- a/arch/m68knommu/Kconfig +++ b/arch/m68knommu/Kconfig @@ -52,10 +52,6 @@ config GENERIC_HARDIRQS bool default y -config GENERIC_HARDIRQS_NO__DO_IRQ - bool - default y - config GENERIC_CALIBRATE_DELAY bool default y diff --git a/arch/microblaze/Kconfig b/arch/microblaze/Kconfig index 5f5018a71a3d..5a6378493797 100644 --- a/arch/microblaze/Kconfig +++ b/arch/microblaze/Kconfig @@ -52,9 +52,6 @@ config GENERIC_TIME_VSYSCALL config GENERIC_CLOCKEVENTS def_bool y -config GENERIC_HARDIRQS_NO__DO_IRQ - def_bool y - config GENERIC_GPIO def_bool y diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig index 548e6cc3bc28..f5ecc0566bc2 100644 --- a/arch/mips/Kconfig +++ b/arch/mips/Kconfig @@ -793,9 +793,6 @@ config SCHED_OMIT_FRAME_POINTER bool default y -config GENERIC_HARDIRQS_NO__DO_IRQ - def_bool y - # # Select some configuration options automatically based on user selections. # diff --git a/arch/mn10300/Kconfig b/arch/mn10300/Kconfig index 8ed41cf2b08d..4638269152f6 100644 --- a/arch/mn10300/Kconfig +++ b/arch/mn10300/Kconfig @@ -34,9 +34,6 @@ config RWSEM_GENERIC_SPINLOCK config RWSEM_XCHGADD_ALGORITHM bool -config GENERIC_HARDIRQS_NO__DO_IRQ - def_bool y - config GENERIC_CALIBRATE_DELAY def_bool y diff --git a/arch/parisc/Kconfig b/arch/parisc/Kconfig index 0888675c98dd..4b94ac4dc603 100644 --- a/arch/parisc/Kconfig +++ b/arch/parisc/Kconfig @@ -12,7 +12,6 @@ config PARISC select HAVE_IRQ_WORK select HAVE_PERF_EVENTS select GENERIC_ATOMIC64 if !64BIT - select GENERIC_HARDIRQS_NO__DO_IRQ help The PA-RISC microprocessor is designed by Hewlett-Packard and used in many of their workstations & servers (HP9000 700 and 800 series, @@ -79,9 +78,6 @@ config IRQ_PER_CPU bool default y -config GENERIC_HARDIRQS_NO__DO_IRQ - def_bool y - # unless you want to implement ACPI on PA-RISC ... ;-) config PM bool diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 959f38ccb9a7..e0b185dd718a 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -40,10 +40,6 @@ config GENERIC_HARDIRQS bool default y -config GENERIC_HARDIRQS_NO__DO_IRQ - bool - default y - config HAVE_SETUP_PER_CPU_AREA def_bool PPC64 diff --git a/arch/score/Kconfig b/arch/score/Kconfig index 4293fdcb5398..4f159acfbe33 100644 --- a/arch/score/Kconfig +++ b/arch/score/Kconfig @@ -53,9 +53,6 @@ config GENERIC_CLOCKEVENTS config SCHED_NO_NO_OMIT_FRAME_POINTER def_bool y -config GENERIC_HARDIRQS_NO__DO_IRQ - def_bool y - config GENERIC_SYSCALL_TABLE def_bool y diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig index 45d9c87d083a..989bb6415ea3 100644 --- a/arch/sparc/Kconfig +++ b/arch/sparc/Kconfig @@ -107,10 +107,6 @@ config NEED_PER_CPU_EMBED_FIRST_CHUNK config NEED_PER_CPU_PAGE_FIRST_CHUNK def_bool y if SPARC64 -config GENERIC_HARDIRQS_NO__DO_IRQ - bool - def_bool y if SPARC64 - config MMU bool default y diff --git a/arch/tile/Kconfig b/arch/tile/Kconfig index 4e8b82bca9e5..c16b98c2435d 100644 --- a/arch/tile/Kconfig +++ b/arch/tile/Kconfig @@ -10,9 +10,6 @@ config GENERIC_CSUM config GENERIC_HARDIRQS def_bool y -config GENERIC_HARDIRQS_NO__DO_IRQ - def_bool y - config GENERIC_IRQ_PROBE def_bool y diff --git a/arch/um/Kconfig.um b/arch/um/Kconfig.um index f8d1d0d47fe6..90a438acbfaf 100644 --- a/arch/um/Kconfig.um +++ b/arch/um/Kconfig.um @@ -120,9 +120,6 @@ config SMP If you don't know what to do, say N. -config GENERIC_HARDIRQS_NO__DO_IRQ - def_bool y - config NR_CPUS int "Maximum number of CPUs (2-32)" range 2 32 diff --git a/include/linux/irqdesc.h b/include/linux/irqdesc.h index 6a64c6fa81af..c1a95b7b58de 100644 --- a/include/linux/irqdesc.h +++ b/include/linux/irqdesc.h @@ -100,13 +100,6 @@ static inline struct irq_desc *move_irq_desc(struct irq_desc *desc, int node) #define get_irq_desc_data(desc) ((desc)->irq_data.handler_data) #define get_irq_desc_msi(desc) ((desc)->irq_data.msi_desc) -/* - * Monolithic do_IRQ implementation. - */ -#ifndef CONFIG_GENERIC_HARDIRQS_NO__DO_IRQ -extern unsigned int __do_IRQ(unsigned int irq); -#endif - /* * Architectures call this to let the generic IRQ layer * handle an interrupt. If the descriptor is attached to an @@ -115,14 +108,7 @@ extern unsigned int __do_IRQ(unsigned int irq); */ static inline void generic_handle_irq_desc(unsigned int irq, struct irq_desc *desc) { -#ifdef CONFIG_GENERIC_HARDIRQS_NO__DO_IRQ desc->handle_irq(irq, desc); -#else - if (likely(desc->handle_irq)) - desc->handle_irq(irq, desc); - else - __do_IRQ(irq); -#endif } static inline void generic_handle_irq(unsigned int irq) diff --git a/kernel/irq/Kconfig b/kernel/irq/Kconfig index 31d766bf5d2e..8e42fec7686d 100644 --- a/kernel/irq/Kconfig +++ b/kernel/irq/Kconfig @@ -9,9 +9,6 @@ menu "IRQ subsystem" config GENERIC_HARDIRQS def_bool y -config GENERIC_HARDIRQS_NO__DO_IRQ - def_bool y - # Select this to disable the deprecated stuff config GENERIC_HARDIRQS_NO_DEPRECATED def_bool n diff --git a/kernel/irq/handle.c b/kernel/irq/handle.c index e2347eb63306..3540a7190122 100644 --- a/kernel/irq/handle.c +++ b/kernel/irq/handle.c @@ -118,114 +118,3 @@ irqreturn_t handle_IRQ_event(unsigned int irq, struct irqaction *action) return retval; } - -#ifndef CONFIG_GENERIC_HARDIRQS_NO__DO_IRQ - -#ifdef CONFIG_ENABLE_WARN_DEPRECATED -# warning __do_IRQ is deprecated. Please convert to proper flow handlers -#endif - -/** - * __do_IRQ - original all in one highlevel IRQ handler - * @irq: the interrupt number - * - * __do_IRQ handles all normal device IRQ's (the special - * SMP cross-CPU interrupts have their own specific - * handlers). - * - * This is the original x86 implementation which is used for every - * interrupt type. - */ -unsigned int __do_IRQ(unsigned int irq) -{ - struct irq_desc *desc = irq_to_desc(irq); - struct irqaction *action; - unsigned int status; - - kstat_incr_irqs_this_cpu(irq, desc); - - if (CHECK_IRQ_PER_CPU(desc->status)) { - irqreturn_t action_ret; - - /* - * No locking required for CPU-local interrupts: - */ - if (desc->irq_data.chip->ack) - desc->irq_data.chip->ack(irq); - if (likely(!(desc->status & IRQ_DISABLED))) { - action_ret = handle_IRQ_event(irq, desc->action); - if (!noirqdebug) - note_interrupt(irq, desc, action_ret); - } - desc->irq_data.chip->end(irq); - return 1; - } - - raw_spin_lock(&desc->lock); - if (desc->irq_data.chip->ack) - desc->irq_data.chip->ack(irq); - /* - * REPLAY is when Linux resends an IRQ that was dropped earlier - * WAITING is used by probe to mark irqs that are being tested - */ - status = desc->status & ~(IRQ_REPLAY | IRQ_WAITING); - status |= IRQ_PENDING; /* we _want_ to handle it */ - - /* - * If the IRQ is disabled for whatever reason, we cannot - * use the action we have. - */ - action = NULL; - if (likely(!(status & (IRQ_DISABLED | IRQ_INPROGRESS)))) { - action = desc->action; - status &= ~IRQ_PENDING; /* we commit to handling */ - status |= IRQ_INPROGRESS; /* we are handling it */ - } - desc->status = status; - - /* - * If there is no IRQ handler or it was disabled, exit early. - * Since we set PENDING, if another processor is handling - * a different instance of this same irq, the other processor - * will take care of it. - */ - if (unlikely(!action)) - goto out; - - /* - * Edge triggered interrupts need to remember - * pending events. - * This applies to any hw interrupts that allow a second - * instance of the same irq to arrive while we are in do_IRQ - * or in the handler. But the code here only handles the _second_ - * instance of the irq, not the third or fourth. So it is mostly - * useful for irq hardware that does not mask cleanly in an - * SMP environment. - */ - for (;;) { - irqreturn_t action_ret; - - raw_spin_unlock(&desc->lock); - - action_ret = handle_IRQ_event(irq, action); - if (!noirqdebug) - note_interrupt(irq, desc, action_ret); - - raw_spin_lock(&desc->lock); - if (likely(!(desc->status & IRQ_PENDING))) - break; - desc->status &= ~IRQ_PENDING; - } - desc->status &= ~IRQ_INPROGRESS; - -out: - /* - * The ->end() handler has to deal with interrupts which got - * disabled while the handler was running. - */ - desc->irq_data.chip->end(irq); - raw_spin_unlock(&desc->lock); - - return 1; -} -#endif -- cgit v1.2.3 From a1d6906e2d2b4655e248f490ab088c27876a600a Mon Sep 17 00:00:00 2001 From: David Henningsson Date: Fri, 21 Jan 2011 13:33:28 +0100 Subject: ALSA: HDA: Add a new model "asus" for Conexant 5066/205xx BugLink: http://bugs.launchpad.net/bugs/701271 This new model, named "asus", is identical to the "hp_laptop" model, except for the location of the internal mic, which is at pin 0x1a. It is used for Asus K52JU and Lenovo G560. Signed-off-by: David Henningsson Signed-off-by: Takashi Iwai --- Documentation/sound/alsa/HD-Audio-Models.txt | 1 + sound/pci/hda/patch_conexant.c | 24 +++++++++++++++++++++++- 2 files changed, 24 insertions(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/sound/alsa/HD-Audio-Models.txt b/Documentation/sound/alsa/HD-Audio-Models.txt index 16ae4300c747..0caf77e59be4 100644 --- a/Documentation/sound/alsa/HD-Audio-Models.txt +++ b/Documentation/sound/alsa/HD-Audio-Models.txt @@ -296,6 +296,7 @@ Conexant 5066 ============= laptop Basic Laptop config (default) hp-laptop HP laptops, e g G60 + asus Asus K52JU, Lenovo G560 dell-laptop Dell laptops dell-vostro Dell Vostro olpc-xo-1_5 OLPC XO 1.5 diff --git a/sound/pci/hda/patch_conexant.c b/sound/pci/hda/patch_conexant.c index 7cd59b9f0e97..19f0daf6497d 100644 --- a/sound/pci/hda/patch_conexant.c +++ b/sound/pci/hda/patch_conexant.c @@ -127,6 +127,7 @@ struct conexant_spec { unsigned int ideapad:1; unsigned int thinkpad:1; unsigned int hp_laptop:1; + unsigned int asus:1; unsigned int ext_mic_present; unsigned int recording; @@ -2312,6 +2313,19 @@ static void cxt5066_ideapad_automic(struct hda_codec *codec) } } + +/* toggle input of built-in digital mic and mic jack appropriately */ +static void cxt5066_asus_automic(struct hda_codec *codec) +{ + unsigned int present; + + present = snd_hda_jack_detect(codec, 0x1b); + snd_printdd("CXT5066: external microphone present=%d\n", present); + snd_hda_codec_write(codec, 0x17, 0, AC_VERB_SET_CONNECT_SEL, + present ? 1 : 0); +} + + /* toggle input of built-in digital mic and mic jack appropriately */ static void cxt5066_hp_laptop_automic(struct hda_codec *codec) { @@ -2400,6 +2414,8 @@ static void cxt5066_automic(struct hda_codec *codec) cxt5066_thinkpad_automic(codec); else if (spec->hp_laptop) cxt5066_hp_laptop_automic(codec); + else if (spec->asus) + cxt5066_asus_automic(codec); } /* unsolicited event for jack sensing */ @@ -3045,6 +3061,7 @@ enum { CXT5066_DELL_VOSTRO, /* Dell Vostro 1015i */ CXT5066_IDEAPAD, /* Lenovo IdeaPad U150 */ CXT5066_THINKPAD, /* Lenovo ThinkPad T410s, others? */ + CXT5066_ASUS, /* Asus K52JU, Lenovo G560 - Int mic at 0x1a and Ext mic at 0x1b */ CXT5066_HP_LAPTOP, /* HP Laptop */ CXT5066_MODELS }; @@ -3056,6 +3073,7 @@ static const char * const cxt5066_models[CXT5066_MODELS] = { [CXT5066_DELL_VOSTRO] = "dell-vostro", [CXT5066_IDEAPAD] = "ideapad", [CXT5066_THINKPAD] = "thinkpad", + [CXT5066_ASUS] = "asus", [CXT5066_HP_LAPTOP] = "hp-laptop", }; @@ -3068,6 +3086,7 @@ static struct snd_pci_quirk cxt5066_cfg_tbl[] = { SND_PCI_QUIRK(0x1028, 0x0408, "Dell Inspiron One 19T", CXT5066_IDEAPAD), SND_PCI_QUIRK(0x103c, 0x360b, "HP G60", CXT5066_HP_LAPTOP), SND_PCI_QUIRK(0x1043, 0x13f3, "Asus A52J", CXT5066_HP_LAPTOP), + SND_PCI_QUIRK(0x1043, 0x1643, "Asus K52JU", CXT5066_ASUS), SND_PCI_QUIRK(0x1179, 0xff1e, "Toshiba Satellite C650D", CXT5066_IDEAPAD), SND_PCI_QUIRK(0x1179, 0xff50, "Toshiba Satellite P500-PSPGSC-01800T", CXT5066_OLPC_XO_1_5), SND_PCI_QUIRK(0x1179, 0xffe0, "Toshiba Satellite Pro T130-15F", CXT5066_OLPC_XO_1_5), @@ -3077,6 +3096,7 @@ static struct snd_pci_quirk cxt5066_cfg_tbl[] = { SND_PCI_QUIRK(0x17aa, 0x20f2, "Lenovo T400s", CXT5066_THINKPAD), SND_PCI_QUIRK(0x17aa, 0x21c5, "Thinkpad Edge 13", CXT5066_THINKPAD), SND_PCI_QUIRK(0x17aa, 0x215e, "Lenovo Thinkpad", CXT5066_THINKPAD), + SND_PCI_QUIRK(0x17aa, 0x38af, "Lenovo G560", CXT5066_ASUS), SND_PCI_QUIRK_VENDOR(0x17aa, "Lenovo", CXT5066_IDEAPAD), /* Fallback for Lenovos without dock mic */ {} }; @@ -3132,13 +3152,15 @@ static int patch_cxt5066(struct hda_codec *codec) spec->num_init_verbs++; spec->dell_automute = 1; break; + case CXT5066_ASUS: case CXT5066_HP_LAPTOP: codec->patch_ops.init = cxt5066_init; codec->patch_ops.unsol_event = cxt5066_unsol_event; spec->init_verbs[spec->num_init_verbs] = cxt5066_init_verbs_hp_laptop; spec->num_init_verbs++; - spec->hp_laptop = 1; + spec->hp_laptop = board_config == CXT5066_HP_LAPTOP; + spec->asus = board_config == CXT5066_ASUS; spec->mixers[spec->num_mixers++] = cxt5066_mixer_master; spec->mixers[spec->num_mixers++] = cxt5066_mixers; /* no S/PDIF out */ -- cgit v1.2.3 From fcf285643433c6f6663182e9f4b3c9169fa22e3e Mon Sep 17 00:00:00 2001 From: Randy Dunlap Date: Sat, 22 Jan 2011 19:50:03 -0800 Subject: docbook: fix broken serial to tty/serial movement Fix move of drivers/serial/ to drivers/tty/, where it broke one of the docbook files: docproc: drivers/serial/serial_core.c: No such file or directory Signed-off-by: Randy Dunlap Signed-off-by: Linus Torvalds --- Documentation/DocBook/device-drivers.tmpl | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'Documentation') diff --git a/Documentation/DocBook/device-drivers.tmpl b/Documentation/DocBook/device-drivers.tmpl index 35447e081736..36f63d4a0a06 100644 --- a/Documentation/DocBook/device-drivers.tmpl +++ b/Documentation/DocBook/device-drivers.tmpl @@ -217,8 +217,8 @@ X!Isound/sound_firmware.c 16x50 UART Driver !Iinclude/linux/serial_core.h -!Edrivers/serial/serial_core.c -!Edrivers/serial/8250.c +!Edrivers/tty/serial/serial_core.c +!Edrivers/tty/serial/8250.c -- cgit v1.2.3 From 3a5655a5b545e9647c3437473ee3d815fe1b9050 Mon Sep 17 00:00:00 2001 From: Marc Kleine-Budde Date: Mon, 10 Jan 2011 20:44:22 +0100 Subject: can: at91_can: make can_id of mailbox 0 configurable Due to a chip bug (errata 50.2.6.3 & 50.3.5.3 in "AT91SAM9263 Preliminary 6249H-ATARM-27-Jul-09") the contents of mailbox 0 may be send under certain conditions (even if disabled or in rx mode). The workaround in the errata suggests not to use the mailbox and load it with an unused identifier. This patch implements the second part of the workaround. A sysfs entry "mb0_id" is introduced. While the interface is down it can be used to configure the can_id of mailbox 0. The default value id 0x7ff. In order to use an extended can_id add the CAN_EFF_FLAG (0x80000000U) to the can_id. Example: - standard id 0x7ff: echo 0x7ff > /sys/class/net/can0/mb0_id - extended id 0x1fffffff: echo 0x9fffffff > /sys/class/net/can0/mb0_id Signed-off-by: Marc Kleine-Budde Acked-by: Wolfgang Grandegger Acked-by: Kurt Van Dijck For the Documentation-part: Acked-by: Wolfram Sang --- Documentation/ABI/testing/sysfs-platform-at91 | 25 ++++++++ drivers/net/can/at91_can.c | 90 ++++++++++++++++++++++++--- 2 files changed, 108 insertions(+), 7 deletions(-) create mode 100644 Documentation/ABI/testing/sysfs-platform-at91 (limited to 'Documentation') diff --git a/Documentation/ABI/testing/sysfs-platform-at91 b/Documentation/ABI/testing/sysfs-platform-at91 new file mode 100644 index 000000000000..4cc6a865ae66 --- /dev/null +++ b/Documentation/ABI/testing/sysfs-platform-at91 @@ -0,0 +1,25 @@ +What: /sys/devices/platform/at91_can/net//mb0_id +Date: January 2011 +KernelVersion: 2.6.38 +Contact: Marc Kleine-Budde +Description: + Value representing the can_id of mailbox 0. + + Default: 0x7ff (standard frame) + + Due to a chip bug (errata 50.2.6.3 & 50.3.5.3 in + "AT91SAM9263 Preliminary 6249H-ATARM-27-Jul-09") the + contents of mailbox 0 may be send under certain + conditions (even if disabled or in rx mode). + + The workaround in the errata suggests not to use the + mailbox and load it with an unused identifier. + + In order to use an extended can_id add the + CAN_EFF_FLAG (0x80000000U) to the can_id. Example: + + - standard id 0x7ff: + echo 0x7ff > /sys/class/net/can0/mb0_id + + - extended id 0x1fffffff: + echo 0x9fffffff > /sys/class/net/can0/mb0_id diff --git a/drivers/net/can/at91_can.c b/drivers/net/can/at91_can.c index 16e45a51cbb3..2532b9631538 100644 --- a/drivers/net/can/at91_can.c +++ b/drivers/net/can/at91_can.c @@ -30,6 +30,7 @@ #include #include #include +#include #include #include #include @@ -169,6 +170,8 @@ struct at91_priv { struct clk *clk; struct at91_can_data *pdata; + + canid_t mb0_id; }; static struct can_bittiming_const at91_bittiming_const = { @@ -221,6 +224,18 @@ static inline void set_mb_mode(const struct at91_priv *priv, unsigned int mb, set_mb_mode_prio(priv, mb, mode, 0); } +static inline u32 at91_can_id_to_reg_mid(canid_t can_id) +{ + u32 reg_mid; + + if (can_id & CAN_EFF_FLAG) + reg_mid = (can_id & CAN_EFF_MASK) | AT91_MID_MIDE; + else + reg_mid = (can_id & CAN_SFF_MASK) << 18; + + return reg_mid; +} + /* * Swtich transceiver on or off */ @@ -234,6 +249,7 @@ static void at91_setup_mailboxes(struct net_device *dev) { struct at91_priv *priv = netdev_priv(dev); unsigned int i; + u32 reg_mid; /* * Due to a chip bug (errata 50.2.6.3 & 50.3.5.3) the first @@ -242,8 +258,13 @@ static void at91_setup_mailboxes(struct net_device *dev) * overwrite option. The overwrite flag indicates a FIFO * overflow. */ - for (i = 0; i < AT91_MB_RX_FIRST; i++) + reg_mid = at91_can_id_to_reg_mid(priv->mb0_id); + for (i = 0; i < AT91_MB_RX_FIRST; i++) { set_mb_mode(priv, i, AT91_MB_MODE_DISABLED); + at91_write(priv, AT91_MID(i), reg_mid); + at91_write(priv, AT91_MCR(i), 0x0); /* clear dlc */ + } + for (i = AT91_MB_RX_FIRST; i < AT91_MB_RX_LAST; i++) set_mb_mode(priv, i, AT91_MB_MODE_RX); set_mb_mode(priv, AT91_MB_RX_LAST, AT91_MB_MODE_RX_OVRWR); @@ -378,12 +399,7 @@ static netdev_tx_t at91_start_xmit(struct sk_buff *skb, struct net_device *dev) netdev_err(dev, "BUG! TX buffer full when queue awake!\n"); return NETDEV_TX_BUSY; } - - if (cf->can_id & CAN_EFF_FLAG) - reg_mid = (cf->can_id & CAN_EFF_MASK) | AT91_MID_MIDE; - else - reg_mid = (cf->can_id & CAN_SFF_MASK) << 18; - + reg_mid = at91_can_id_to_reg_mid(cf->can_id); reg_mcr = ((cf->can_id & CAN_RTR_FLAG) ? AT91_MCR_MRTR : 0) | (cf->can_dlc << 16) | AT91_MCR_MTCR; @@ -1047,6 +1063,64 @@ static const struct net_device_ops at91_netdev_ops = { .ndo_start_xmit = at91_start_xmit, }; +static ssize_t at91_sysfs_show_mb0_id(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct at91_priv *priv = netdev_priv(to_net_dev(dev)); + + if (priv->mb0_id & CAN_EFF_FLAG) + return snprintf(buf, PAGE_SIZE, "0x%08x\n", priv->mb0_id); + else + return snprintf(buf, PAGE_SIZE, "0x%03x\n", priv->mb0_id); +} + +static ssize_t at91_sysfs_set_mb0_id(struct device *dev, + struct device_attribute *attr, const char *buf, size_t count) +{ + struct net_device *ndev = to_net_dev(dev); + struct at91_priv *priv = netdev_priv(ndev); + unsigned long can_id; + ssize_t ret; + int err; + + rtnl_lock(); + + if (ndev->flags & IFF_UP) { + ret = -EBUSY; + goto out; + } + + err = strict_strtoul(buf, 0, &can_id); + if (err) { + ret = err; + goto out; + } + + if (can_id & CAN_EFF_FLAG) + can_id &= CAN_EFF_MASK | CAN_EFF_FLAG; + else + can_id &= CAN_SFF_MASK; + + priv->mb0_id = can_id; + ret = count; + + out: + rtnl_unlock(); + return ret; +} + +static DEVICE_ATTR(mb0_id, S_IWUGO | S_IRUGO, + at91_sysfs_show_mb0_id, at91_sysfs_set_mb0_id); + +static struct attribute *at91_sysfs_attrs[] = { + &dev_attr_mb0_id.attr, + NULL, +}; + +static struct attribute_group at91_sysfs_attr_group = { + .attrs = at91_sysfs_attrs, +}; + static int __devinit at91_can_probe(struct platform_device *pdev) { struct net_device *dev; @@ -1092,6 +1166,7 @@ static int __devinit at91_can_probe(struct platform_device *pdev) dev->netdev_ops = &at91_netdev_ops; dev->irq = irq; dev->flags |= IFF_ECHO; + dev->sysfs_groups[0] = &at91_sysfs_attr_group; priv = netdev_priv(dev); priv->can.clock.freq = clk_get_rate(clk); @@ -1103,6 +1178,7 @@ static int __devinit at91_can_probe(struct platform_device *pdev) priv->dev = dev; priv->clk = clk; priv->pdata = pdev->dev.platform_data; + priv->mb0_id = 0x7ff; netif_napi_add(dev, &priv->napi, at91_poll, AT91_NAPI_WEIGHT); -- cgit v1.2.3 From de221bd5eb5e754806fcc39c40bb12b96515d9c5 Mon Sep 17 00:00:00 2001 From: Nicolas de Pesloüan Date: Mon, 24 Jan 2011 13:21:37 +0000 Subject: bonding: update documentation - alternate configuration. MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The bonding documentation used to provide configuration details and examples for initscripts and sysconfig only. This patch describe the third possible configuration: /etc/network/interfaces. Signed-off-by: Nicolas de Pesloüan Signed-off-by: David S. Miller --- Documentation/networking/bonding.txt | 83 ++++++++++++++++++++++++++++++------ 1 file changed, 71 insertions(+), 12 deletions(-) (limited to 'Documentation') diff --git a/Documentation/networking/bonding.txt b/Documentation/networking/bonding.txt index 5dc638791d97..25d2f4141d27 100644 --- a/Documentation/networking/bonding.txt +++ b/Documentation/networking/bonding.txt @@ -49,7 +49,8 @@ Table of Contents 3.3 Configuring Bonding Manually with Ifenslave 3.3.1 Configuring Multiple Bonds Manually 3.4 Configuring Bonding Manually via Sysfs -3.5 Overriding Configuration for Special Cases +3.5 Configuration with Interfaces Support +3.6 Overriding Configuration for Special Cases 4. Querying Bonding Configuration 4.1 Bonding Configuration @@ -161,8 +162,8 @@ onwards) do not have /usr/include/linux symbolically linked to the default kernel source include directory. SECOND IMPORTANT NOTE: - If you plan to configure bonding using sysfs, you do not need -to use ifenslave. + If you plan to configure bonding using sysfs or using the +/etc/network/interfaces file, you do not need to use ifenslave. 2. Bonding Driver Options ========================= @@ -779,22 +780,26 @@ resend_igmp You can configure bonding using either your distro's network initialization scripts, or manually using either ifenslave or the -sysfs interface. Distros generally use one of two packages for the -network initialization scripts: initscripts or sysconfig. Recent -versions of these packages have support for bonding, while older +sysfs interface. Distros generally use one of three packages for the +network initialization scripts: initscripts, sysconfig or interfaces. +Recent versions of these packages have support for bonding, while older versions do not. We will first describe the options for configuring bonding for -distros using versions of initscripts and sysconfig with full or -partial support for bonding, then provide information on enabling +distros using versions of initscripts, sysconfig and interfaces with full +or partial support for bonding, then provide information on enabling bonding without support from the network initialization scripts (i.e., older versions of initscripts or sysconfig). - If you're unsure whether your distro uses sysconfig or -initscripts, or don't know if it's new enough, have no fear. + If you're unsure whether your distro uses sysconfig, +initscripts or interfaces, or don't know if it's new enough, have no fear. Determining this is fairly straightforward. - First, issue the command: + First, look for a file called interfaces in /etc/network directory. +If this file is present in your system, then your system use interfaces. See +Configuration with Interfaces Support. + + Else, issue the command: $ rpm -qf /sbin/ifup @@ -1327,8 +1332,62 @@ echo 2000 > /sys/class/net/bond1/bonding/arp_interval echo +eth2 > /sys/class/net/bond1/bonding/slaves echo +eth3 > /sys/class/net/bond1/bonding/slaves -3.5 Overriding Configuration for Special Cases +3.5 Configuration with Interfaces Support +----------------------------------------- + + This section applies to distros which use /etc/network/interfaces file +to describe network interface configuration, most notably Debian and it's +derivatives. + + The ifup and ifdown commands on Debian don't support bonding out of +the box. The ifenslave-2.6 package should be installed to provide bonding +support. Once installed, this package will provide bond-* options to be used +into /etc/network/interfaces. + + Note that ifenslave-2.6 package will load the bonding module and use +the ifenslave command when appropriate. + +Example Configurations +---------------------- + +In /etc/network/interfaces, the following stanza will configure bond0, in +active-backup mode, with eth0 and eth1 as slaves. + +auto bond0 +iface bond0 inet dhcp + bond-slaves eth0 eth1 + bond-mode active-backup + bond-miimon 100 + bond-primary eth0 eth1 + +If the above configuration doesn't work, you might have a system using +upstart for system startup. This is most notably true for recent +Ubuntu versions. The following stanza in /etc/network/interfaces will +produce the same result on those systems. + +auto bond0 +iface bond0 inet dhcp + bond-slaves none + bond-mode active-backup + bond-miimon 100 + +auto eth0 +iface eth0 inet manual + bond-master bond0 + bond-primary eth0 eth1 + +auto eth1 +iface eth1 inet manual + bond-master bond0 + bond-primary eth0 eth1 + +For a full list of bond-* supported options in /etc/network/interfaces and some +more advanced examples tailored to you particular distros, see the files in +/usr/share/doc/ifenslave-2.6. + +3.6 Overriding Configuration for Special Cases ---------------------------------------------- + When using the bonding driver, the physical port which transmits a frame is typically selected by the bonding driver, and is not relevant to the user or system administrator. The output port is simply selected using the policies of -- cgit v1.2.3 From 9cfe268ec4f76b7636687ccbe99be75e00085f32 Mon Sep 17 00:00:00 2001 From: Alan Cox Date: Tue, 25 Jan 2011 14:18:38 +0000 Subject: Documentation: Fix kernel parameter ordering A B C D E ... Signed-off-by: Alan Cox Signed-off-by: Linus Torvalds --- Documentation/kernel-parameters.txt | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'Documentation') diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index b72e071a3e5b..89835a4766a6 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -43,11 +43,11 @@ parameter is applicable: AVR32 AVR32 architecture is enabled. AX25 Appropriate AX.25 support is enabled. BLACKFIN Blackfin architecture is enabled. + DRM Direct Rendering Management support is enabled. + DYNAMIC_DEBUG Build in debug messages and enable them at runtime EDD BIOS Enhanced Disk Drive Services (EDD) is enabled EFI EFI Partitioning (GPT) is enabled EIDE EIDE/ATAPI support is enabled. - DRM Direct Rendering Management support is enabled. - DYNAMIC_DEBUG Build in debug messages and enable them at runtime FB The frame buffer device is enabled. GCOV GCOV profiling is enabled. HW Appropriate hardware is enabled. -- cgit v1.2.3 From af5eb745efe97d91d2cbe793029838b3311c15da Mon Sep 17 00:00:00 2001 From: Anton Altaparmakov Date: Fri, 28 Jan 2011 20:45:28 +0000 Subject: NTFS: Fix invalid pointer dereference in ntfs_mft_record_alloc(). In ntfs_mft_record_alloc() when mapping the new extent mft record with map_extent_mft_record() we overwrite @m with the return value and on error, we then try to use the old @m but that is no longer there as @m now contains an error code instead so we crash when dereferencing the error code as if it were a pointer. The simple fix is to use a temporary variable to store the return value thus preserving the original @m for later use. This is a backport from the commercial Tuxera-NTFS driver and is well tested... Thanks go to Julia Lawall for pointing this out (whilst I had fixed it in the commercial driver I had failed to fix it in the Linux kernel). Signed-off-by: Anton Altaparmakov Signed-off-by: Linus Torvalds --- Documentation/filesystems/ntfs.txt | 2 ++ fs/ntfs/mft.c | 11 +++++++---- 2 files changed, 9 insertions(+), 4 deletions(-) (limited to 'Documentation') diff --git a/Documentation/filesystems/ntfs.txt b/Documentation/filesystems/ntfs.txt index 6ef8cf3bc9a3..933bc66ccff1 100644 --- a/Documentation/filesystems/ntfs.txt +++ b/Documentation/filesystems/ntfs.txt @@ -460,6 +460,8 @@ Note, a technical ChangeLog aimed at kernel hackers is in fs/ntfs/ChangeLog. 2.1.30: - Fix writev() (it kept writing the first segment over and over again instead of moving onto subsequent segments). + - Fix crash in ntfs_mft_record_alloc() when mapping the new extent mft + record failed. 2.1.29: - Fix a deadlock when mounting read-write. 2.1.28: diff --git a/fs/ntfs/mft.c b/fs/ntfs/mft.c index b572b6727181..326e7475a22a 100644 --- a/fs/ntfs/mft.c +++ b/fs/ntfs/mft.c @@ -1,7 +1,7 @@ /** * mft.c - NTFS kernel mft record operations. Part of the Linux-NTFS project. * - * Copyright (c) 2001-2006 Anton Altaparmakov + * Copyright (c) 2001-2011 Anton Altaparmakov and Tuxera Inc. * Copyright (c) 2002 Richard Russon * * This program/include file is free software; you can redistribute it and/or @@ -2576,6 +2576,8 @@ mft_rec_already_initialized: flush_dcache_page(page); SetPageUptodate(page); if (base_ni) { + MFT_RECORD *m_tmp; + /* * Setup the base mft record in the extent mft record. This * completes initialization of the allocated extent mft record @@ -2588,11 +2590,11 @@ mft_rec_already_initialized: * attach it to the base inode @base_ni and map, pin, and lock * its, i.e. the allocated, mft record. */ - m = map_extent_mft_record(base_ni, bit, &ni); - if (IS_ERR(m)) { + m_tmp = map_extent_mft_record(base_ni, bit, &ni); + if (IS_ERR(m_tmp)) { ntfs_error(vol->sb, "Failed to map allocated extent " "mft record 0x%llx.", (long long)bit); - err = PTR_ERR(m); + err = PTR_ERR(m_tmp); /* Set the mft record itself not in use. */ m->flags &= cpu_to_le16( ~le16_to_cpu(MFT_RECORD_IN_USE)); @@ -2603,6 +2605,7 @@ mft_rec_already_initialized: ntfs_unmap_page(page); goto undo_mftbmp_alloc; } + BUG_ON(m != m_tmp); /* * Make sure the allocated mft record is written out to disk. * No need to set the inode dirty because the caller is going -- cgit v1.2.3 From d524dac9279b6a41ffdf7ff7958c577f2e387db6 Mon Sep 17 00:00:00 2001 From: Grant Likely Date: Wed, 26 Jan 2011 10:10:40 -0700 Subject: dt: Move device tree documentation out of powerpc directory The device tree is used by more than just PowerPC. Make the documentation directory available to all. v2: reorganized files while moving to create arch and driver specific directories. Signed-off-by: Grant Likely Acked-by: Josh Boyer --- Documentation/devicetree/bindings/ata/fsl-sata.txt | 29 + Documentation/devicetree/bindings/eeprom.txt | 28 + .../devicetree/bindings/gpio/8xxx_gpio.txt | 60 + Documentation/devicetree/bindings/gpio/gpio.txt | 50 + Documentation/devicetree/bindings/gpio/led.txt | 58 + Documentation/devicetree/bindings/i2c/fsl-i2c.txt | 64 + Documentation/devicetree/bindings/marvell.txt | 521 +++++++ .../devicetree/bindings/mmc/fsl-esdhc.txt | 29 + .../devicetree/bindings/mmc/mmc-spi-slot.txt | 23 + .../devicetree/bindings/mtd/fsl-upm-nand.txt | 63 + .../devicetree/bindings/mtd/mtd-physmap.txt | 90 ++ .../devicetree/bindings/net/can/mpc5xxx-mscan.txt | 53 + .../devicetree/bindings/net/can/sja1000.txt | 53 + .../devicetree/bindings/net/fsl-tsec-phy.txt | 76 + .../devicetree/bindings/net/mdio-gpio.txt | 19 + Documentation/devicetree/bindings/net/phy.txt | 25 + .../devicetree/bindings/pci/83xx-512x-pci.txt | 40 + .../devicetree/bindings/powerpc/4xx/cpm.txt | 52 + .../devicetree/bindings/powerpc/4xx/emac.txt | 148 ++ .../devicetree/bindings/powerpc/4xx/ndfc.txt | 39 + .../bindings/powerpc/4xx/ppc440spe-adma.txt | 93 ++ .../devicetree/bindings/powerpc/4xx/reboot.txt | 18 + .../devicetree/bindings/powerpc/fsl/board.txt | 63 + .../devicetree/bindings/powerpc/fsl/cpm_qe/cpm.txt | 67 + .../bindings/powerpc/fsl/cpm_qe/cpm/brg.txt | 21 + .../bindings/powerpc/fsl/cpm_qe/cpm/i2c.txt | 41 + .../bindings/powerpc/fsl/cpm_qe/cpm/pic.txt | 18 + .../bindings/powerpc/fsl/cpm_qe/cpm/usb.txt | 15 + .../bindings/powerpc/fsl/cpm_qe/gpio.txt | 38 + .../bindings/powerpc/fsl/cpm_qe/network.txt | 45 + .../devicetree/bindings/powerpc/fsl/cpm_qe/qe.txt | 115 ++ .../bindings/powerpc/fsl/cpm_qe/qe/firmware.txt | 24 + .../bindings/powerpc/fsl/cpm_qe/qe/par_io.txt | 51 + .../bindings/powerpc/fsl/cpm_qe/qe/pincfg.txt | 60 + .../bindings/powerpc/fsl/cpm_qe/qe/ucc.txt | 70 + .../bindings/powerpc/fsl/cpm_qe/qe/usb.txt | 37 + .../bindings/powerpc/fsl/cpm_qe/serial.txt | 32 + .../devicetree/bindings/powerpc/fsl/diu.txt | 34 + .../devicetree/bindings/powerpc/fsl/dma.txt | 144 ++ .../devicetree/bindings/powerpc/fsl/ecm.txt | 64 + .../devicetree/bindings/powerpc/fsl/gtm.txt | 31 + .../devicetree/bindings/powerpc/fsl/guts.txt | 25 + .../devicetree/bindings/powerpc/fsl/lbc.txt | 35 + .../devicetree/bindings/powerpc/fsl/mcm.txt | 64 + .../bindings/powerpc/fsl/mcu-mpc8349emitx.txt | 17 + .../bindings/powerpc/fsl/mpc5121-psc.txt | 70 + .../devicetree/bindings/powerpc/fsl/mpc5200.txt | 198 +++ .../devicetree/bindings/powerpc/fsl/mpic.txt | 42 + .../devicetree/bindings/powerpc/fsl/msi-pic.txt | 36 + .../devicetree/bindings/powerpc/fsl/pmc.txt | 63 + .../devicetree/bindings/powerpc/fsl/sec.txt | 68 + .../devicetree/bindings/powerpc/fsl/ssi.txt | 73 + .../bindings/powerpc/nintendo/gamecube.txt | 109 ++ .../devicetree/bindings/powerpc/nintendo/wii.txt | 184 +++ Documentation/devicetree/bindings/spi/fsl-spi.txt | 53 + Documentation/devicetree/bindings/spi/spi-bus.txt | 57 + Documentation/devicetree/bindings/usb/fsl-usb.txt | 81 ++ Documentation/devicetree/bindings/usb/usb-ehci.txt | 25 + Documentation/devicetree/bindings/xilinx.txt | 306 +++++ Documentation/devicetree/booting-without-of.txt | 1447 ++++++++++++++++++++ Documentation/powerpc/booting-without-of.txt | 1447 -------------------- Documentation/powerpc/dts-bindings/4xx/cpm.txt | 52 - Documentation/powerpc/dts-bindings/4xx/emac.txt | 148 -- Documentation/powerpc/dts-bindings/4xx/ndfc.txt | 39 - .../powerpc/dts-bindings/4xx/ppc440spe-adma.txt | 93 -- Documentation/powerpc/dts-bindings/4xx/reboot.txt | 18 - Documentation/powerpc/dts-bindings/can/sja1000.txt | 53 - Documentation/powerpc/dts-bindings/ecm.txt | 64 - Documentation/powerpc/dts-bindings/eeprom.txt | 28 - .../powerpc/dts-bindings/fsl/83xx-512x-pci.txt | 40 - .../powerpc/dts-bindings/fsl/8xxx_gpio.txt | 60 - Documentation/powerpc/dts-bindings/fsl/board.txt | 63 - Documentation/powerpc/dts-bindings/fsl/can.txt | 53 - .../powerpc/dts-bindings/fsl/cpm_qe/cpm.txt | 67 - .../powerpc/dts-bindings/fsl/cpm_qe/cpm/brg.txt | 21 - .../powerpc/dts-bindings/fsl/cpm_qe/cpm/i2c.txt | 41 - .../powerpc/dts-bindings/fsl/cpm_qe/cpm/pic.txt | 18 - .../powerpc/dts-bindings/fsl/cpm_qe/cpm/usb.txt | 15 - .../powerpc/dts-bindings/fsl/cpm_qe/gpio.txt | 38 - .../powerpc/dts-bindings/fsl/cpm_qe/network.txt | 45 - .../powerpc/dts-bindings/fsl/cpm_qe/qe.txt | 115 -- .../dts-bindings/fsl/cpm_qe/qe/firmware.txt | 24 - .../powerpc/dts-bindings/fsl/cpm_qe/qe/par_io.txt | 51 - .../powerpc/dts-bindings/fsl/cpm_qe/qe/pincfg.txt | 60 - .../powerpc/dts-bindings/fsl/cpm_qe/qe/ucc.txt | 70 - .../powerpc/dts-bindings/fsl/cpm_qe/qe/usb.txt | 37 - .../powerpc/dts-bindings/fsl/cpm_qe/serial.txt | 32 - Documentation/powerpc/dts-bindings/fsl/diu.txt | 34 - Documentation/powerpc/dts-bindings/fsl/dma.txt | 144 -- Documentation/powerpc/dts-bindings/fsl/esdhc.txt | 29 - Documentation/powerpc/dts-bindings/fsl/gtm.txt | 31 - Documentation/powerpc/dts-bindings/fsl/guts.txt | 25 - Documentation/powerpc/dts-bindings/fsl/i2c.txt | 64 - Documentation/powerpc/dts-bindings/fsl/lbc.txt | 35 - Documentation/powerpc/dts-bindings/fsl/mcm.txt | 64 - .../powerpc/dts-bindings/fsl/mcu-mpc8349emitx.txt | 17 - .../powerpc/dts-bindings/fsl/mpc5121-psc.txt | 70 - Documentation/powerpc/dts-bindings/fsl/mpc5200.txt | 198 --- Documentation/powerpc/dts-bindings/fsl/mpic.txt | 42 - Documentation/powerpc/dts-bindings/fsl/msi-pic.txt | 36 - Documentation/powerpc/dts-bindings/fsl/pmc.txt | 63 - Documentation/powerpc/dts-bindings/fsl/sata.txt | 29 - Documentation/powerpc/dts-bindings/fsl/sec.txt | 68 - Documentation/powerpc/dts-bindings/fsl/spi.txt | 53 - Documentation/powerpc/dts-bindings/fsl/ssi.txt | 73 - Documentation/powerpc/dts-bindings/fsl/tsec.txt | 76 - .../powerpc/dts-bindings/fsl/upm-nand.txt | 63 - Documentation/powerpc/dts-bindings/fsl/usb.txt | 81 -- Documentation/powerpc/dts-bindings/gpio/gpio.txt | 50 - Documentation/powerpc/dts-bindings/gpio/led.txt | 58 - Documentation/powerpc/dts-bindings/gpio/mdio.txt | 19 - Documentation/powerpc/dts-bindings/marvell.txt | 521 ------- .../powerpc/dts-bindings/mmc-spi-slot.txt | 23 - Documentation/powerpc/dts-bindings/mtd-physmap.txt | 90 -- .../powerpc/dts-bindings/nintendo/gamecube.txt | 109 -- .../powerpc/dts-bindings/nintendo/wii.txt | 184 --- Documentation/powerpc/dts-bindings/phy.txt | 25 - Documentation/powerpc/dts-bindings/spi-bus.txt | 57 - Documentation/powerpc/dts-bindings/usb-ehci.txt | 25 - Documentation/powerpc/dts-bindings/xilinx.txt | 306 ----- 120 files changed, 5554 insertions(+), 5554 deletions(-) create mode 100644 Documentation/devicetree/bindings/ata/fsl-sata.txt create mode 100644 Documentation/devicetree/bindings/eeprom.txt create mode 100644 Documentation/devicetree/bindings/gpio/8xxx_gpio.txt create mode 100644 Documentation/devicetree/bindings/gpio/gpio.txt create mode 100644 Documentation/devicetree/bindings/gpio/led.txt create mode 100644 Documentation/devicetree/bindings/i2c/fsl-i2c.txt create mode 100644 Documentation/devicetree/bindings/marvell.txt create mode 100644 Documentation/devicetree/bindings/mmc/fsl-esdhc.txt create mode 100644 Documentation/devicetree/bindings/mmc/mmc-spi-slot.txt create mode 100644 Documentation/devicetree/bindings/mtd/fsl-upm-nand.txt create mode 100644 Documentation/devicetree/bindings/mtd/mtd-physmap.txt create mode 100644 Documentation/devicetree/bindings/net/can/mpc5xxx-mscan.txt create mode 100644 Documentation/devicetree/bindings/net/can/sja1000.txt create mode 100644 Documentation/devicetree/bindings/net/fsl-tsec-phy.txt create mode 100644 Documentation/devicetree/bindings/net/mdio-gpio.txt create mode 100644 Documentation/devicetree/bindings/net/phy.txt create mode 100644 Documentation/devicetree/bindings/pci/83xx-512x-pci.txt create mode 100644 Documentation/devicetree/bindings/powerpc/4xx/cpm.txt create mode 100644 Documentation/devicetree/bindings/powerpc/4xx/emac.txt create mode 100644 Documentation/devicetree/bindings/powerpc/4xx/ndfc.txt create mode 100644 Documentation/devicetree/bindings/powerpc/4xx/ppc440spe-adma.txt create mode 100644 Documentation/devicetree/bindings/powerpc/4xx/reboot.txt create mode 100644 Documentation/devicetree/bindings/powerpc/fsl/board.txt create mode 100644 Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/cpm.txt create mode 100644 Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/cpm/brg.txt create mode 100644 Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/cpm/i2c.txt create mode 100644 Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/cpm/pic.txt create mode 100644 Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/cpm/usb.txt create mode 100644 Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/gpio.txt create mode 100644 Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/network.txt create mode 100644 Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/qe.txt create mode 100644 Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/qe/firmware.txt create mode 100644 Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/qe/par_io.txt create mode 100644 Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/qe/pincfg.txt create mode 100644 Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/qe/ucc.txt create mode 100644 Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/qe/usb.txt create mode 100644 Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/serial.txt create mode 100644 Documentation/devicetree/bindings/powerpc/fsl/diu.txt create mode 100644 Documentation/devicetree/bindings/powerpc/fsl/dma.txt create mode 100644 Documentation/devicetree/bindings/powerpc/fsl/ecm.txt create mode 100644 Documentation/devicetree/bindings/powerpc/fsl/gtm.txt create mode 100644 Documentation/devicetree/bindings/powerpc/fsl/guts.txt create mode 100644 Documentation/devicetree/bindings/powerpc/fsl/lbc.txt create mode 100644 Documentation/devicetree/bindings/powerpc/fsl/mcm.txt create mode 100644 Documentation/devicetree/bindings/powerpc/fsl/mcu-mpc8349emitx.txt create mode 100644 Documentation/devicetree/bindings/powerpc/fsl/mpc5121-psc.txt create mode 100644 Documentation/devicetree/bindings/powerpc/fsl/mpc5200.txt create mode 100644 Documentation/devicetree/bindings/powerpc/fsl/mpic.txt create mode 100644 Documentation/devicetree/bindings/powerpc/fsl/msi-pic.txt create mode 100644 Documentation/devicetree/bindings/powerpc/fsl/pmc.txt create mode 100644 Documentation/devicetree/bindings/powerpc/fsl/sec.txt create mode 100644 Documentation/devicetree/bindings/powerpc/fsl/ssi.txt create mode 100644 Documentation/devicetree/bindings/powerpc/nintendo/gamecube.txt create mode 100644 Documentation/devicetree/bindings/powerpc/nintendo/wii.txt create mode 100644 Documentation/devicetree/bindings/spi/fsl-spi.txt create mode 100644 Documentation/devicetree/bindings/spi/spi-bus.txt create mode 100644 Documentation/devicetree/bindings/usb/fsl-usb.txt create mode 100644 Documentation/devicetree/bindings/usb/usb-ehci.txt create mode 100644 Documentation/devicetree/bindings/xilinx.txt create mode 100644 Documentation/devicetree/booting-without-of.txt delete mode 100644 Documentation/powerpc/booting-without-of.txt delete mode 100644 Documentation/powerpc/dts-bindings/4xx/cpm.txt delete mode 100644 Documentation/powerpc/dts-bindings/4xx/emac.txt delete mode 100644 Documentation/powerpc/dts-bindings/4xx/ndfc.txt delete mode 100644 Documentation/powerpc/dts-bindings/4xx/ppc440spe-adma.txt delete mode 100644 Documentation/powerpc/dts-bindings/4xx/reboot.txt delete mode 100644 Documentation/powerpc/dts-bindings/can/sja1000.txt delete mode 100644 Documentation/powerpc/dts-bindings/ecm.txt delete mode 100644 Documentation/powerpc/dts-bindings/eeprom.txt delete mode 100644 Documentation/powerpc/dts-bindings/fsl/83xx-512x-pci.txt delete mode 100644 Documentation/powerpc/dts-bindings/fsl/8xxx_gpio.txt delete mode 100644 Documentation/powerpc/dts-bindings/fsl/board.txt delete mode 100644 Documentation/powerpc/dts-bindings/fsl/can.txt delete mode 100644 Documentation/powerpc/dts-bindings/fsl/cpm_qe/cpm.txt delete mode 100644 Documentation/powerpc/dts-bindings/fsl/cpm_qe/cpm/brg.txt delete mode 100644 Documentation/powerpc/dts-bindings/fsl/cpm_qe/cpm/i2c.txt delete mode 100644 Documentation/powerpc/dts-bindings/fsl/cpm_qe/cpm/pic.txt delete mode 100644 Documentation/powerpc/dts-bindings/fsl/cpm_qe/cpm/usb.txt delete mode 100644 Documentation/powerpc/dts-bindings/fsl/cpm_qe/gpio.txt delete mode 100644 Documentation/powerpc/dts-bindings/fsl/cpm_qe/network.txt delete mode 100644 Documentation/powerpc/dts-bindings/fsl/cpm_qe/qe.txt delete mode 100644 Documentation/powerpc/dts-bindings/fsl/cpm_qe/qe/firmware.txt delete mode 100644 Documentation/powerpc/dts-bindings/fsl/cpm_qe/qe/par_io.txt delete mode 100644 Documentation/powerpc/dts-bindings/fsl/cpm_qe/qe/pincfg.txt delete mode 100644 Documentation/powerpc/dts-bindings/fsl/cpm_qe/qe/ucc.txt delete mode 100644 Documentation/powerpc/dts-bindings/fsl/cpm_qe/qe/usb.txt delete mode 100644 Documentation/powerpc/dts-bindings/fsl/cpm_qe/serial.txt delete mode 100644 Documentation/powerpc/dts-bindings/fsl/diu.txt delete mode 100644 Documentation/powerpc/dts-bindings/fsl/dma.txt delete mode 100644 Documentation/powerpc/dts-bindings/fsl/esdhc.txt delete mode 100644 Documentation/powerpc/dts-bindings/fsl/gtm.txt delete mode 100644 Documentation/powerpc/dts-bindings/fsl/guts.txt delete mode 100644 Documentation/powerpc/dts-bindings/fsl/i2c.txt delete mode 100644 Documentation/powerpc/dts-bindings/fsl/lbc.txt delete mode 100644 Documentation/powerpc/dts-bindings/fsl/mcm.txt delete mode 100644 Documentation/powerpc/dts-bindings/fsl/mcu-mpc8349emitx.txt delete mode 100644 Documentation/powerpc/dts-bindings/fsl/mpc5121-psc.txt delete mode 100644 Documentation/powerpc/dts-bindings/fsl/mpc5200.txt delete mode 100644 Documentation/powerpc/dts-bindings/fsl/mpic.txt delete mode 100644 Documentation/powerpc/dts-bindings/fsl/msi-pic.txt delete mode 100644 Documentation/powerpc/dts-bindings/fsl/pmc.txt delete mode 100644 Documentation/powerpc/dts-bindings/fsl/sata.txt delete mode 100644 Documentation/powerpc/dts-bindings/fsl/sec.txt delete mode 100644 Documentation/powerpc/dts-bindings/fsl/spi.txt delete mode 100644 Documentation/powerpc/dts-bindings/fsl/ssi.txt delete mode 100644 Documentation/powerpc/dts-bindings/fsl/tsec.txt delete mode 100644 Documentation/powerpc/dts-bindings/fsl/upm-nand.txt delete mode 100644 Documentation/powerpc/dts-bindings/fsl/usb.txt delete mode 100644 Documentation/powerpc/dts-bindings/gpio/gpio.txt delete mode 100644 Documentation/powerpc/dts-bindings/gpio/led.txt delete mode 100644 Documentation/powerpc/dts-bindings/gpio/mdio.txt delete mode 100644 Documentation/powerpc/dts-bindings/marvell.txt delete mode 100644 Documentation/powerpc/dts-bindings/mmc-spi-slot.txt delete mode 100644 Documentation/powerpc/dts-bindings/mtd-physmap.txt delete mode 100644 Documentation/powerpc/dts-bindings/nintendo/gamecube.txt delete mode 100644 Documentation/powerpc/dts-bindings/nintendo/wii.txt delete mode 100644 Documentation/powerpc/dts-bindings/phy.txt delete mode 100644 Documentation/powerpc/dts-bindings/spi-bus.txt delete mode 100644 Documentation/powerpc/dts-bindings/usb-ehci.txt delete mode 100644 Documentation/powerpc/dts-bindings/xilinx.txt (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/ata/fsl-sata.txt b/Documentation/devicetree/bindings/ata/fsl-sata.txt new file mode 100644 index 000000000000..b46bcf46c3d8 --- /dev/null +++ b/Documentation/devicetree/bindings/ata/fsl-sata.txt @@ -0,0 +1,29 @@ +* Freescale 8xxx/3.0 Gb/s SATA nodes + +SATA nodes are defined to describe on-chip Serial ATA controllers. +Each SATA port should have its own node. + +Required properties: +- compatible : compatible list, contains 2 entries, first is + "fsl,CHIP-sata", where CHIP is the processor + (mpc8315, mpc8379, etc.) and the second is + "fsl,pq-sata" +- interrupts : +- cell-index : controller index. + 1 for controller @ 0x18000 + 2 for controller @ 0x19000 + 3 for controller @ 0x1a000 + 4 for controller @ 0x1b000 + +Optional properties: +- interrupt-parent : optional, if needed for interrupt mapping +- reg : + +Example: + sata@18000 { + compatible = "fsl,mpc8379-sata", "fsl,pq-sata"; + reg = <0x18000 0x1000>; + cell-index = <1>; + interrupts = <2c 8>; + interrupt-parent = < &ipic >; + }; diff --git a/Documentation/devicetree/bindings/eeprom.txt b/Documentation/devicetree/bindings/eeprom.txt new file mode 100644 index 000000000000..4342c10de1bf --- /dev/null +++ b/Documentation/devicetree/bindings/eeprom.txt @@ -0,0 +1,28 @@ +EEPROMs (I2C) + +Required properties: + + - compatible : should be "," + If there is no specific driver for , a generic + driver based on is selected. Possible types are: + 24c00, 24c01, 24c02, 24c04, 24c08, 24c16, 24c32, 24c64, + 24c128, 24c256, 24c512, 24c1024, spd + + - reg : the I2C address of the EEPROM + +Optional properties: + + - pagesize : the length of the pagesize for writing. Please consult the + manual of your device, that value varies a lot. A wrong value + may result in data loss! If not specified, a safety value of + '1' is used which will be very slow. + + - read-only: this parameterless property disables writes to the eeprom + +Example: + +eeprom@52 { + compatible = "atmel,24c32"; + reg = <0x52>; + pagesize = <32>; +}; diff --git a/Documentation/devicetree/bindings/gpio/8xxx_gpio.txt b/Documentation/devicetree/bindings/gpio/8xxx_gpio.txt new file mode 100644 index 000000000000..b0019eb5330e --- /dev/null +++ b/Documentation/devicetree/bindings/gpio/8xxx_gpio.txt @@ -0,0 +1,60 @@ +GPIO controllers on MPC8xxx SoCs + +This is for the non-QE/CPM/GUTs GPIO controllers as found on +8349, 8572, 8610 and compatible. + +Every GPIO controller node must have #gpio-cells property defined, +this information will be used to translate gpio-specifiers. + +Required properties: +- compatible : "fsl,-gpio" followed by "fsl,mpc8349-gpio" for + 83xx, "fsl,mpc8572-gpio" for 85xx and "fsl,mpc8610-gpio" for 86xx. +- #gpio-cells : Should be two. The first cell is the pin number and the + second cell is used to specify optional parameters (currently unused). + - interrupts : Interrupt mapping for GPIO IRQ. + - interrupt-parent : Phandle for the interrupt controller that + services interrupts for this device. +- gpio-controller : Marks the port as GPIO controller. + +Example of gpio-controller nodes for a MPC8347 SoC: + + gpio1: gpio-controller@c00 { + #gpio-cells = <2>; + compatible = "fsl,mpc8347-gpio", "fsl,mpc8349-gpio"; + reg = <0xc00 0x100>; + interrupts = <74 0x8>; + interrupt-parent = <&ipic>; + gpio-controller; + }; + + gpio2: gpio-controller@d00 { + #gpio-cells = <2>; + compatible = "fsl,mpc8347-gpio", "fsl,mpc8349-gpio"; + reg = <0xd00 0x100>; + interrupts = <75 0x8>; + interrupt-parent = <&ipic>; + gpio-controller; + }; + +See booting-without-of.txt for details of how to specify GPIO +information for devices. + +To use GPIO pins as interrupt sources for peripherals, specify the +GPIO controller as the interrupt parent and define GPIO number + +trigger mode using the interrupts property, which is defined like +this: + +interrupts = , where: + - number: GPIO pin (0..31) + - trigger: trigger mode: + 2 = trigger on falling edge + 3 = trigger on both edges + +Example of device using this is: + + funkyfpga@0 { + compatible = "funky-fpga"; + ... + interrupts = <4 3>; + interrupt-parent = <&gpio1>; + }; diff --git a/Documentation/devicetree/bindings/gpio/gpio.txt b/Documentation/devicetree/bindings/gpio/gpio.txt new file mode 100644 index 000000000000..edaa84d288a1 --- /dev/null +++ b/Documentation/devicetree/bindings/gpio/gpio.txt @@ -0,0 +1,50 @@ +Specifying GPIO information for devices +============================================ + +1) gpios property +----------------- + +Nodes that makes use of GPIOs should define them using `gpios' property, +format of which is: <&gpio-controller1-phandle gpio1-specifier + &gpio-controller2-phandle gpio2-specifier + 0 /* holes are permitted, means no GPIO 3 */ + &gpio-controller4-phandle gpio4-specifier + ...>; + +Note that gpio-specifier length is controller dependent. + +gpio-specifier may encode: bank, pin position inside the bank, +whether pin is open-drain and whether pin is logically inverted. + +Example of the node using GPIOs: + + node { + gpios = <&qe_pio_e 18 0>; + }; + +In this example gpio-specifier is "18 0" and encodes GPIO pin number, +and empty GPIO flags as accepted by the "qe_pio_e" gpio-controller. + +2) gpio-controller nodes +------------------------ + +Every GPIO controller node must have #gpio-cells property defined, +this information will be used to translate gpio-specifiers. + +Example of two SOC GPIO banks defined as gpio-controller nodes: + + qe_pio_a: gpio-controller@1400 { + #gpio-cells = <2>; + compatible = "fsl,qe-pario-bank-a", "fsl,qe-pario-bank"; + reg = <0x1400 0x18>; + gpio-controller; + }; + + qe_pio_e: gpio-controller@1460 { + #gpio-cells = <2>; + compatible = "fsl,qe-pario-bank-e", "fsl,qe-pario-bank"; + reg = <0x1460 0x18>; + gpio-controller; + }; + + diff --git a/Documentation/devicetree/bindings/gpio/led.txt b/Documentation/devicetree/bindings/gpio/led.txt new file mode 100644 index 000000000000..064db928c3c1 --- /dev/null +++ b/Documentation/devicetree/bindings/gpio/led.txt @@ -0,0 +1,58 @@ +LEDs connected to GPIO lines + +Required properties: +- compatible : should be "gpio-leds". + +Each LED is represented as a sub-node of the gpio-leds device. Each +node's name represents the name of the corresponding LED. + +LED sub-node properties: +- gpios : Should specify the LED's GPIO, see "Specifying GPIO information + for devices" in Documentation/powerpc/booting-without-of.txt. Active + low LEDs should be indicated using flags in the GPIO specifier. +- label : (optional) The label for this LED. If omitted, the label is + taken from the node name (excluding the unit address). +- linux,default-trigger : (optional) This parameter, if present, is a + string defining the trigger assigned to the LED. Current triggers are: + "backlight" - LED will act as a back-light, controlled by the framebuffer + system + "default-on" - LED will turn on, but see "default-state" below + "heartbeat" - LED "double" flashes at a load average based rate + "ide-disk" - LED indicates disk activity + "timer" - LED flashes at a fixed, configurable rate +- default-state: (optional) The initial state of the LED. Valid + values are "on", "off", and "keep". If the LED is already on or off + and the default-state property is set the to same value, then no + glitch should be produced where the LED momentarily turns off (or + on). The "keep" setting will keep the LED at whatever its current + state is, without producing a glitch. The default is off if this + property is not present. + +Examples: + +leds { + compatible = "gpio-leds"; + hdd { + label = "IDE Activity"; + gpios = <&mcu_pio 0 1>; /* Active low */ + linux,default-trigger = "ide-disk"; + }; + + fault { + gpios = <&mcu_pio 1 0>; + /* Keep LED on if BIOS detected hardware fault */ + default-state = "keep"; + }; +}; + +run-control { + compatible = "gpio-leds"; + red { + gpios = <&mpc8572 6 0>; + default-state = "off"; + }; + green { + gpios = <&mpc8572 7 0>; + default-state = "on"; + }; +} diff --git a/Documentation/devicetree/bindings/i2c/fsl-i2c.txt b/Documentation/devicetree/bindings/i2c/fsl-i2c.txt new file mode 100644 index 000000000000..1eacd6b20ed5 --- /dev/null +++ b/Documentation/devicetree/bindings/i2c/fsl-i2c.txt @@ -0,0 +1,64 @@ +* I2C + +Required properties : + + - reg : Offset and length of the register set for the device + - compatible : should be "fsl,CHIP-i2c" where CHIP is the name of a + compatible processor, e.g. mpc8313, mpc8543, mpc8544, mpc5121, + mpc5200 or mpc5200b. For the mpc5121, an additional node + "fsl,mpc5121-i2c-ctrl" is required as shown in the example below. + +Recommended properties : + + - interrupts : where a is the interrupt number and b is a + field that represents an encoding of the sense and level + information for the interrupt. This should be encoded based on + the information in section 2) depending on the type of interrupt + controller you have. + - interrupt-parent : the phandle for the interrupt controller that + services interrupts for this device. + - fsl,preserve-clocking : boolean; if defined, the clock settings + from the bootloader are preserved (not touched). + - clock-frequency : desired I2C bus clock frequency in Hz. + - fsl,timeout : I2C bus timeout in microseconds. + +Examples : + + /* MPC5121 based board */ + i2c@1740 { + #address-cells = <1>; + #size-cells = <0>; + compatible = "fsl,mpc5121-i2c", "fsl-i2c"; + reg = <0x1740 0x20>; + interrupts = <11 0x8>; + interrupt-parent = <&ipic>; + clock-frequency = <100000>; + }; + + i2ccontrol@1760 { + compatible = "fsl,mpc5121-i2c-ctrl"; + reg = <0x1760 0x8>; + }; + + /* MPC5200B based board */ + i2c@3d00 { + #address-cells = <1>; + #size-cells = <0>; + compatible = "fsl,mpc5200b-i2c","fsl,mpc5200-i2c","fsl-i2c"; + reg = <0x3d00 0x40>; + interrupts = <2 15 0>; + interrupt-parent = <&mpc5200_pic>; + fsl,preserve-clocking; + }; + + /* MPC8544 base board */ + i2c@3100 { + #address-cells = <1>; + #size-cells = <0>; + compatible = "fsl,mpc8544-i2c", "fsl-i2c"; + reg = <0x3100 0x100>; + interrupts = <43 2>; + interrupt-parent = <&mpic>; + clock-frequency = <400000>; + fsl,timeout = <10000>; + }; diff --git a/Documentation/devicetree/bindings/marvell.txt b/Documentation/devicetree/bindings/marvell.txt new file mode 100644 index 000000000000..f1533d91953a --- /dev/null +++ b/Documentation/devicetree/bindings/marvell.txt @@ -0,0 +1,521 @@ +Marvell Discovery mv64[345]6x System Controller chips +=========================================================== + +The Marvell mv64[345]60 series of system controller chips contain +many of the peripherals needed to implement a complete computer +system. In this section, we define device tree nodes to describe +the system controller chip itself and each of the peripherals +which it contains. Compatible string values for each node are +prefixed with the string "marvell,", for Marvell Technology Group Ltd. + +1) The /system-controller node + + This node is used to represent the system-controller and must be + present when the system uses a system controller chip. The top-level + system-controller node contains information that is global to all + devices within the system controller chip. The node name begins + with "system-controller" followed by the unit address, which is + the base address of the memory-mapped register set for the system + controller chip. + + Required properties: + + - ranges : Describes the translation of system controller addresses + for memory mapped registers. + - clock-frequency: Contains the main clock frequency for the system + controller chip. + - reg : This property defines the address and size of the + memory-mapped registers contained within the system controller + chip. The address specified in the "reg" property should match + the unit address of the system-controller node. + - #address-cells : Address representation for system controller + devices. This field represents the number of cells needed to + represent the address of the memory-mapped registers of devices + within the system controller chip. + - #size-cells : Size representation for the memory-mapped + registers within the system controller chip. + - #interrupt-cells : Defines the width of cells used to represent + interrupts. + + Optional properties: + + - model : The specific model of the system controller chip. Such + as, "mv64360", "mv64460", or "mv64560". + - compatible : A string identifying the compatibility identifiers + of the system controller chip. + + The system-controller node contains child nodes for each system + controller device that the platform uses. Nodes should not be created + for devices which exist on the system controller chip but are not used + + Example Marvell Discovery mv64360 system-controller node: + + system-controller@f1000000 { /* Marvell Discovery mv64360 */ + #address-cells = <1>; + #size-cells = <1>; + model = "mv64360"; /* Default */ + compatible = "marvell,mv64360"; + clock-frequency = <133333333>; + reg = <0xf1000000 0x10000>; + virtual-reg = <0xf1000000>; + ranges = <0x88000000 0x88000000 0x1000000 /* PCI 0 I/O Space */ + 0x80000000 0x80000000 0x8000000 /* PCI 0 MEM Space */ + 0xa0000000 0xa0000000 0x4000000 /* User FLASH */ + 0x00000000 0xf1000000 0x0010000 /* Bridge's regs */ + 0xf2000000 0xf2000000 0x0040000>;/* Integrated SRAM */ + + [ child node definitions... ] + } + +2) Child nodes of /system-controller + + a) Marvell Discovery MDIO bus + + The MDIO is a bus to which the PHY devices are connected. For each + device that exists on this bus, a child node should be created. See + the definition of the PHY node below for an example of how to define + a PHY. + + Required properties: + - #address-cells : Should be <1> + - #size-cells : Should be <0> + - device_type : Should be "mdio" + - compatible : Should be "marvell,mv64360-mdio" + + Example: + + mdio { + #address-cells = <1>; + #size-cells = <0>; + device_type = "mdio"; + compatible = "marvell,mv64360-mdio"; + + ethernet-phy@0 { + ...... + }; + }; + + + b) Marvell Discovery ethernet controller + + The Discover ethernet controller is described with two levels + of nodes. The first level describes an ethernet silicon block + and the second level describes up to 3 ethernet nodes within + that block. The reason for the multiple levels is that the + registers for the node are interleaved within a single set + of registers. The "ethernet-block" level describes the + shared register set, and the "ethernet" nodes describe ethernet + port-specific properties. + + Ethernet block node + + Required properties: + - #address-cells : <1> + - #size-cells : <0> + - compatible : "marvell,mv64360-eth-block" + - reg : Offset and length of the register set for this block + + Example Discovery Ethernet block node: + ethernet-block@2000 { + #address-cells = <1>; + #size-cells = <0>; + compatible = "marvell,mv64360-eth-block"; + reg = <0x2000 0x2000>; + ethernet@0 { + ....... + }; + }; + + Ethernet port node + + Required properties: + - device_type : Should be "network". + - compatible : Should be "marvell,mv64360-eth". + - reg : Should be <0>, <1>, or <2>, according to which registers + within the silicon block the device uses. + - interrupts : where a is the interrupt number for the port. + - interrupt-parent : the phandle for the interrupt controller + that services interrupts for this device. + - phy : the phandle for the PHY connected to this ethernet + controller. + - local-mac-address : 6 bytes, MAC address + + Example Discovery Ethernet port node: + ethernet@0 { + device_type = "network"; + compatible = "marvell,mv64360-eth"; + reg = <0>; + interrupts = <32>; + interrupt-parent = <&PIC>; + phy = <&PHY0>; + local-mac-address = [ 00 00 00 00 00 00 ]; + }; + + + + c) Marvell Discovery PHY nodes + + Required properties: + - device_type : Should be "ethernet-phy" + - interrupts : where a is the interrupt number for this phy. + - interrupt-parent : the phandle for the interrupt controller that + services interrupts for this device. + - reg : The ID number for the phy, usually a small integer + + Example Discovery PHY node: + ethernet-phy@1 { + device_type = "ethernet-phy"; + compatible = "broadcom,bcm5421"; + interrupts = <76>; /* GPP 12 */ + interrupt-parent = <&PIC>; + reg = <1>; + }; + + + d) Marvell Discovery SDMA nodes + + Represent DMA hardware associated with the MPSC (multiprotocol + serial controllers). + + Required properties: + - compatible : "marvell,mv64360-sdma" + - reg : Offset and length of the register set for this device + - interrupts : where a is the interrupt number for the DMA + device. + - interrupt-parent : the phandle for the interrupt controller + that services interrupts for this device. + + Example Discovery SDMA node: + sdma@4000 { + compatible = "marvell,mv64360-sdma"; + reg = <0x4000 0xc18>; + virtual-reg = <0xf1004000>; + interrupts = <36>; + interrupt-parent = <&PIC>; + }; + + + e) Marvell Discovery BRG nodes + + Represent baud rate generator hardware associated with the MPSC + (multiprotocol serial controllers). + + Required properties: + - compatible : "marvell,mv64360-brg" + - reg : Offset and length of the register set for this device + - clock-src : A value from 0 to 15 which selects the clock + source for the baud rate generator. This value corresponds + to the CLKS value in the BRGx configuration register. See + the mv64x60 User's Manual. + - clock-frequence : The frequency (in Hz) of the baud rate + generator's input clock. + - current-speed : The current speed setting (presumably by + firmware) of the baud rate generator. + + Example Discovery BRG node: + brg@b200 { + compatible = "marvell,mv64360-brg"; + reg = <0xb200 0x8>; + clock-src = <8>; + clock-frequency = <133333333>; + current-speed = <9600>; + }; + + + f) Marvell Discovery CUNIT nodes + + Represent the Serial Communications Unit device hardware. + + Required properties: + - reg : Offset and length of the register set for this device + + Example Discovery CUNIT node: + cunit@f200 { + reg = <0xf200 0x200>; + }; + + + g) Marvell Discovery MPSCROUTING nodes + + Represent the Discovery's MPSC routing hardware + + Required properties: + - reg : Offset and length of the register set for this device + + Example Discovery CUNIT node: + mpscrouting@b500 { + reg = <0xb400 0xc>; + }; + + + h) Marvell Discovery MPSCINTR nodes + + Represent the Discovery's MPSC DMA interrupt hardware registers + (SDMA cause and mask registers). + + Required properties: + - reg : Offset and length of the register set for this device + + Example Discovery MPSCINTR node: + mpsintr@b800 { + reg = <0xb800 0x100>; + }; + + + i) Marvell Discovery MPSC nodes + + Represent the Discovery's MPSC (Multiprotocol Serial Controller) + serial port. + + Required properties: + - device_type : "serial" + - compatible : "marvell,mv64360-mpsc" + - reg : Offset and length of the register set for this device + - sdma : the phandle for the SDMA node used by this port + - brg : the phandle for the BRG node used by this port + - cunit : the phandle for the CUNIT node used by this port + - mpscrouting : the phandle for the MPSCROUTING node used by this port + - mpscintr : the phandle for the MPSCINTR node used by this port + - cell-index : the hardware index of this cell in the MPSC core + - max_idle : value needed for MPSC CHR3 (Maximum Frame Length) + register + - interrupts : where a is the interrupt number for the MPSC. + - interrupt-parent : the phandle for the interrupt controller + that services interrupts for this device. + + Example Discovery MPSCINTR node: + mpsc@8000 { + device_type = "serial"; + compatible = "marvell,mv64360-mpsc"; + reg = <0x8000 0x38>; + virtual-reg = <0xf1008000>; + sdma = <&SDMA0>; + brg = <&BRG0>; + cunit = <&CUNIT>; + mpscrouting = <&MPSCROUTING>; + mpscintr = <&MPSCINTR>; + cell-index = <0>; + max_idle = <40>; + interrupts = <40>; + interrupt-parent = <&PIC>; + }; + + + j) Marvell Discovery Watch Dog Timer nodes + + Represent the Discovery's watchdog timer hardware + + Required properties: + - compatible : "marvell,mv64360-wdt" + - reg : Offset and length of the register set for this device + + Example Discovery Watch Dog Timer node: + wdt@b410 { + compatible = "marvell,mv64360-wdt"; + reg = <0xb410 0x8>; + }; + + + k) Marvell Discovery I2C nodes + + Represent the Discovery's I2C hardware + + Required properties: + - device_type : "i2c" + - compatible : "marvell,mv64360-i2c" + - reg : Offset and length of the register set for this device + - interrupts : where a is the interrupt number for the I2C. + - interrupt-parent : the phandle for the interrupt controller + that services interrupts for this device. + + Example Discovery I2C node: + compatible = "marvell,mv64360-i2c"; + reg = <0xc000 0x20>; + virtual-reg = <0xf100c000>; + interrupts = <37>; + interrupt-parent = <&PIC>; + }; + + + l) Marvell Discovery PIC (Programmable Interrupt Controller) nodes + + Represent the Discovery's PIC hardware + + Required properties: + - #interrupt-cells : <1> + - #address-cells : <0> + - compatible : "marvell,mv64360-pic" + - reg : Offset and length of the register set for this device + - interrupt-controller + + Example Discovery PIC node: + pic { + #interrupt-cells = <1>; + #address-cells = <0>; + compatible = "marvell,mv64360-pic"; + reg = <0x0 0x88>; + interrupt-controller; + }; + + + m) Marvell Discovery MPP (Multipurpose Pins) multiplexing nodes + + Represent the Discovery's MPP hardware + + Required properties: + - compatible : "marvell,mv64360-mpp" + - reg : Offset and length of the register set for this device + + Example Discovery MPP node: + mpp@f000 { + compatible = "marvell,mv64360-mpp"; + reg = <0xf000 0x10>; + }; + + + n) Marvell Discovery GPP (General Purpose Pins) nodes + + Represent the Discovery's GPP hardware + + Required properties: + - compatible : "marvell,mv64360-gpp" + - reg : Offset and length of the register set for this device + + Example Discovery GPP node: + gpp@f000 { + compatible = "marvell,mv64360-gpp"; + reg = <0xf100 0x20>; + }; + + + o) Marvell Discovery PCI host bridge node + + Represents the Discovery's PCI host bridge device. The properties + for this node conform to Rev 2.1 of the PCI Bus Binding to IEEE + 1275-1994. A typical value for the compatible property is + "marvell,mv64360-pci". + + Example Discovery PCI host bridge node + pci@80000000 { + #address-cells = <3>; + #size-cells = <2>; + #interrupt-cells = <1>; + device_type = "pci"; + compatible = "marvell,mv64360-pci"; + reg = <0xcf8 0x8>; + ranges = <0x01000000 0x0 0x0 + 0x88000000 0x0 0x01000000 + 0x02000000 0x0 0x80000000 + 0x80000000 0x0 0x08000000>; + bus-range = <0 255>; + clock-frequency = <66000000>; + interrupt-parent = <&PIC>; + interrupt-map-mask = <0xf800 0x0 0x0 0x7>; + interrupt-map = < + /* IDSEL 0x0a */ + 0x5000 0 0 1 &PIC 80 + 0x5000 0 0 2 &PIC 81 + 0x5000 0 0 3 &PIC 91 + 0x5000 0 0 4 &PIC 93 + + /* IDSEL 0x0b */ + 0x5800 0 0 1 &PIC 91 + 0x5800 0 0 2 &PIC 93 + 0x5800 0 0 3 &PIC 80 + 0x5800 0 0 4 &PIC 81 + + /* IDSEL 0x0c */ + 0x6000 0 0 1 &PIC 91 + 0x6000 0 0 2 &PIC 93 + 0x6000 0 0 3 &PIC 80 + 0x6000 0 0 4 &PIC 81 + + /* IDSEL 0x0d */ + 0x6800 0 0 1 &PIC 93 + 0x6800 0 0 2 &PIC 80 + 0x6800 0 0 3 &PIC 81 + 0x6800 0 0 4 &PIC 91 + >; + }; + + + p) Marvell Discovery CPU Error nodes + + Represent the Discovery's CPU error handler device. + + Required properties: + - compatible : "marvell,mv64360-cpu-error" + - reg : Offset and length of the register set for this device + - interrupts : the interrupt number for this device + - interrupt-parent : the phandle for the interrupt controller + that services interrupts for this device. + + Example Discovery CPU Error node: + cpu-error@0070 { + compatible = "marvell,mv64360-cpu-error"; + reg = <0x70 0x10 0x128 0x28>; + interrupts = <3>; + interrupt-parent = <&PIC>; + }; + + + q) Marvell Discovery SRAM Controller nodes + + Represent the Discovery's SRAM controller device. + + Required properties: + - compatible : "marvell,mv64360-sram-ctrl" + - reg : Offset and length of the register set for this device + - interrupts : the interrupt number for this device + - interrupt-parent : the phandle for the interrupt controller + that services interrupts for this device. + + Example Discovery SRAM Controller node: + sram-ctrl@0380 { + compatible = "marvell,mv64360-sram-ctrl"; + reg = <0x380 0x80>; + interrupts = <13>; + interrupt-parent = <&PIC>; + }; + + + r) Marvell Discovery PCI Error Handler nodes + + Represent the Discovery's PCI error handler device. + + Required properties: + - compatible : "marvell,mv64360-pci-error" + - reg : Offset and length of the register set for this device + - interrupts : the interrupt number for this device + - interrupt-parent : the phandle for the interrupt controller + that services interrupts for this device. + + Example Discovery PCI Error Handler node: + pci-error@1d40 { + compatible = "marvell,mv64360-pci-error"; + reg = <0x1d40 0x40 0xc28 0x4>; + interrupts = <12>; + interrupt-parent = <&PIC>; + }; + + + s) Marvell Discovery Memory Controller nodes + + Represent the Discovery's memory controller device. + + Required properties: + - compatible : "marvell,mv64360-mem-ctrl" + - reg : Offset and length of the register set for this device + - interrupts : the interrupt number for this device + - interrupt-parent : the phandle for the interrupt controller + that services interrupts for this device. + + Example Discovery Memory Controller node: + mem-ctrl@1400 { + compatible = "marvell,mv64360-mem-ctrl"; + reg = <0x1400 0x60>; + interrupts = <17>; + interrupt-parent = <&PIC>; + }; + + diff --git a/Documentation/devicetree/bindings/mmc/fsl-esdhc.txt b/Documentation/devicetree/bindings/mmc/fsl-esdhc.txt new file mode 100644 index 000000000000..64bcb8be973c --- /dev/null +++ b/Documentation/devicetree/bindings/mmc/fsl-esdhc.txt @@ -0,0 +1,29 @@ +* Freescale Enhanced Secure Digital Host Controller (eSDHC) + +The Enhanced Secure Digital Host Controller provides an interface +for MMC, SD, and SDIO types of memory cards. + +Required properties: + - compatible : should be + "fsl,-esdhc", "fsl,esdhc" + - reg : should contain eSDHC registers location and length. + - interrupts : should contain eSDHC interrupt. + - interrupt-parent : interrupt source phandle. + - clock-frequency : specifies eSDHC base clock frequency. + - sdhci,wp-inverted : (optional) specifies that eSDHC controller + reports inverted write-protect state; + - sdhci,1-bit-only : (optional) specifies that a controller can + only handle 1-bit data transfers. + - sdhci,auto-cmd12: (optional) specifies that a controller can + only handle auto CMD12. + +Example: + +sdhci@2e000 { + compatible = "fsl,mpc8378-esdhc", "fsl,esdhc"; + reg = <0x2e000 0x1000>; + interrupts = <42 0x8>; + interrupt-parent = <&ipic>; + /* Filled in by U-Boot */ + clock-frequency = <0>; +}; diff --git a/Documentation/devicetree/bindings/mmc/mmc-spi-slot.txt b/Documentation/devicetree/bindings/mmc/mmc-spi-slot.txt new file mode 100644 index 000000000000..c39ac2891951 --- /dev/null +++ b/Documentation/devicetree/bindings/mmc/mmc-spi-slot.txt @@ -0,0 +1,23 @@ +MMC/SD/SDIO slot directly connected to a SPI bus + +Required properties: +- compatible : should be "mmc-spi-slot". +- reg : should specify SPI address (chip-select number). +- spi-max-frequency : maximum frequency for this device (Hz). +- voltage-ranges : two cells are required, first cell specifies minimum + slot voltage (mV), second cell specifies maximum slot voltage (mV). + Several ranges could be specified. +- gpios : (optional) may specify GPIOs in this order: Card-Detect GPIO, + Write-Protect GPIO. + +Example: + + mmc-slot@0 { + compatible = "fsl,mpc8323rdb-mmc-slot", + "mmc-spi-slot"; + reg = <0>; + gpios = <&qe_pio_d 14 1 + &qe_pio_d 15 0>; + voltage-ranges = <3300 3300>; + spi-max-frequency = <50000000>; + }; diff --git a/Documentation/devicetree/bindings/mtd/fsl-upm-nand.txt b/Documentation/devicetree/bindings/mtd/fsl-upm-nand.txt new file mode 100644 index 000000000000..a48b2cadc7f0 --- /dev/null +++ b/Documentation/devicetree/bindings/mtd/fsl-upm-nand.txt @@ -0,0 +1,63 @@ +Freescale Localbus UPM programmed to work with NAND flash + +Required properties: +- compatible : "fsl,upm-nand". +- reg : should specify localbus chip select and size used for the chip. +- fsl,upm-addr-offset : UPM pattern offset for the address latch. +- fsl,upm-cmd-offset : UPM pattern offset for the command latch. + +Optional properties: +- fsl,upm-wait-flags : add chip-dependent short delays after running the + UPM pattern (0x1), after writing a data byte (0x2) or after + writing out a buffer (0x4). +- fsl,upm-addr-line-cs-offsets : address offsets for multi-chip support. + The corresponding address lines are used to select the chip. +- gpios : may specify optional GPIOs connected to the Ready-Not-Busy pins + (R/B#). For multi-chip devices, "n" GPIO definitions are required + according to the number of chips. +- chip-delay : chip dependent delay for transfering data from array to + read registers (tR). Required if property "gpios" is not used + (R/B# pins not connected). + +Examples: + +upm@1,0 { + compatible = "fsl,upm-nand"; + reg = <1 0 1>; + fsl,upm-addr-offset = <16>; + fsl,upm-cmd-offset = <8>; + gpios = <&qe_pio_e 18 0>; + + flash { + #address-cells = <1>; + #size-cells = <1>; + compatible = "..."; + + partition@0 { + ... + }; + }; +}; + +upm@3,0 { + #address-cells = <0>; + #size-cells = <0>; + compatible = "tqc,tqm8548-upm-nand", "fsl,upm-nand"; + reg = <3 0x0 0x800>; + fsl,upm-addr-offset = <0x10>; + fsl,upm-cmd-offset = <0x08>; + /* Multi-chip NAND device */ + fsl,upm-addr-line-cs-offsets = <0x0 0x200>; + fsl,upm-wait-flags = <0x5>; + chip-delay = <25>; // in micro-seconds + + nand@0 { + #address-cells = <1>; + #size-cells = <1>; + + partition@0 { + label = "fs"; + reg = <0x00000000 0x10000000>; + }; + }; +}; diff --git a/Documentation/devicetree/bindings/mtd/mtd-physmap.txt b/Documentation/devicetree/bindings/mtd/mtd-physmap.txt new file mode 100644 index 000000000000..80152cb567d9 --- /dev/null +++ b/Documentation/devicetree/bindings/mtd/mtd-physmap.txt @@ -0,0 +1,90 @@ +CFI or JEDEC memory-mapped NOR flash, MTD-RAM (NVRAM...) + +Flash chips (Memory Technology Devices) are often used for solid state +file systems on embedded devices. + + - compatible : should contain the specific model of mtd chip(s) + used, if known, followed by either "cfi-flash", "jedec-flash" + or "mtd-ram". + - reg : Address range(s) of the mtd chip(s) + It's possible to (optionally) define multiple "reg" tuples so that + non-identical chips can be described in one node. + - bank-width : Width (in bytes) of the bank. Equal to the + device width times the number of interleaved chips. + - device-width : (optional) Width of a single mtd chip. If + omitted, assumed to be equal to 'bank-width'. + - #address-cells, #size-cells : Must be present if the device has + sub-nodes representing partitions (see below). In this case + both #address-cells and #size-cells must be equal to 1. + +For JEDEC compatible devices, the following additional properties +are defined: + + - vendor-id : Contains the flash chip's vendor id (1 byte). + - device-id : Contains the flash chip's device id (1 byte). + +In addition to the information on the mtd bank itself, the +device tree may optionally contain additional information +describing partitions of the address space. This can be +used on platforms which have strong conventions about which +portions of a flash are used for what purposes, but which don't +use an on-flash partition table such as RedBoot. + +Each partition is represented as a sub-node of the mtd device. +Each node's name represents the name of the corresponding +partition of the mtd device. + +Flash partitions + - reg : The partition's offset and size within the mtd bank. + - label : (optional) The label / name for this partition. + If omitted, the label is taken from the node name (excluding + the unit address). + - read-only : (optional) This parameter, if present, is a hint to + Linux that this partition should only be mounted + read-only. This is usually used for flash partitions + containing early-boot firmware images or data which should not + be clobbered. + +Example: + + flash@ff000000 { + compatible = "amd,am29lv128ml", "cfi-flash"; + reg = ; + bank-width = <4>; + device-width = <1>; + #address-cells = <1>; + #size-cells = <1>; + fs@0 { + label = "fs"; + reg = <0 f80000>; + }; + firmware@f80000 { + label ="firmware"; + reg = ; + read-only; + }; + }; + +Here an example with multiple "reg" tuples: + + flash@f0000000,0 { + #address-cells = <1>; + #size-cells = <1>; + compatible = "intel,PC48F4400P0VB", "cfi-flash"; + reg = <0 0x00000000 0x02000000 + 0 0x02000000 0x02000000>; + bank-width = <2>; + partition@0 { + label = "test-part1"; + reg = <0 0x04000000>; + }; + }; + +An example using SRAM: + + sram@2,0 { + compatible = "samsung,k6f1616u6a", "mtd-ram"; + reg = <2 0 0x00200000>; + bank-width = <2>; + }; + diff --git a/Documentation/devicetree/bindings/net/can/mpc5xxx-mscan.txt b/Documentation/devicetree/bindings/net/can/mpc5xxx-mscan.txt new file mode 100644 index 000000000000..2fa4fcd38fd6 --- /dev/null +++ b/Documentation/devicetree/bindings/net/can/mpc5xxx-mscan.txt @@ -0,0 +1,53 @@ +CAN Device Tree Bindings +------------------------ + +(c) 2006-2009 Secret Lab Technologies Ltd +Grant Likely + +fsl,mpc5200-mscan nodes +----------------------- +In addition to the required compatible-, reg- and interrupt-properties, you can +also specify which clock source shall be used for the controller: + +- fsl,mscan-clock-source : a string describing the clock source. Valid values + are: "ip" for ip bus clock + "ref" for reference clock (XTAL) + "ref" is default in case this property is not + present. + +fsl,mpc5121-mscan nodes +----------------------- +In addition to the required compatible-, reg- and interrupt-properties, you can +also specify which clock source and divider shall be used for the controller: + +- fsl,mscan-clock-source : a string describing the clock source. Valid values + are: "ip" for ip bus clock + "ref" for reference clock + "sys" for system clock + If this property is not present, an optimal CAN + clock source and frequency based on the system + clock will be selected. If this is not possible, + the reference clock will be used. + +- fsl,mscan-clock-divider: for the reference and system clock, an additional + clock divider can be specified. By default, a + value of 1 is used. + +Note that the MPC5121 Rev. 1 processor is not supported. + +Examples: + can@1300 { + compatible = "fsl,mpc5121-mscan"; + interrupts = <12 0x8>; + interrupt-parent = <&ipic>; + reg = <0x1300 0x80>; + }; + + can@1380 { + compatible = "fsl,mpc5121-mscan"; + interrupts = <13 0x8>; + interrupt-parent = <&ipic>; + reg = <0x1380 0x80>; + fsl,mscan-clock-source = "ref"; + fsl,mscan-clock-divider = <3>; + }; diff --git a/Documentation/devicetree/bindings/net/can/sja1000.txt b/Documentation/devicetree/bindings/net/can/sja1000.txt new file mode 100644 index 000000000000..d6d209ded937 --- /dev/null +++ b/Documentation/devicetree/bindings/net/can/sja1000.txt @@ -0,0 +1,53 @@ +Memory mapped SJA1000 CAN controller from NXP (formerly Philips) + +Required properties: + +- compatible : should be "nxp,sja1000". + +- reg : should specify the chip select, address offset and size required + to map the registers of the SJA1000. The size is usually 0x80. + +- interrupts: property with a value describing the interrupt source + (number and sensitivity) required for the SJA1000. + +Optional properties: + +- nxp,external-clock-frequency : Frequency of the external oscillator + clock in Hz. Note that the internal clock frequency used by the + SJA1000 is half of that value. If not specified, a default value + of 16000000 (16 MHz) is used. + +- nxp,tx-output-mode : operation mode of the TX output control logic: + <0x0> : bi-phase output mode + <0x1> : normal output mode (default) + <0x2> : test output mode + <0x3> : clock output mode + +- nxp,tx-output-config : TX output pin configuration: + <0x01> : TX0 invert + <0x02> : TX0 pull-down (default) + <0x04> : TX0 pull-up + <0x06> : TX0 push-pull + <0x08> : TX1 invert + <0x10> : TX1 pull-down + <0x20> : TX1 pull-up + <0x30> : TX1 push-pull + +- nxp,clock-out-frequency : clock frequency in Hz on the CLKOUT pin. + If not specified or if the specified value is 0, the CLKOUT pin + will be disabled. + +- nxp,no-comparator-bypass : Allows to disable the CAN input comperator. + +For futher information, please have a look to the SJA1000 data sheet. + +Examples: + +can@3,100 { + compatible = "nxp,sja1000"; + reg = <3 0x100 0x80>; + interrupts = <2 0>; + interrupt-parent = <&mpic>; + nxp,external-clock-frequency = <16000000>; +}; + diff --git a/Documentation/devicetree/bindings/net/fsl-tsec-phy.txt b/Documentation/devicetree/bindings/net/fsl-tsec-phy.txt new file mode 100644 index 000000000000..edb7ae19e868 --- /dev/null +++ b/Documentation/devicetree/bindings/net/fsl-tsec-phy.txt @@ -0,0 +1,76 @@ +* MDIO IO device + +The MDIO is a bus to which the PHY devices are connected. For each +device that exists on this bus, a child node should be created. See +the definition of the PHY node in booting-without-of.txt for an example +of how to define a PHY. + +Required properties: + - reg : Offset and length of the register set for the device + - compatible : Should define the compatible device type for the + mdio. Currently, this is most likely to be "fsl,gianfar-mdio" + +Example: + + mdio@24520 { + reg = <24520 20>; + compatible = "fsl,gianfar-mdio"; + + ethernet-phy@0 { + ...... + }; + }; + +* TBI Internal MDIO bus + +As of this writing, every tsec is associated with an internal TBI PHY. +This PHY is accessed through the local MDIO bus. These buses are defined +similarly to the mdio buses, except they are compatible with "fsl,gianfar-tbi". +The TBI PHYs underneath them are similar to normal PHYs, but the reg property +is considered instructive, rather than descriptive. The reg property should +be chosen so it doesn't interfere with other PHYs on the bus. + +* Gianfar-compatible ethernet nodes + +Properties: + + - device_type : Should be "network" + - model : Model of the device. Can be "TSEC", "eTSEC", or "FEC" + - compatible : Should be "gianfar" + - reg : Offset and length of the register set for the device + - local-mac-address : List of bytes representing the ethernet address of + this controller + - interrupts : For FEC devices, the first interrupt is the device's + interrupt. For TSEC and eTSEC devices, the first interrupt is + transmit, the second is receive, and the third is error. + - phy-handle : The phandle for the PHY connected to this ethernet + controller. + - fixed-link : where a is emulated phy id - choose any, + but unique to the all specified fixed-links, b is duplex - 0 half, + 1 full, c is link speed - d#10/d#100/d#1000, d is pause - 0 no + pause, 1 pause, e is asym_pause - 0 no asym_pause, 1 asym_pause. + - phy-connection-type : a string naming the controller/PHY interface type, + i.e., "mii" (default), "rmii", "gmii", "rgmii", "rgmii-id", "sgmii", + "tbi", or "rtbi". This property is only really needed if the connection + is of type "rgmii-id", as all other connection types are detected by + hardware. + - fsl,magic-packet : If present, indicates that the hardware supports + waking up via magic packet. + - bd-stash : If present, indicates that the hardware supports stashing + buffer descriptors in the L2. + - rx-stash-len : Denotes the number of bytes of a received buffer to stash + in the L2. + - rx-stash-idx : Denotes the index of the first byte from the received + buffer to stash in the L2. + +Example: + ethernet@24000 { + device_type = "network"; + model = "TSEC"; + compatible = "gianfar"; + reg = <0x24000 0x1000>; + local-mac-address = [ 00 E0 0C 00 73 00 ]; + interrupts = <29 2 30 2 34 2>; + interrupt-parent = <&mpic>; + phy-handle = <&phy0> + }; diff --git a/Documentation/devicetree/bindings/net/mdio-gpio.txt b/Documentation/devicetree/bindings/net/mdio-gpio.txt new file mode 100644 index 000000000000..bc9549529014 --- /dev/null +++ b/Documentation/devicetree/bindings/net/mdio-gpio.txt @@ -0,0 +1,19 @@ +MDIO on GPIOs + +Currently defined compatibles: +- virtual,gpio-mdio + +MDC and MDIO lines connected to GPIO controllers are listed in the +gpios property as described in section VIII.1 in the following order: + +MDC, MDIO. + +Example: + +mdio { + compatible = "virtual,mdio-gpio"; + #address-cells = <1>; + #size-cells = <0>; + gpios = <&qe_pio_a 11 + &qe_pio_c 6>; +}; diff --git a/Documentation/devicetree/bindings/net/phy.txt b/Documentation/devicetree/bindings/net/phy.txt new file mode 100644 index 000000000000..bb8c742eb8c5 --- /dev/null +++ b/Documentation/devicetree/bindings/net/phy.txt @@ -0,0 +1,25 @@ +PHY nodes + +Required properties: + + - device_type : Should be "ethernet-phy" + - interrupts : where a is the interrupt number and b is a + field that represents an encoding of the sense and level + information for the interrupt. This should be encoded based on + the information in section 2) depending on the type of interrupt + controller you have. + - interrupt-parent : the phandle for the interrupt controller that + services interrupts for this device. + - reg : The ID number for the phy, usually a small integer + - linux,phandle : phandle for this node; likely referenced by an + ethernet controller node. + +Example: + +ethernet-phy@0 { + linux,phandle = <2452000> + interrupt-parent = <40000>; + interrupts = <35 1>; + reg = <0>; + device_type = "ethernet-phy"; +}; diff --git a/Documentation/devicetree/bindings/pci/83xx-512x-pci.txt b/Documentation/devicetree/bindings/pci/83xx-512x-pci.txt new file mode 100644 index 000000000000..35a465362408 --- /dev/null +++ b/Documentation/devicetree/bindings/pci/83xx-512x-pci.txt @@ -0,0 +1,40 @@ +* Freescale 83xx and 512x PCI bridges + +Freescale 83xx and 512x SOCs include the same pci bridge core. + +83xx/512x specific notes: +- reg: should contain two address length tuples + The first is for the internal pci bridge registers + The second is for the pci config space access registers + +Example (MPC8313ERDB) + pci0: pci@e0008500 { + cell-index = <1>; + interrupt-map-mask = <0xf800 0x0 0x0 0x7>; + interrupt-map = < + /* IDSEL 0x0E -mini PCI */ + 0x7000 0x0 0x0 0x1 &ipic 18 0x8 + 0x7000 0x0 0x0 0x2 &ipic 18 0x8 + 0x7000 0x0 0x0 0x3 &ipic 18 0x8 + 0x7000 0x0 0x0 0x4 &ipic 18 0x8 + + /* IDSEL 0x0F - PCI slot */ + 0x7800 0x0 0x0 0x1 &ipic 17 0x8 + 0x7800 0x0 0x0 0x2 &ipic 18 0x8 + 0x7800 0x0 0x0 0x3 &ipic 17 0x8 + 0x7800 0x0 0x0 0x4 &ipic 18 0x8>; + interrupt-parent = <&ipic>; + interrupts = <66 0x8>; + bus-range = <0x0 0x0>; + ranges = <0x02000000 0x0 0x90000000 0x90000000 0x0 0x10000000 + 0x42000000 0x0 0x80000000 0x80000000 0x0 0x10000000 + 0x01000000 0x0 0x00000000 0xe2000000 0x0 0x00100000>; + clock-frequency = <66666666>; + #interrupt-cells = <1>; + #size-cells = <2>; + #address-cells = <3>; + reg = <0xe0008500 0x100 /* internal registers */ + 0xe0008300 0x8>; /* config space access registers */ + compatible = "fsl,mpc8349-pci"; + device_type = "pci"; + }; diff --git a/Documentation/devicetree/bindings/powerpc/4xx/cpm.txt b/Documentation/devicetree/bindings/powerpc/4xx/cpm.txt new file mode 100644 index 000000000000..ee459806d35e --- /dev/null +++ b/Documentation/devicetree/bindings/powerpc/4xx/cpm.txt @@ -0,0 +1,52 @@ +PPC4xx Clock Power Management (CPM) node + +Required properties: + - compatible : compatible list, currently only "ibm,cpm" + - dcr-access-method : "native" + - dcr-reg : < DCR register range > + +Optional properties: + - er-offset : All 4xx SoCs with a CPM controller have + one of two different order for the CPM + registers. Some have the CPM registers + in the following order (ER,FR,SR). The + others have them in the following order + (SR,ER,FR). For the second case set + er-offset = <1>. + - unused-units : specifier consist of one cell. For each + bit in the cell, the corresponding bit + in CPM will be set to turn off unused + devices. + - idle-doze : specifier consist of one cell. For each + bit in the cell, the corresponding bit + in CPM will be set to turn off unused + devices. This is usually just CPM[CPU]. + - standby : specifier consist of one cell. For each + bit in the cell, the corresponding bit + in CPM will be set on standby and + restored on resume. + - suspend : specifier consist of one cell. For each + bit in the cell, the corresponding bit + in CPM will be set on suspend (mem) and + restored on resume. Note, for standby + and suspend the corresponding bits can + be different or the same. Usually for + standby only class 2 and 3 units are set. + However, the interface does not care. + If they are the same, the additional + power saving will be seeing if support + is available to put the DDR in self + refresh mode and any additional power + saving techniques for the specific SoC. + +Example: + CPM0: cpm { + compatible = "ibm,cpm"; + dcr-access-method = "native"; + dcr-reg = <0x160 0x003>; + er-offset = <0>; + unused-units = <0x00000100>; + idle-doze = <0x02000000>; + standby = <0xfeff0000>; + suspend = <0xfeff791d>; +}; diff --git a/Documentation/devicetree/bindings/powerpc/4xx/emac.txt b/Documentation/devicetree/bindings/powerpc/4xx/emac.txt new file mode 100644 index 000000000000..2161334a7ca5 --- /dev/null +++ b/Documentation/devicetree/bindings/powerpc/4xx/emac.txt @@ -0,0 +1,148 @@ + 4xx/Axon EMAC ethernet nodes + + The EMAC ethernet controller in IBM and AMCC 4xx chips, and also + the Axon bridge. To operate this needs to interact with a ths + special McMAL DMA controller, and sometimes an RGMII or ZMII + interface. In addition to the nodes and properties described + below, the node for the OPB bus on which the EMAC sits must have a + correct clock-frequency property. + + i) The EMAC node itself + + Required properties: + - device_type : "network" + + - compatible : compatible list, contains 2 entries, first is + "ibm,emac-CHIP" where CHIP is the host ASIC (440gx, + 405gp, Axon) and second is either "ibm,emac" or + "ibm,emac4". For Axon, thus, we have: "ibm,emac-axon", + "ibm,emac4" + - interrupts : + - interrupt-parent : optional, if needed for interrupt mapping + - reg : + - local-mac-address : 6 bytes, MAC address + - mal-device : phandle of the associated McMAL node + - mal-tx-channel : 1 cell, index of the tx channel on McMAL associated + with this EMAC + - mal-rx-channel : 1 cell, index of the rx channel on McMAL associated + with this EMAC + - cell-index : 1 cell, hardware index of the EMAC cell on a given + ASIC (typically 0x0 and 0x1 for EMAC0 and EMAC1 on + each Axon chip) + - max-frame-size : 1 cell, maximum frame size supported in bytes + - rx-fifo-size : 1 cell, Rx fifo size in bytes for 10 and 100 Mb/sec + operations. + For Axon, 2048 + - tx-fifo-size : 1 cell, Tx fifo size in bytes for 10 and 100 Mb/sec + operations. + For Axon, 2048. + - fifo-entry-size : 1 cell, size of a fifo entry (used to calculate + thresholds). + For Axon, 0x00000010 + - mal-burst-size : 1 cell, MAL burst size (used to calculate thresholds) + in bytes. + For Axon, 0x00000100 (I think ...) + - phy-mode : string, mode of operations of the PHY interface. + Supported values are: "mii", "rmii", "smii", "rgmii", + "tbi", "gmii", rtbi", "sgmii". + For Axon on CAB, it is "rgmii" + - mdio-device : 1 cell, required iff using shared MDIO registers + (440EP). phandle of the EMAC to use to drive the + MDIO lines for the PHY used by this EMAC. + - zmii-device : 1 cell, required iff connected to a ZMII. phandle of + the ZMII device node + - zmii-channel : 1 cell, required iff connected to a ZMII. Which ZMII + channel or 0xffffffff if ZMII is only used for MDIO. + - rgmii-device : 1 cell, required iff connected to an RGMII. phandle + of the RGMII device node. + For Axon: phandle of plb5/plb4/opb/rgmii + - rgmii-channel : 1 cell, required iff connected to an RGMII. Which + RGMII channel is used by this EMAC. + Fox Axon: present, whatever value is appropriate for each + EMAC, that is the content of the current (bogus) "phy-port" + property. + + Optional properties: + - phy-address : 1 cell, optional, MDIO address of the PHY. If absent, + a search is performed. + - phy-map : 1 cell, optional, bitmap of addresses to probe the PHY + for, used if phy-address is absent. bit 0x00000001 is + MDIO address 0. + For Axon it can be absent, though my current driver + doesn't handle phy-address yet so for now, keep + 0x00ffffff in it. + - rx-fifo-size-gige : 1 cell, Rx fifo size in bytes for 1000 Mb/sec + operations (if absent the value is the same as + rx-fifo-size). For Axon, either absent or 2048. + - tx-fifo-size-gige : 1 cell, Tx fifo size in bytes for 1000 Mb/sec + operations (if absent the value is the same as + tx-fifo-size). For Axon, either absent or 2048. + - tah-device : 1 cell, optional. If connected to a TAH engine for + offload, phandle of the TAH device node. + - tah-channel : 1 cell, optional. If appropriate, channel used on the + TAH engine. + + Example: + + EMAC0: ethernet@40000800 { + device_type = "network"; + compatible = "ibm,emac-440gp", "ibm,emac"; + interrupt-parent = <&UIC1>; + interrupts = <1c 4 1d 4>; + reg = <40000800 70>; + local-mac-address = [00 04 AC E3 1B 1E]; + mal-device = <&MAL0>; + mal-tx-channel = <0 1>; + mal-rx-channel = <0>; + cell-index = <0>; + max-frame-size = <5dc>; + rx-fifo-size = <1000>; + tx-fifo-size = <800>; + phy-mode = "rmii"; + phy-map = <00000001>; + zmii-device = <&ZMII0>; + zmii-channel = <0>; + }; + + ii) McMAL node + + Required properties: + - device_type : "dma-controller" + - compatible : compatible list, containing 2 entries, first is + "ibm,mcmal-CHIP" where CHIP is the host ASIC (like + emac) and the second is either "ibm,mcmal" or + "ibm,mcmal2". + For Axon, "ibm,mcmal-axon","ibm,mcmal2" + - interrupts : . + For Axon: This is _different_ from the current + firmware. We use the "delayed" interrupts for txeob + and rxeob. Thus we end up with mapping those 5 MPIC + interrupts, all level positive sensitive: 10, 11, 32, + 33, 34 (in decimal) + - dcr-reg : < DCR registers range > + - dcr-parent : if needed for dcr-reg + - num-tx-chans : 1 cell, number of Tx channels + - num-rx-chans : 1 cell, number of Rx channels + + iii) ZMII node + + Required properties: + - compatible : compatible list, containing 2 entries, first is + "ibm,zmii-CHIP" where CHIP is the host ASIC (like + EMAC) and the second is "ibm,zmii". + For Axon, there is no ZMII node. + - reg : + + iv) RGMII node + + Required properties: + - compatible : compatible list, containing 2 entries, first is + "ibm,rgmii-CHIP" where CHIP is the host ASIC (like + EMAC) and the second is "ibm,rgmii". + For Axon, "ibm,rgmii-axon","ibm,rgmii" + - reg : + - revision : as provided by the RGMII new version register if + available. + For Axon: 0x0000012a + diff --git a/Documentation/devicetree/bindings/powerpc/4xx/ndfc.txt b/Documentation/devicetree/bindings/powerpc/4xx/ndfc.txt new file mode 100644 index 000000000000..869f0b5f16e8 --- /dev/null +++ b/Documentation/devicetree/bindings/powerpc/4xx/ndfc.txt @@ -0,0 +1,39 @@ +AMCC NDFC (NanD Flash Controller) + +Required properties: +- compatible : "ibm,ndfc". +- reg : should specify chip select and size used for the chip (0x2000). + +Optional properties: +- ccr : NDFC config and control register value (default 0). +- bank-settings : NDFC bank configuration register value (default 0). + +Notes: +- partition(s) - follows the OF MTD standard for partitions + +Example: + +ndfc@1,0 { + compatible = "ibm,ndfc"; + reg = <0x00000001 0x00000000 0x00002000>; + ccr = <0x00001000>; + bank-settings = <0x80002222>; + #address-cells = <1>; + #size-cells = <1>; + + nand { + #address-cells = <1>; + #size-cells = <1>; + + partition@0 { + label = "kernel"; + reg = <0x00000000 0x00200000>; + }; + partition@200000 { + label = "root"; + reg = <0x00200000 0x03E00000>; + }; + }; +}; + + diff --git a/Documentation/devicetree/bindings/powerpc/4xx/ppc440spe-adma.txt b/Documentation/devicetree/bindings/powerpc/4xx/ppc440spe-adma.txt new file mode 100644 index 000000000000..515ebcf1b97d --- /dev/null +++ b/Documentation/devicetree/bindings/powerpc/4xx/ppc440spe-adma.txt @@ -0,0 +1,93 @@ +PPC440SPe DMA/XOR (DMA Controller and XOR Accelerator) + +Device nodes needed for operation of the ppc440spe-adma driver +are specified hereby. These are I2O/DMA, DMA and XOR nodes +for DMA engines and Memory Queue Module node. The latter is used +by ADMA driver for configuration of RAID-6 H/W capabilities of +the PPC440SPe. In addition to the nodes and properties described +below, the ranges property of PLB node must specify ranges for +DMA devices. + + i) The I2O node + + Required properties: + + - compatible : "ibm,i2o-440spe"; + - reg : + - dcr-reg : + + Example: + + I2O: i2o@400100000 { + compatible = "ibm,i2o-440spe"; + reg = <0x00000004 0x00100000 0x100>; + dcr-reg = <0x060 0x020>; + }; + + + ii) The DMA node + + Required properties: + + - compatible : "ibm,dma-440spe"; + - cell-index : 1 cell, hardware index of the DMA engine + (typically 0x0 and 0x1 for DMA0 and DMA1) + - reg : + - dcr-reg : + - interrupts : . + - interrupt-parent : needed for interrupt mapping + + Example: + + DMA0: dma0@400100100 { + compatible = "ibm,dma-440spe"; + cell-index = <0>; + reg = <0x00000004 0x00100100 0x100>; + dcr-reg = <0x060 0x020>; + interrupt-parent = <&DMA0>; + interrupts = <0 1>; + #interrupt-cells = <1>; + #address-cells = <0>; + #size-cells = <0>; + interrupt-map = < + 0 &UIC0 0x14 4 + 1 &UIC1 0x16 4>; + }; + + + iii) XOR Accelerator node + + Required properties: + + - compatible : "amcc,xor-accelerator"; + - reg : + - interrupts : + - interrupt-parent : for interrupt mapping + + Example: + + xor-accel@400200000 { + compatible = "amcc,xor-accelerator"; + reg = <0x00000004 0x00200000 0x400>; + interrupt-parent = <&UIC1>; + interrupts = <0x1f 4>; + }; + + + iv) Memory Queue Module node + + Required properties: + + - compatible : "ibm,mq-440spe"; + - dcr-reg : + + Example: + + MQ0: mq { + compatible = "ibm,mq-440spe"; + dcr-reg = <0x040 0x020>; + }; + diff --git a/Documentation/devicetree/bindings/powerpc/4xx/reboot.txt b/Documentation/devicetree/bindings/powerpc/4xx/reboot.txt new file mode 100644 index 000000000000..d7217260589c --- /dev/null +++ b/Documentation/devicetree/bindings/powerpc/4xx/reboot.txt @@ -0,0 +1,18 @@ +Reboot property to control system reboot on PPC4xx systems: + +By setting "reset_type" to one of the following values, the default +software reset mechanism may be overidden. Here the possible values of +"reset_type": + + 1 - PPC4xx core reset + 2 - PPC4xx chip reset + 3 - PPC4xx system reset (default) + +Example: + + cpu@0 { + device_type = "cpu"; + model = "PowerPC,440SPe"; + ... + reset-type = <2>; /* Use chip-reset */ + }; diff --git a/Documentation/devicetree/bindings/powerpc/fsl/board.txt b/Documentation/devicetree/bindings/powerpc/fsl/board.txt new file mode 100644 index 000000000000..39e941515a36 --- /dev/null +++ b/Documentation/devicetree/bindings/powerpc/fsl/board.txt @@ -0,0 +1,63 @@ +* Board Control and Status (BCSR) + +Required properties: + + - compatible : Should be "fsl,-bcsr" + - reg : Offset and length of the register set for the device + +Example: + + bcsr@f8000000 { + compatible = "fsl,mpc8360mds-bcsr"; + reg = ; + }; + +* Freescale on board FPGA + +This is the memory-mapped registers for on board FPGA. + +Required properities: +- compatible : should be "fsl,fpga-pixis". +- reg : should contain the address and the length of the FPPGA register + set. +- interrupt-parent: should specify phandle for the interrupt controller. +- interrupts : should specify event (wakeup) IRQ. + +Example (MPC8610HPCD): + + board-control@e8000000 { + compatible = "fsl,fpga-pixis"; + reg = <0xe8000000 32>; + interrupt-parent = <&mpic>; + interrupts = <8 8>; + }; + +* Freescale BCSR GPIO banks + +Some BCSR registers act as simple GPIO controllers, each such +register can be represented by the gpio-controller node. + +Required properities: +- compatible : Should be "fsl,-bcsr-gpio". +- reg : Should contain the address and the length of the GPIO bank + register. +- #gpio-cells : Should be two. The first cell is the pin number and the + second cell is used to specify optional parameters (currently unused). +- gpio-controller : Marks the port as GPIO controller. + +Example: + + bcsr@1,0 { + #address-cells = <1>; + #size-cells = <1>; + compatible = "fsl,mpc8360mds-bcsr"; + reg = <1 0 0x8000>; + ranges = <0 1 0 0x8000>; + + bcsr13: gpio-controller@d { + #gpio-cells = <2>; + compatible = "fsl,mpc8360mds-bcsr-gpio"; + reg = <0xd 1>; + gpio-controller; + }; + }; diff --git a/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/cpm.txt b/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/cpm.txt new file mode 100644 index 000000000000..160c752484b4 --- /dev/null +++ b/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/cpm.txt @@ -0,0 +1,67 @@ +* Freescale Communications Processor Module + +NOTE: This is an interim binding, and will likely change slightly, +as more devices are supported. The QE bindings especially are +incomplete. + +* Root CPM node + +Properties: +- compatible : "fsl,cpm1", "fsl,cpm2", or "fsl,qe". +- reg : A 48-byte region beginning with CPCR. + +Example: + cpm@119c0 { + #address-cells = <1>; + #size-cells = <1>; + #interrupt-cells = <2>; + compatible = "fsl,mpc8272-cpm", "fsl,cpm2"; + reg = <119c0 30>; + } + +* Properties common to multiple CPM/QE devices + +- fsl,cpm-command : This value is ORed with the opcode and command flag + to specify the device on which a CPM command operates. + +- fsl,cpm-brg : Indicates which baud rate generator the device + is associated with. If absent, an unused BRG + should be dynamically allocated. If zero, the + device uses an external clock rather than a BRG. + +- reg : Unless otherwise specified, the first resource represents the + scc/fcc/ucc registers, and the second represents the device's + parameter RAM region (if it has one). + +* Multi-User RAM (MURAM) + +The multi-user/dual-ported RAM is expressed as a bus under the CPM node. + +Ranges must be set up subject to the following restrictions: + +- Children's reg nodes must be offsets from the start of all muram, even + if the user-data area does not begin at zero. +- If multiple range entries are used, the difference between the parent + address and the child address must be the same in all, so that a single + mapping can cover them all while maintaining the ability to determine + CPM-side offsets with pointer subtraction. It is recommended that + multiple range entries not be used. +- A child address of zero must be translatable, even if no reg resources + contain it. + +A child "data" node must exist, compatible with "fsl,cpm-muram-data", to +indicate the portion of muram that is usable by the OS for arbitrary +purposes. The data node may have an arbitrary number of reg resources, +all of which contribute to the allocatable muram pool. + +Example, based on mpc8272: + muram@0 { + #address-cells = <1>; + #size-cells = <1>; + ranges = <0 0 10000>; + + data@0 { + compatible = "fsl,cpm-muram-data"; + reg = <0 2000 9800 800>; + }; + }; diff --git a/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/cpm/brg.txt b/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/cpm/brg.txt new file mode 100644 index 000000000000..4c7d45eaf025 --- /dev/null +++ b/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/cpm/brg.txt @@ -0,0 +1,21 @@ +* Baud Rate Generators + +Currently defined compatibles: +fsl,cpm-brg +fsl,cpm1-brg +fsl,cpm2-brg + +Properties: +- reg : There may be an arbitrary number of reg resources; BRG + numbers are assigned to these in order. +- clock-frequency : Specifies the base frequency driving + the BRG. + +Example: + brg@119f0 { + compatible = "fsl,mpc8272-brg", + "fsl,cpm2-brg", + "fsl,cpm-brg"; + reg = <119f0 10 115f0 10>; + clock-frequency = ; + }; diff --git a/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/cpm/i2c.txt b/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/cpm/i2c.txt new file mode 100644 index 000000000000..87bc6048667e --- /dev/null +++ b/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/cpm/i2c.txt @@ -0,0 +1,41 @@ +* I2C + +The I2C controller is expressed as a bus under the CPM node. + +Properties: +- compatible : "fsl,cpm1-i2c", "fsl,cpm2-i2c" +- reg : On CPM2 devices, the second resource doesn't specify the I2C + Parameter RAM itself, but the I2C_BASE field of the CPM2 Parameter RAM + (typically 0x8afc 0x2). +- #address-cells : Should be one. The cell is the i2c device address with + the r/w bit set to zero. +- #size-cells : Should be zero. +- clock-frequency : Can be used to set the i2c clock frequency. If + unspecified, a default frequency of 60kHz is being used. +The following two properties are deprecated. They are only used by legacy +i2c drivers to find the bus to probe: +- linux,i2c-index : Can be used to hard code an i2c bus number. By default, + the bus number is dynamically assigned by the i2c core. +- linux,i2c-class : Can be used to override the i2c class. The class is used + by legacy i2c device drivers to find a bus in a specific context like + system management, video or sound. By default, I2C_CLASS_HWMON (1) is + being used. The definition of the classes can be found in + include/i2c/i2c.h + +Example, based on mpc823: + + i2c@860 { + compatible = "fsl,mpc823-i2c", + "fsl,cpm1-i2c"; + reg = <0x860 0x20 0x3c80 0x30>; + interrupts = <16>; + interrupt-parent = <&CPM_PIC>; + fsl,cpm-command = <0x10>; + #address-cells = <1>; + #size-cells = <0>; + + rtc@68 { + compatible = "dallas,ds1307"; + reg = <0x68>; + }; + }; diff --git a/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/cpm/pic.txt b/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/cpm/pic.txt new file mode 100644 index 000000000000..8e3ee1681618 --- /dev/null +++ b/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/cpm/pic.txt @@ -0,0 +1,18 @@ +* Interrupt Controllers + +Currently defined compatibles: +- fsl,cpm1-pic + - only one interrupt cell +- fsl,pq1-pic +- fsl,cpm2-pic + - second interrupt cell is level/sense: + - 2 is falling edge + - 8 is active low + +Example: + interrupt-controller@10c00 { + #interrupt-cells = <2>; + interrupt-controller; + reg = <10c00 80>; + compatible = "mpc8272-pic", "fsl,cpm2-pic"; + }; diff --git a/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/cpm/usb.txt b/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/cpm/usb.txt new file mode 100644 index 000000000000..74bfda4bb824 --- /dev/null +++ b/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/cpm/usb.txt @@ -0,0 +1,15 @@ +* USB (Universal Serial Bus Controller) + +Properties: +- compatible : "fsl,cpm1-usb", "fsl,cpm2-usb", "fsl,qe-usb" + +Example: + usb@11bc0 { + #address-cells = <1>; + #size-cells = <0>; + compatible = "fsl,cpm2-usb"; + reg = <11b60 18 8b00 100>; + interrupts = ; + interrupt-parent = <&PIC>; + fsl,cpm-command = <2e600000>; + }; diff --git a/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/gpio.txt b/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/gpio.txt new file mode 100644 index 000000000000..349f79fd7076 --- /dev/null +++ b/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/gpio.txt @@ -0,0 +1,38 @@ +Every GPIO controller node must have #gpio-cells property defined, +this information will be used to translate gpio-specifiers. + +On CPM1 devices, all ports are using slightly different register layouts. +Ports A, C and D are 16bit ports and Ports B and E are 32bit ports. + +On CPM2 devices, all ports are 32bit ports and use a common register layout. + +Required properties: +- compatible : "fsl,cpm1-pario-bank-a", "fsl,cpm1-pario-bank-b", + "fsl,cpm1-pario-bank-c", "fsl,cpm1-pario-bank-d", + "fsl,cpm1-pario-bank-e", "fsl,cpm2-pario-bank" +- #gpio-cells : Should be two. The first cell is the pin number and the + second cell is used to specify optional parameters (currently unused). +- gpio-controller : Marks the port as GPIO controller. + +Example of three SOC GPIO banks defined as gpio-controller nodes: + + CPM1_PIO_A: gpio-controller@950 { + #gpio-cells = <2>; + compatible = "fsl,cpm1-pario-bank-a"; + reg = <0x950 0x10>; + gpio-controller; + }; + + CPM1_PIO_B: gpio-controller@ab8 { + #gpio-cells = <2>; + compatible = "fsl,cpm1-pario-bank-b"; + reg = <0xab8 0x10>; + gpio-controller; + }; + + CPM1_PIO_E: gpio-controller@ac8 { + #gpio-cells = <2>; + compatible = "fsl,cpm1-pario-bank-e"; + reg = <0xac8 0x18>; + gpio-controller; + }; diff --git a/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/network.txt b/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/network.txt new file mode 100644 index 000000000000..0e4269446580 --- /dev/null +++ b/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/network.txt @@ -0,0 +1,45 @@ +* Network + +Currently defined compatibles: +- fsl,cpm1-scc-enet +- fsl,cpm2-scc-enet +- fsl,cpm1-fec-enet +- fsl,cpm2-fcc-enet (third resource is GFEMR) +- fsl,qe-enet + +Example: + + ethernet@11300 { + device_type = "network"; + compatible = "fsl,mpc8272-fcc-enet", + "fsl,cpm2-fcc-enet"; + reg = <11300 20 8400 100 11390 1>; + local-mac-address = [ 00 00 00 00 00 00 ]; + interrupts = <20 8>; + interrupt-parent = <&PIC>; + phy-handle = <&PHY0>; + fsl,cpm-command = <12000300>; + }; + +* MDIO + +Currently defined compatibles: +fsl,pq1-fec-mdio (reg is same as first resource of FEC device) +fsl,cpm2-mdio-bitbang (reg is port C registers) + +Properties for fsl,cpm2-mdio-bitbang: +fsl,mdio-pin : pin of port C controlling mdio data +fsl,mdc-pin : pin of port C controlling mdio clock + +Example: + mdio@10d40 { + device_type = "mdio"; + compatible = "fsl,mpc8272ads-mdio-bitbang", + "fsl,mpc8272-mdio-bitbang", + "fsl,cpm2-mdio-bitbang"; + reg = <10d40 14>; + #address-cells = <1>; + #size-cells = <0>; + fsl,mdio-pin = <12>; + fsl,mdc-pin = <13>; + }; diff --git a/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/qe.txt b/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/qe.txt new file mode 100644 index 000000000000..4f8930263dd9 --- /dev/null +++ b/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/qe.txt @@ -0,0 +1,115 @@ +* Freescale QUICC Engine module (QE) +This represents qe module that is installed on PowerQUICC II Pro. + +NOTE: This is an interim binding; it should be updated to fit +in with the CPM binding later in this document. + +Basically, it is a bus of devices, that could act more or less +as a complete entity (UCC, USB etc ). All of them should be siblings on +the "root" qe node, using the common properties from there. +The description below applies to the qe of MPC8360 and +more nodes and properties would be extended in the future. + +i) Root QE device + +Required properties: +- compatible : should be "fsl,qe"; +- model : precise model of the QE, Can be "QE", "CPM", or "CPM2" +- reg : offset and length of the device registers. +- bus-frequency : the clock frequency for QUICC Engine. +- fsl,qe-num-riscs: define how many RISC engines the QE has. +- fsl,qe-num-snums: define how many serial number(SNUM) the QE can use for the + threads. + +Optional properties: +- fsl,firmware-phandle: + Usage: required only if there is no fsl,qe-firmware child node + Value type: + Definition: Points to a firmware node (see "QE Firmware Node" below) + that contains the firmware that should be uploaded for this QE. + The compatible property for the firmware node should say, + "fsl,qe-firmware". + +Recommended properties +- brg-frequency : the internal clock source frequency for baud-rate + generators in Hz. + +Example: + qe@e0100000 { + #address-cells = <1>; + #size-cells = <1>; + #interrupt-cells = <2>; + compatible = "fsl,qe"; + ranges = <0 e0100000 00100000>; + reg = ; + brg-frequency = <0>; + bus-frequency = <179A7B00>; + } + +* Multi-User RAM (MURAM) + +Required properties: +- compatible : should be "fsl,qe-muram", "fsl,cpm-muram". +- mode : the could be "host" or "slave". +- ranges : Should be defined as specified in 1) to describe the + translation of MURAM addresses. +- data-only : sub-node which defines the address area under MURAM + bus that can be allocated as data/parameter + +Example: + + muram@10000 { + compatible = "fsl,qe-muram", "fsl,cpm-muram"; + ranges = <0 00010000 0000c000>; + + data-only@0{ + compatible = "fsl,qe-muram-data", + "fsl,cpm-muram-data"; + reg = <0 c000>; + }; + }; + +* QE Firmware Node + +This node defines a firmware binary that is embedded in the device tree, for +the purpose of passing the firmware from bootloader to the kernel, or from +the hypervisor to the guest. + +The firmware node itself contains the firmware binary contents, a compatible +property, and any firmware-specific properties. The node should be placed +inside a QE node that needs it. Doing so eliminates the need for a +fsl,firmware-phandle property. Other QE nodes that need the same firmware +should define an fsl,firmware-phandle property that points to the firmware node +in the first QE node. + +The fsl,firmware property can be specified in the DTS (possibly using incbin) +or can be inserted by the boot loader at boot time. + +Required properties: + - compatible + Usage: required + Value type: + Definition: A standard property. Specify a string that indicates what + kind of firmware it is. For QE, this should be "fsl,qe-firmware". + + - fsl,firmware + Usage: required + Value type: , encoded as an array of bytes + Definition: A standard property. This property contains the firmware + binary "blob". + +Example: + qe1@e0080000 { + compatible = "fsl,qe"; + qe_firmware:qe-firmware { + compatible = "fsl,qe-firmware"; + fsl,firmware = [0x70 0xcd 0x00 0x00 0x01 0x46 0x45 ...]; + }; + ... + }; + + qe2@e0090000 { + compatible = "fsl,qe"; + fsl,firmware-phandle = <&qe_firmware>; + ... + }; diff --git a/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/qe/firmware.txt b/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/qe/firmware.txt new file mode 100644 index 000000000000..249db3a15d15 --- /dev/null +++ b/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/qe/firmware.txt @@ -0,0 +1,24 @@ +* Uploaded QE firmware + + If a new firmware has been uploaded to the QE (usually by the + boot loader), then a 'firmware' child node should be added to the QE + node. This node provides information on the uploaded firmware that + device drivers may need. + + Required properties: + - id: The string name of the firmware. This is taken from the 'id' + member of the qe_firmware structure of the uploaded firmware. + Device drivers can search this string to determine if the + firmware they want is already present. + - extended-modes: The Extended Modes bitfield, taken from the + firmware binary. It is a 64-bit number represented + as an array of two 32-bit numbers. + - virtual-traps: The virtual traps, taken from the firmware binary. + It is an array of 8 32-bit numbers. + +Example: + firmware { + id = "Soft-UART"; + extended-modes = <0 0>; + virtual-traps = <0 0 0 0 0 0 0 0>; + }; diff --git a/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/qe/par_io.txt b/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/qe/par_io.txt new file mode 100644 index 000000000000..60984260207b --- /dev/null +++ b/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/qe/par_io.txt @@ -0,0 +1,51 @@ +* Parallel I/O Ports + +This node configures Parallel I/O ports for CPUs with QE support. +The node should reside in the "soc" node of the tree. For each +device that using parallel I/O ports, a child node should be created. +See the definition of the Pin configuration nodes below for more +information. + +Required properties: +- device_type : should be "par_io". +- reg : offset to the register set and its length. +- num-ports : number of Parallel I/O ports + +Example: +par_io@1400 { + reg = <1400 100>; + #address-cells = <1>; + #size-cells = <0>; + device_type = "par_io"; + num-ports = <7>; + ucc_pin@01 { + ...... + }; + +Note that "par_io" nodes are obsolete, and should not be used for +the new device trees. Instead, each Par I/O bank should be represented +via its own gpio-controller node: + +Required properties: +- #gpio-cells : should be "2". +- compatible : should be "fsl,-qe-pario-bank", + "fsl,mpc8323-qe-pario-bank". +- reg : offset to the register set and its length. +- gpio-controller : node to identify gpio controllers. + +Example: + qe_pio_a: gpio-controller@1400 { + #gpio-cells = <2>; + compatible = "fsl,mpc8360-qe-pario-bank", + "fsl,mpc8323-qe-pario-bank"; + reg = <0x1400 0x18>; + gpio-controller; + }; + + qe_pio_e: gpio-controller@1460 { + #gpio-cells = <2>; + compatible = "fsl,mpc8360-qe-pario-bank", + "fsl,mpc8323-qe-pario-bank"; + reg = <0x1460 0x18>; + gpio-controller; + }; diff --git a/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/qe/pincfg.txt b/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/qe/pincfg.txt new file mode 100644 index 000000000000..c5b43061db3a --- /dev/null +++ b/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/qe/pincfg.txt @@ -0,0 +1,60 @@ +* Pin configuration nodes + +Required properties: +- linux,phandle : phandle of this node; likely referenced by a QE + device. +- pio-map : array of pin configurations. Each pin is defined by 6 + integers. The six numbers are respectively: port, pin, dir, + open_drain, assignment, has_irq. + - port : port number of the pin; 0-6 represent port A-G in UM. + - pin : pin number in the port. + - dir : direction of the pin, should encode as follows: + + 0 = The pin is disabled + 1 = The pin is an output + 2 = The pin is an input + 3 = The pin is I/O + + - open_drain : indicates the pin is normal or wired-OR: + + 0 = The pin is actively driven as an output + 1 = The pin is an open-drain driver. As an output, the pin is + driven active-low, otherwise it is three-stated. + + - assignment : function number of the pin according to the Pin Assignment + tables in User Manual. Each pin can have up to 4 possible functions in + QE and two options for CPM. + - has_irq : indicates if the pin is used as source of external + interrupts. + +Example: + ucc_pin@01 { + linux,phandle = <140001>; + pio-map = < + /* port pin dir open_drain assignment has_irq */ + 0 3 1 0 1 0 /* TxD0 */ + 0 4 1 0 1 0 /* TxD1 */ + 0 5 1 0 1 0 /* TxD2 */ + 0 6 1 0 1 0 /* TxD3 */ + 1 6 1 0 3 0 /* TxD4 */ + 1 7 1 0 1 0 /* TxD5 */ + 1 9 1 0 2 0 /* TxD6 */ + 1 a 1 0 2 0 /* TxD7 */ + 0 9 2 0 1 0 /* RxD0 */ + 0 a 2 0 1 0 /* RxD1 */ + 0 b 2 0 1 0 /* RxD2 */ + 0 c 2 0 1 0 /* RxD3 */ + 0 d 2 0 1 0 /* RxD4 */ + 1 1 2 0 2 0 /* RxD5 */ + 1 0 2 0 2 0 /* RxD6 */ + 1 4 2 0 2 0 /* RxD7 */ + 0 7 1 0 1 0 /* TX_EN */ + 0 8 1 0 1 0 /* TX_ER */ + 0 f 2 0 1 0 /* RX_DV */ + 0 10 2 0 1 0 /* RX_ER */ + 0 0 2 0 1 0 /* RX_CLK */ + 2 9 1 0 3 0 /* GTX_CLK - CLK10 */ + 2 8 2 0 1 0>; /* GTX125 - CLK9 */ + }; + + diff --git a/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/qe/ucc.txt b/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/qe/ucc.txt new file mode 100644 index 000000000000..e47734bee3f0 --- /dev/null +++ b/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/qe/ucc.txt @@ -0,0 +1,70 @@ +* UCC (Unified Communications Controllers) + +Required properties: +- device_type : should be "network", "hldc", "uart", "transparent" + "bisync", "atm", or "serial". +- compatible : could be "ucc_geth" or "fsl_atm" and so on. +- cell-index : the ucc number(1-8), corresponding to UCCx in UM. +- reg : Offset and length of the register set for the device +- interrupts : where a is the interrupt number and b is a + field that represents an encoding of the sense and level + information for the interrupt. This should be encoded based on + the information in section 2) depending on the type of interrupt + controller you have. +- interrupt-parent : the phandle for the interrupt controller that + services interrupts for this device. +- pio-handle : The phandle for the Parallel I/O port configuration. +- port-number : for UART drivers, the port number to use, between 0 and 3. + This usually corresponds to the /dev/ttyQE device, e.g. <0> = /dev/ttyQE0. + The port number is added to the minor number of the device. Unlike the + CPM UART driver, the port-number is required for the QE UART driver. +- soft-uart : for UART drivers, if specified this means the QE UART device + driver should use "Soft-UART" mode, which is needed on some SOCs that have + broken UART hardware. Soft-UART is provided via a microcode upload. +- rx-clock-name: the UCC receive clock source + "none": clock source is disabled + "brg1" through "brg16": clock source is BRG1-BRG16, respectively + "clk1" through "clk24": clock source is CLK1-CLK24, respectively +- tx-clock-name: the UCC transmit clock source + "none": clock source is disabled + "brg1" through "brg16": clock source is BRG1-BRG16, respectively + "clk1" through "clk24": clock source is CLK1-CLK24, respectively +The following two properties are deprecated. rx-clock has been replaced +with rx-clock-name, and tx-clock has been replaced with tx-clock-name. +Drivers that currently use the deprecated properties should continue to +do so, in order to support older device trees, but they should be updated +to check for the new properties first. +- rx-clock : represents the UCC receive clock source. + 0x00 : clock source is disabled; + 0x1~0x10 : clock source is BRG1~BRG16 respectively; + 0x11~0x28: clock source is QE_CLK1~QE_CLK24 respectively. +- tx-clock: represents the UCC transmit clock source; + 0x00 : clock source is disabled; + 0x1~0x10 : clock source is BRG1~BRG16 respectively; + 0x11~0x28: clock source is QE_CLK1~QE_CLK24 respectively. + +Required properties for network device_type: +- mac-address : list of bytes representing the ethernet address. +- phy-handle : The phandle for the PHY connected to this controller. + +Recommended properties: +- phy-connection-type : a string naming the controller/PHY interface type, + i.e., "mii" (default), "rmii", "gmii", "rgmii", "rgmii-id" (Internal + Delay), "rgmii-txid" (delay on TX only), "rgmii-rxid" (delay on RX only), + "tbi", or "rtbi". + +Example: + ucc@2000 { + device_type = "network"; + compatible = "ucc_geth"; + cell-index = <1>; + reg = <2000 200>; + interrupts = ; + interrupt-parent = <700>; + mac-address = [ 00 04 9f 00 23 23 ]; + rx-clock = "none"; + tx-clock = "clk9"; + phy-handle = <212000>; + phy-connection-type = "gmii"; + pio-handle = <140001>; + }; diff --git a/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/qe/usb.txt b/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/qe/usb.txt new file mode 100644 index 000000000000..9ccd5f30405b --- /dev/null +++ b/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/qe/usb.txt @@ -0,0 +1,37 @@ +Freescale QUICC Engine USB Controller + +Required properties: +- compatible : should be "fsl,-qe-usb", "fsl,mpc8323-qe-usb". +- reg : the first two cells should contain usb registers location and + length, the next two two cells should contain PRAM location and + length. +- interrupts : should contain USB interrupt. +- interrupt-parent : interrupt source phandle. +- fsl,fullspeed-clock : specifies the full speed USB clock source: + "none": clock source is disabled + "brg1" through "brg16": clock source is BRG1-BRG16, respectively + "clk1" through "clk24": clock source is CLK1-CLK24, respectively +- fsl,lowspeed-clock : specifies the low speed USB clock source: + "none": clock source is disabled + "brg1" through "brg16": clock source is BRG1-BRG16, respectively + "clk1" through "clk24": clock source is CLK1-CLK24, respectively +- hub-power-budget : USB power budget for the root hub, in mA. +- gpios : should specify GPIOs in this order: USBOE, USBTP, USBTN, USBRP, + USBRN, SPEED (optional), and POWER (optional). + +Example: + +usb@6c0 { + compatible = "fsl,mpc8360-qe-usb", "fsl,mpc8323-qe-usb"; + reg = <0x6c0 0x40 0x8b00 0x100>; + interrupts = <11>; + interrupt-parent = <&qeic>; + fsl,fullspeed-clock = "clk21"; + gpios = <&qe_pio_b 2 0 /* USBOE */ + &qe_pio_b 3 0 /* USBTP */ + &qe_pio_b 8 0 /* USBTN */ + &qe_pio_b 9 0 /* USBRP */ + &qe_pio_b 11 0 /* USBRN */ + &qe_pio_e 20 0 /* SPEED */ + &qe_pio_e 21 0 /* POWER */>; +}; diff --git a/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/serial.txt b/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/serial.txt new file mode 100644 index 000000000000..2ea76d9d137c --- /dev/null +++ b/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe/serial.txt @@ -0,0 +1,32 @@ +* Serial + +Currently defined compatibles: +- fsl,cpm1-smc-uart +- fsl,cpm2-smc-uart +- fsl,cpm1-scc-uart +- fsl,cpm2-scc-uart +- fsl,qe-uart + +Modem control lines connected to GPIO controllers are listed in the gpios +property as described in booting-without-of.txt, section IX.1 in the following +order: + +CTS, RTS, DCD, DSR, DTR, and RI. + +The gpios property is optional and can be left out when control lines are +not used. + +Example: + + serial@11a00 { + device_type = "serial"; + compatible = "fsl,mpc8272-scc-uart", + "fsl,cpm2-scc-uart"; + reg = <11a00 20 8000 100>; + interrupts = <28 8>; + interrupt-parent = <&PIC>; + fsl,cpm-brg = <1>; + fsl,cpm-command = <00800000>; + gpios = <&gpio_c 15 0 + &gpio_d 29 0>; + }; diff --git a/Documentation/devicetree/bindings/powerpc/fsl/diu.txt b/Documentation/devicetree/bindings/powerpc/fsl/diu.txt new file mode 100644 index 000000000000..b66cb6d31d69 --- /dev/null +++ b/Documentation/devicetree/bindings/powerpc/fsl/diu.txt @@ -0,0 +1,34 @@ +* Freescale Display Interface Unit + +The Freescale DIU is a LCD controller, with proper hardware, it can also +drive DVI monitors. + +Required properties: +- compatible : should be "fsl,diu" or "fsl,mpc5121-diu". +- reg : should contain at least address and length of the DIU register + set. +- interrupts : one DIU interrupt should be described here. +- interrupt-parent : the phandle for the interrupt controller that + services interrupts for this device. + +Optional properties: +- edid : verbatim EDID data block describing attached display. + Data from the detailed timing descriptor will be used to + program the display controller. + +Example (MPC8610HPCD): + display@2c000 { + compatible = "fsl,diu"; + reg = <0x2c000 100>; + interrupts = <72 2>; + interrupt-parent = <&mpic>; + }; + +Example for MPC5121: + display@2100 { + compatible = "fsl,mpc5121-diu"; + reg = <0x2100 0x100>; + interrupts = <64 0x8>; + interrupt-parent = <&ipic>; + edid = [edid-data]; + }; diff --git a/Documentation/devicetree/bindings/powerpc/fsl/dma.txt b/Documentation/devicetree/bindings/powerpc/fsl/dma.txt new file mode 100644 index 000000000000..2a4b4bce6110 --- /dev/null +++ b/Documentation/devicetree/bindings/powerpc/fsl/dma.txt @@ -0,0 +1,144 @@ +* Freescale 83xx DMA Controller + +Freescale PowerPC 83xx have on chip general purpose DMA controllers. + +Required properties: + +- compatible : compatible list, contains 2 entries, first is + "fsl,CHIP-dma", where CHIP is the processor + (mpc8349, mpc8360, etc.) and the second is + "fsl,elo-dma" +- reg : +- ranges : Should be defined as specified in 1) to describe the + DMA controller channels. +- cell-index : controller index. 0 for controller @ 0x8100 +- interrupts : +- interrupt-parent : optional, if needed for interrupt mapping + + +- DMA channel nodes: + - compatible : compatible list, contains 2 entries, first is + "fsl,CHIP-dma-channel", where CHIP is the processor + (mpc8349, mpc8350, etc.) and the second is + "fsl,elo-dma-channel". However, see note below. + - reg : + - cell-index : dma channel index starts at 0. + +Optional properties: + - interrupts : + (on 83xx this is expected to be identical to + the interrupts property of the parent node) + - interrupt-parent : optional, if needed for interrupt mapping + +Example: + dma@82a8 { + #address-cells = <1>; + #size-cells = <1>; + compatible = "fsl,mpc8349-dma", "fsl,elo-dma"; + reg = <0x82a8 4>; + ranges = <0 0x8100 0x1a4>; + interrupt-parent = <&ipic>; + interrupts = <71 8>; + cell-index = <0>; + dma-channel@0 { + compatible = "fsl,mpc8349-dma-channel", "fsl,elo-dma-channel"; + cell-index = <0>; + reg = <0 0x80>; + interrupt-parent = <&ipic>; + interrupts = <71 8>; + }; + dma-channel@80 { + compatible = "fsl,mpc8349-dma-channel", "fsl,elo-dma-channel"; + cell-index = <1>; + reg = <0x80 0x80>; + interrupt-parent = <&ipic>; + interrupts = <71 8>; + }; + dma-channel@100 { + compatible = "fsl,mpc8349-dma-channel", "fsl,elo-dma-channel"; + cell-index = <2>; + reg = <0x100 0x80>; + interrupt-parent = <&ipic>; + interrupts = <71 8>; + }; + dma-channel@180 { + compatible = "fsl,mpc8349-dma-channel", "fsl,elo-dma-channel"; + cell-index = <3>; + reg = <0x180 0x80>; + interrupt-parent = <&ipic>; + interrupts = <71 8>; + }; + }; + +* Freescale 85xx/86xx DMA Controller + +Freescale PowerPC 85xx/86xx have on chip general purpose DMA controllers. + +Required properties: + +- compatible : compatible list, contains 2 entries, first is + "fsl,CHIP-dma", where CHIP is the processor + (mpc8540, mpc8540, etc.) and the second is + "fsl,eloplus-dma" +- reg : +- cell-index : controller index. 0 for controller @ 0x21000, + 1 for controller @ 0xc000 +- ranges : Should be defined as specified in 1) to describe the + DMA controller channels. + +- DMA channel nodes: + - compatible : compatible list, contains 2 entries, first is + "fsl,CHIP-dma-channel", where CHIP is the processor + (mpc8540, mpc8560, etc.) and the second is + "fsl,eloplus-dma-channel". However, see note below. + - cell-index : dma channel index starts at 0. + - reg : + - interrupts : + - interrupt-parent : optional, if needed for interrupt mapping + +Example: + dma@21300 { + #address-cells = <1>; + #size-cells = <1>; + compatible = "fsl,mpc8540-dma", "fsl,eloplus-dma"; + reg = <0x21300 4>; + ranges = <0 0x21100 0x200>; + cell-index = <0>; + dma-channel@0 { + compatible = "fsl,mpc8540-dma-channel", "fsl,eloplus-dma-channel"; + reg = <0 0x80>; + cell-index = <0>; + interrupt-parent = <&mpic>; + interrupts = <20 2>; + }; + dma-channel@80 { + compatible = "fsl,mpc8540-dma-channel", "fsl,eloplus-dma-channel"; + reg = <0x80 0x80>; + cell-index = <1>; + interrupt-parent = <&mpic>; + interrupts = <21 2>; + }; + dma-channel@100 { + compatible = "fsl,mpc8540-dma-channel", "fsl,eloplus-dma-channel"; + reg = <0x100 0x80>; + cell-index = <2>; + interrupt-parent = <&mpic>; + interrupts = <22 2>; + }; + dma-channel@180 { + compatible = "fsl,mpc8540-dma-channel", "fsl,eloplus-dma-channel"; + reg = <0x180 0x80>; + cell-index = <3>; + interrupt-parent = <&mpic>; + interrupts = <23 2>; + }; + }; + +Note on DMA channel compatible properties: The compatible property must say +"fsl,elo-dma-channel" or "fsl,eloplus-dma-channel" to be used by the Elo DMA +driver (fsldma). Any DMA channel used by fsldma cannot be used by another +DMA driver, such as the SSI sound drivers for the MPC8610. Therefore, any DMA +channel that should be used for another driver should not use +"fsl,elo-dma-channel" or "fsl,eloplus-dma-channel". For the SSI drivers, for +example, the compatible property should be "fsl,ssi-dma-channel". See ssi.txt +for more information. diff --git a/Documentation/devicetree/bindings/powerpc/fsl/ecm.txt b/Documentation/devicetree/bindings/powerpc/fsl/ecm.txt new file mode 100644 index 000000000000..f514f29c67d6 --- /dev/null +++ b/Documentation/devicetree/bindings/powerpc/fsl/ecm.txt @@ -0,0 +1,64 @@ +===================================================================== +E500 LAW & Coherency Module Device Tree Binding +Copyright (C) 2009 Freescale Semiconductor Inc. +===================================================================== + +Local Access Window (LAW) Node + +The LAW node represents the region of CCSR space where local access +windows are configured. For ECM based devices this is the first 4k +of CCSR space that includes CCSRBAR, ALTCBAR, ALTCAR, BPTR, and some +number of local access windows as specified by fsl,num-laws. + +PROPERTIES + + - compatible + Usage: required + Value type: + Definition: Must include "fsl,ecm-law" + + - reg + Usage: required + Value type: + Definition: A standard property. The value specifies the + physical address offset and length of the CCSR space + registers. + + - fsl,num-laws + Usage: required + Value type: + Definition: The value specifies the number of local access + windows for this device. + +===================================================================== + +E500 Coherency Module Node + +The E500 LAW node represents the region of CCSR space where ECM config +and error reporting registers exist, this is the second 4k (0x1000) +of CCSR space. + +PROPERTIES + + - compatible + Usage: required + Value type: + Definition: Must include "fsl,CHIP-ecm", "fsl,ecm" where + CHIP is the processor (mpc8572, mpc8544, etc.) + + - reg + Usage: required + Value type: + Definition: A standard property. The value specifies the + physical address offset and length of the CCSR space + registers. + + - interrupts + Usage: required + Value type: + + - interrupt-parent + Usage: required + Value type: + +===================================================================== diff --git a/Documentation/devicetree/bindings/powerpc/fsl/gtm.txt b/Documentation/devicetree/bindings/powerpc/fsl/gtm.txt new file mode 100644 index 000000000000..9a33efded4bc --- /dev/null +++ b/Documentation/devicetree/bindings/powerpc/fsl/gtm.txt @@ -0,0 +1,31 @@ +* Freescale General-purpose Timers Module + +Required properties: + - compatible : should be + "fsl,-gtm", "fsl,gtm" for SOC GTMs + "fsl,-qe-gtm", "fsl,qe-gtm", "fsl,gtm" for QE GTMs + "fsl,-cpm2-gtm", "fsl,cpm2-gtm", "fsl,gtm" for CPM2 GTMs + - reg : should contain gtm registers location and length (0x40). + - interrupts : should contain four interrupts. + - interrupt-parent : interrupt source phandle. + - clock-frequency : specifies the frequency driving the timer. + +Example: + +timer@500 { + compatible = "fsl,mpc8360-gtm", "fsl,gtm"; + reg = <0x500 0x40>; + interrupts = <90 8 78 8 84 8 72 8>; + interrupt-parent = <&ipic>; + /* filled by u-boot */ + clock-frequency = <0>; +}; + +timer@440 { + compatible = "fsl,mpc8360-qe-gtm", "fsl,qe-gtm", "fsl,gtm"; + reg = <0x440 0x40>; + interrupts = <12 13 14 15>; + interrupt-parent = <&qeic>; + /* filled by u-boot */ + clock-frequency = <0>; +}; diff --git a/Documentation/devicetree/bindings/powerpc/fsl/guts.txt b/Documentation/devicetree/bindings/powerpc/fsl/guts.txt new file mode 100644 index 000000000000..9e7a2417dac5 --- /dev/null +++ b/Documentation/devicetree/bindings/powerpc/fsl/guts.txt @@ -0,0 +1,25 @@ +* Global Utilities Block + +The global utilities block controls power management, I/O device +enabling, power-on-reset configuration monitoring, general-purpose +I/O signal configuration, alternate function selection for multiplexed +signals, and clock control. + +Required properties: + + - compatible : Should define the compatible device type for + global-utilities. + - reg : Offset and length of the register set for the device. + +Recommended properties: + + - fsl,has-rstcr : Indicates that the global utilities register set + contains a functioning "reset control register" (i.e. the board + is wired to reset upon setting the HRESET_REQ bit in this register). + +Example: + global-utilities@e0000 { /* global utilities block */ + compatible = "fsl,mpc8548-guts"; + reg = ; + fsl,has-rstcr; + }; diff --git a/Documentation/devicetree/bindings/powerpc/fsl/lbc.txt b/Documentation/devicetree/bindings/powerpc/fsl/lbc.txt new file mode 100644 index 000000000000..3300fec501c5 --- /dev/null +++ b/Documentation/devicetree/bindings/powerpc/fsl/lbc.txt @@ -0,0 +1,35 @@ +* Chipselect/Local Bus + +Properties: +- name : Should be localbus +- #address-cells : Should be either two or three. The first cell is the + chipselect number, and the remaining cells are the + offset into the chipselect. +- #size-cells : Either one or two, depending on how large each chipselect + can be. +- ranges : Each range corresponds to a single chipselect, and cover + the entire access window as configured. + +Example: + localbus@f0010100 { + compatible = "fsl,mpc8272-localbus", + "fsl,pq2-localbus"; + #address-cells = <2>; + #size-cells = <1>; + reg = ; + + ranges = <0 0 fe000000 02000000 + 1 0 f4500000 00008000>; + + flash@0,0 { + compatible = "jedec-flash"; + reg = <0 0 2000000>; + bank-width = <4>; + device-width = <1>; + }; + + board-control@1,0 { + reg = <1 0 20>; + compatible = "fsl,mpc8272ads-bcsr"; + }; + }; diff --git a/Documentation/devicetree/bindings/powerpc/fsl/mcm.txt b/Documentation/devicetree/bindings/powerpc/fsl/mcm.txt new file mode 100644 index 000000000000..4ceda9b3b413 --- /dev/null +++ b/Documentation/devicetree/bindings/powerpc/fsl/mcm.txt @@ -0,0 +1,64 @@ +===================================================================== +MPX LAW & Coherency Module Device Tree Binding +Copyright (C) 2009 Freescale Semiconductor Inc. +===================================================================== + +Local Access Window (LAW) Node + +The LAW node represents the region of CCSR space where local access +windows are configured. For MCM based devices this is the first 4k +of CCSR space that includes CCSRBAR, ALTCBAR, ALTCAR, BPTR, and some +number of local access windows as specified by fsl,num-laws. + +PROPERTIES + + - compatible + Usage: required + Value type: + Definition: Must include "fsl,mcm-law" + + - reg + Usage: required + Value type: + Definition: A standard property. The value specifies the + physical address offset and length of the CCSR space + registers. + + - fsl,num-laws + Usage: required + Value type: + Definition: The value specifies the number of local access + windows for this device. + +===================================================================== + +MPX Coherency Module Node + +The MPX LAW node represents the region of CCSR space where MCM config +and error reporting registers exist, this is the second 4k (0x1000) +of CCSR space. + +PROPERTIES + + - compatible + Usage: required + Value type: + Definition: Must include "fsl,CHIP-mcm", "fsl,mcm" where + CHIP is the processor (mpc8641, mpc8610, etc.) + + - reg + Usage: required + Value type: + Definition: A standard property. The value specifies the + physical address offset and length of the CCSR space + registers. + + - interrupts + Usage: required + Value type: + + - interrupt-parent + Usage: required + Value type: + +===================================================================== diff --git a/Documentation/devicetree/bindings/powerpc/fsl/mcu-mpc8349emitx.txt b/Documentation/devicetree/bindings/powerpc/fsl/mcu-mpc8349emitx.txt new file mode 100644 index 000000000000..0f766333b6eb --- /dev/null +++ b/Documentation/devicetree/bindings/powerpc/fsl/mcu-mpc8349emitx.txt @@ -0,0 +1,17 @@ +Freescale MPC8349E-mITX-compatible Power Management Micro Controller Unit (MCU) + +Required properties: +- compatible : "fsl,-", "fsl,mcu-mpc8349emitx". +- reg : should specify I2C address (0x0a). +- #gpio-cells : should be 2. +- gpio-controller : should be present. + +Example: + +mcu@0a { + #gpio-cells = <2>; + compatible = "fsl,mc9s08qg8-mpc8349emitx", + "fsl,mcu-mpc8349emitx"; + reg = <0x0a>; + gpio-controller; +}; diff --git a/Documentation/devicetree/bindings/powerpc/fsl/mpc5121-psc.txt b/Documentation/devicetree/bindings/powerpc/fsl/mpc5121-psc.txt new file mode 100644 index 000000000000..8832e8798912 --- /dev/null +++ b/Documentation/devicetree/bindings/powerpc/fsl/mpc5121-psc.txt @@ -0,0 +1,70 @@ +MPC5121 PSC Device Tree Bindings + +PSC in UART mode +---------------- + +For PSC in UART mode the needed PSC serial devices +are specified by fsl,mpc5121-psc-uart nodes in the +fsl,mpc5121-immr SoC node. Additionally the PSC FIFO +Controller node fsl,mpc5121-psc-fifo is requered there: + +fsl,mpc5121-psc-uart nodes +-------------------------- + +Required properties : + - compatible : Should contain "fsl,mpc5121-psc-uart" and "fsl,mpc5121-psc" + - cell-index : Index of the PSC in hardware + - reg : Offset and length of the register set for the PSC device + - interrupts : where a is the interrupt number of the + PSC FIFO Controller and b is a field that represents an + encoding of the sense and level information for the interrupt. + - interrupt-parent : the phandle for the interrupt controller that + services interrupts for this device. + +Recommended properties : + - fsl,rx-fifo-size : the size of the RX fifo slice (a multiple of 4) + - fsl,tx-fifo-size : the size of the TX fifo slice (a multiple of 4) + + +fsl,mpc5121-psc-fifo node +------------------------- + +Required properties : + - compatible : Should be "fsl,mpc5121-psc-fifo" + - reg : Offset and length of the register set for the PSC + FIFO Controller + - interrupts : where a is the interrupt number of the + PSC FIFO Controller and b is a field that represents an + encoding of the sense and level information for the interrupt. + - interrupt-parent : the phandle for the interrupt controller that + services interrupts for this device. + + +Example for a board using PSC0 and PSC1 devices in serial mode: + +serial@11000 { + compatible = "fsl,mpc5121-psc-uart", "fsl,mpc5121-psc"; + cell-index = <0>; + reg = <0x11000 0x100>; + interrupts = <40 0x8>; + interrupt-parent = < &ipic >; + fsl,rx-fifo-size = <16>; + fsl,tx-fifo-size = <16>; +}; + +serial@11100 { + compatible = "fsl,mpc5121-psc-uart", "fsl,mpc5121-psc"; + cell-index = <1>; + reg = <0x11100 0x100>; + interrupts = <40 0x8>; + interrupt-parent = < &ipic >; + fsl,rx-fifo-size = <16>; + fsl,tx-fifo-size = <16>; +}; + +pscfifo@11f00 { + compatible = "fsl,mpc5121-psc-fifo"; + reg = <0x11f00 0x100>; + interrupts = <40 0x8>; + interrupt-parent = < &ipic >; +}; diff --git a/Documentation/devicetree/bindings/powerpc/fsl/mpc5200.txt b/Documentation/devicetree/bindings/powerpc/fsl/mpc5200.txt new file mode 100644 index 000000000000..4ccb2cd5df94 --- /dev/null +++ b/Documentation/devicetree/bindings/powerpc/fsl/mpc5200.txt @@ -0,0 +1,198 @@ +MPC5200 Device Tree Bindings +---------------------------- + +(c) 2006-2009 Secret Lab Technologies Ltd +Grant Likely + +Naming conventions +------------------ +For mpc5200 on-chip devices, the format for each compatible value is +-[-]. The OS should be able to match a device driver +to the device based solely on the compatible value. If two drivers +match on the compatible list; the 'most compatible' driver should be +selected. + +The split between the MPC5200 and the MPC5200B leaves a bit of a +conundrum. How should the compatible property be set up to provide +maximum compatibility information; but still accurately describe the +chip? For the MPC5200; the answer is easy. Most of the SoC devices +originally appeared on the MPC5200. Since they didn't exist anywhere +else; the 5200 compatible properties will contain only one item; +"fsl,mpc5200-". + +The 5200B is almost the same as the 5200, but not quite. It fixes +silicon bugs and it adds a small number of enhancements. Most of the +devices either provide exactly the same interface as on the 5200. A few +devices have extra functions but still have a backwards compatible mode. +To express this information as completely as possible, 5200B device trees +should have two items in the compatible list: + compatible = "fsl,mpc5200b-","fsl,mpc5200-"; + +It is *strongly* recommended that 5200B device trees follow this convention +(instead of only listing the base mpc5200 item). + +ie. ethernet on mpc5200: compatible = "fsl,mpc5200-fec"; + ethernet on mpc5200b: compatible = "fsl,mpc5200b-fec", "fsl,mpc5200-fec"; + +Modal devices, like PSCs, also append the configured function to the +end of the compatible field. ie. A PSC in i2s mode would specify +"fsl,mpc5200-psc-i2s", not "fsl,mpc5200-i2s". This convention is chosen to +avoid naming conflicts with non-psc devices providing the same +function. For example, "fsl,mpc5200-spi" and "fsl,mpc5200-psc-spi" describe +the mpc5200 simple spi device and a PSC spi mode respectively. + +At the time of writing, exact chip may be either 'fsl,mpc5200' or +'fsl,mpc5200b'. + +The soc node +------------ +This node describes the on chip SOC peripherals. Every mpc5200 based +board will have this node, and as such there is a common naming +convention for SOC devices. + +Required properties: +name description +---- ----------- +ranges Memory range of the internal memory mapped registers. + Should be <0 [baseaddr] 0xc000> +reg Should be <[baseaddr] 0x100> +compatible mpc5200: "fsl,mpc5200-immr" + mpc5200b: "fsl,mpc5200b-immr" +system-frequency 'fsystem' frequency in Hz; XLB, IPB, USB and PCI + clocks are derived from the fsystem clock. +bus-frequency IPB bus frequency in Hz. Clock rate + used by most of the soc devices. + +soc child nodes +--------------- +Any on chip SOC devices available to Linux must appear as soc5200 child nodes. + +Note: The tables below show the value for the mpc5200. A mpc5200b device +tree should use the "fsl,mpc5200b-","fsl,mpc5200-" form. + +Required soc5200 child nodes: +name compatible Description +---- ---------- ----------- +cdm@ fsl,mpc5200-cdm Clock Distribution +interrupt-controller@ fsl,mpc5200-pic need an interrupt + controller to boot +bestcomm@ fsl,mpc5200-bestcomm Bestcomm DMA controller + +Recommended soc5200 child nodes; populate as needed for your board +name compatible Description +---- ---------- ----------- +timer@ fsl,mpc5200-gpt General purpose timers +gpio@ fsl,mpc5200-gpio MPC5200 simple gpio controller +gpio@ fsl,mpc5200-gpio-wkup MPC5200 wakeup gpio controller +rtc@ fsl,mpc5200-rtc Real time clock +mscan@ fsl,mpc5200-mscan CAN bus controller +pci@ fsl,mpc5200-pci PCI bridge +serial@ fsl,mpc5200-psc-uart PSC in serial mode +i2s@ fsl,mpc5200-psc-i2s PSC in i2s mode +ac97@ fsl,mpc5200-psc-ac97 PSC in ac97 mode +spi@ fsl,mpc5200-psc-spi PSC in spi mode +irda@ fsl,mpc5200-psc-irda PSC in IrDA mode +spi@ fsl,mpc5200-spi MPC5200 spi device +ethernet@ fsl,mpc5200-fec MPC5200 ethernet device +ata@ fsl,mpc5200-ata IDE ATA interface +i2c@ fsl,mpc5200-i2c I2C controller +usb@ fsl,mpc5200-ohci,ohci-be USB controller +xlb@ fsl,mpc5200-xlb XLB arbitrator + +fsl,mpc5200-gpt nodes +--------------------- +On the mpc5200 and 5200b, GPT0 has a watchdog timer function. If the board +design supports the internal wdt, then the device node for GPT0 should +include the empty property 'fsl,has-wdt'. Note that this does not activate +the watchdog. The timer will function as a GPT if the timer api is used, and +it will function as watchdog if the watchdog device is used. The watchdog +mode has priority over the gpt mode, i.e. if the watchdog is activated, any +gpt api call to this timer will fail with -EBUSY. + +If you add the property + fsl,wdt-on-boot = ; +GPT0 will be marked as in-use watchdog, i.e. blocking every gpt access to it. +If n>0, the watchdog is started with a timeout of n seconds. If n=0, the +configuration of the watchdog is not touched. This is useful in two cases: +- just mark GPT0 as watchdog, blocking gpt accesses, and configure it later; +- do not touch a configuration assigned by the boot loader which supervises + the boot process itself. + +The watchdog will respect the CONFIG_WATCHDOG_NOWAYOUT option. + +An mpc5200-gpt can be used as a single line GPIO controller. To do so, +add the following properties to the gpt node: + gpio-controller; + #gpio-cells = <2>; +When referencing the GPIO line from another node, the first cell must always +be zero and the second cell represents the gpio flags and described in the +gpio device tree binding. + +An mpc5200-gpt can be used as a single line edge sensitive interrupt +controller. To do so, add the following properties to the gpt node: + interrupt-controller; + #interrupt-cells = <1>; +When referencing the IRQ line from another node, the cell represents the +sense mode; 1 for edge rising, 2 for edge falling. + +fsl,mpc5200-psc nodes +--------------------- +The PSCs should include a cell-index which is the index of the PSC in +hardware. cell-index is used to determine which shared SoC registers to +use when setting up PSC clocking. cell-index number starts at '0'. ie: + PSC1 has 'cell-index = <0>' + PSC4 has 'cell-index = <3>' + +PSC in i2s mode: The mpc5200 and mpc5200b PSCs are not compatible when in +i2s mode. An 'mpc5200b-psc-i2s' node cannot include 'mpc5200-psc-i2s' in the +compatible field. + + +fsl,mpc5200-gpio and fsl,mpc5200-gpio-wkup nodes +------------------------------------------------ +Each GPIO controller node should have the empty property gpio-controller and +#gpio-cells set to 2. First cell is the GPIO number which is interpreted +according to the bit numbers in the GPIO control registers. The second cell +is for flags which is currently unused. + +fsl,mpc5200-fec nodes +--------------------- +The FEC node can specify one of the following properties to configure +the MII link: +- fsl,7-wire-mode - An empty property that specifies the link uses 7-wire + mode instead of MII +- current-speed - Specifies that the MII should be configured for a fixed + speed. This property should contain two cells. The + first cell specifies the speed in Mbps and the second + should be '0' for half duplex and '1' for full duplex +- phy-handle - Contains a phandle to an Ethernet PHY. + +Interrupt controller (fsl,mpc5200-pic) node +------------------------------------------- +The mpc5200 pic binding splits hardware IRQ numbers into two levels. The +split reflects the layout of the PIC hardware itself, which groups +interrupts into one of three groups; CRIT, MAIN or PERP. Also, the +Bestcomm dma engine has it's own set of interrupt sources which are +cascaded off of peripheral interrupt 0, which the driver interprets as a +fourth group, SDMA. + +The interrupts property for device nodes using the mpc5200 pic consists +of three cells; + + L1 := [CRIT=0, MAIN=1, PERP=2, SDMA=3] + L2 := interrupt number; directly mapped from the value in the + "ICTL PerStat, MainStat, CritStat Encoded Register" + level := [LEVEL_HIGH=0, EDGE_RISING=1, EDGE_FALLING=2, LEVEL_LOW=3] + +For external IRQs, use the following interrupt property values (how to +specify external interrupts is a frequently asked question): +External interrupts: + external irq0: interrupts = <0 0 n>; + external irq1: interrupts = <1 1 n>; + external irq2: interrupts = <1 2 n>; + external irq3: interrupts = <1 3 n>; +'n' is sense (0: level high, 1: edge rising, 2: edge falling 3: level low) + +fsl,mpc5200-mscan nodes +----------------------- +See file can.txt in this directory. diff --git a/Documentation/devicetree/bindings/powerpc/fsl/mpic.txt b/Documentation/devicetree/bindings/powerpc/fsl/mpic.txt new file mode 100644 index 000000000000..71e39cf3215b --- /dev/null +++ b/Documentation/devicetree/bindings/powerpc/fsl/mpic.txt @@ -0,0 +1,42 @@ +* OpenPIC and its interrupt numbers on Freescale's e500/e600 cores + +The OpenPIC specification does not specify which interrupt source has to +become which interrupt number. This is up to the software implementation +of the interrupt controller. The only requirement is that every +interrupt source has to have an unique interrupt number / vector number. +To accomplish this the current implementation assigns the number zero to +the first source, the number one to the second source and so on until +all interrupt sources have their unique number. +Usually the assigned vector number equals the interrupt number mentioned +in the documentation for a given core / CPU. This is however not true +for the e500 cores (MPC85XX CPUs) where the documentation distinguishes +between internal and external interrupt sources and starts counting at +zero for both of them. + +So what to write for external interrupt source X or internal interrupt +source Y into the device tree? Here is an example: + +The memory map for the interrupt controller in the MPC8544[0] shows, +that the first interrupt source starts at 0x5_0000 (PIC Register Address +Map-Interrupt Source Configuration Registers). This source becomes the +number zero therefore: + External interrupt 0 = interrupt number 0 + External interrupt 1 = interrupt number 1 + External interrupt 2 = interrupt number 2 + ... +Every interrupt number allocates 0x20 bytes register space. So to get +its number it is sufficient to shift the lower 16bits to right by five. +So for the external interrupt 10 we have: + 0x0140 >> 5 = 10 + +After the external sources, the internal sources follow. The in core I2C +controller on the MPC8544 for instance has the internal source number +27. Oo obtain its interrupt number we take the lower 16bits of its memory +address (0x5_0560) and shift it right: + 0x0560 >> 5 = 43 + +Therefore the I2C device node for the MPC8544 CPU has to have the +interrupt number 43 specified in the device tree. + +[0] MPC8544E PowerQUICCTM III, Integrated Host Processor Family Reference Manual + MPC8544ERM Rev. 1 10/2007 diff --git a/Documentation/devicetree/bindings/powerpc/fsl/msi-pic.txt b/Documentation/devicetree/bindings/powerpc/fsl/msi-pic.txt new file mode 100644 index 000000000000..bcc30bac6831 --- /dev/null +++ b/Documentation/devicetree/bindings/powerpc/fsl/msi-pic.txt @@ -0,0 +1,36 @@ +* Freescale MSI interrupt controller + +Required properties: +- compatible : compatible list, contains 2 entries, + first is "fsl,CHIP-msi", where CHIP is the processor(mpc8610, mpc8572, + etc.) and the second is "fsl,mpic-msi" or "fsl,ipic-msi" depending on + the parent type. +- reg : should contain the address and the length of the shared message + interrupt register set. +- msi-available-ranges: use style section to define which + msi interrupt can be used in the 256 msi interrupts. This property is + optional, without this, all the 256 MSI interrupts can be used. +- interrupts : each one of the interrupts here is one entry per 32 MSIs, + and routed to the host interrupt controller. the interrupts should + be set as edge sensitive. +- interrupt-parent: the phandle for the interrupt controller + that services interrupts for this device. for 83xx cpu, the interrupts + are routed to IPIC, and for 85xx/86xx cpu the interrupts are routed + to MPIC. + +Example: + msi@41600 { + compatible = "fsl,mpc8610-msi", "fsl,mpic-msi"; + reg = <0x41600 0x80>; + msi-available-ranges = <0 0x100>; + interrupts = < + 0xe0 0 + 0xe1 0 + 0xe2 0 + 0xe3 0 + 0xe4 0 + 0xe5 0 + 0xe6 0 + 0xe7 0>; + interrupt-parent = <&mpic>; + }; diff --git a/Documentation/devicetree/bindings/powerpc/fsl/pmc.txt b/Documentation/devicetree/bindings/powerpc/fsl/pmc.txt new file mode 100644 index 000000000000..07256b7ffcaa --- /dev/null +++ b/Documentation/devicetree/bindings/powerpc/fsl/pmc.txt @@ -0,0 +1,63 @@ +* Power Management Controller + +Properties: +- compatible: "fsl,-pmc". + + "fsl,mpc8349-pmc" should be listed for any chip whose PMC is + compatible. "fsl,mpc8313-pmc" should also be listed for any chip + whose PMC is compatible, and implies deep-sleep capability. + + "fsl,mpc8548-pmc" should be listed for any chip whose PMC is + compatible. "fsl,mpc8536-pmc" should also be listed for any chip + whose PMC is compatible, and implies deep-sleep capability. + + "fsl,mpc8641d-pmc" should be listed for any chip whose PMC is + compatible; all statements below that apply to "fsl,mpc8548-pmc" also + apply to "fsl,mpc8641d-pmc". + + Compatibility does not include bit assignments in SCCR/PMCDR/DEVDISR; these + bit assignments are indicated via the sleep specifier in each device's + sleep property. + +- reg: For devices compatible with "fsl,mpc8349-pmc", the first resource + is the PMC block, and the second resource is the Clock Configuration + block. + + For devices compatible with "fsl,mpc8548-pmc", the first resource + is a 32-byte block beginning with DEVDISR. + +- interrupts: For "fsl,mpc8349-pmc"-compatible devices, the first + resource is the PMC block interrupt. + +- fsl,mpc8313-wakeup-timer: For "fsl,mpc8313-pmc"-compatible devices, + this is a phandle to an "fsl,gtm" node on which timer 4 can be used as + a wakeup source from deep sleep. + +Sleep specifiers: + + fsl,mpc8349-pmc: Sleep specifiers consist of one cell. For each bit + that is set in the cell, the corresponding bit in SCCR will be saved + and cleared on suspend, and restored on resume. This sleep controller + supports disabling and resuming devices at any time. + + fsl,mpc8536-pmc: Sleep specifiers consist of three cells, the third of + which will be ORed into PMCDR upon suspend, and cleared from PMCDR + upon resume. The first two cells are as described for fsl,mpc8578-pmc. + This sleep controller only supports disabling devices during system + sleep, or permanently. + + fsl,mpc8548-pmc: Sleep specifiers consist of one or two cells, the + first of which will be ORed into DEVDISR (and the second into + DEVDISR2, if present -- this cell should be zero or absent if the + hardware does not have DEVDISR2) upon a request for permanent device + disabling. This sleep controller does not support configuring devices + to disable during system sleep (unless supported by another compatible + match), or dynamically. + +Example: + + power@b00 { + compatible = "fsl,mpc8313-pmc", "fsl,mpc8349-pmc"; + reg = <0xb00 0x100 0xa00 0x100>; + interrupts = <80 8>; + }; diff --git a/Documentation/devicetree/bindings/powerpc/fsl/sec.txt b/Documentation/devicetree/bindings/powerpc/fsl/sec.txt new file mode 100644 index 000000000000..2b6f2d45c45a --- /dev/null +++ b/Documentation/devicetree/bindings/powerpc/fsl/sec.txt @@ -0,0 +1,68 @@ +Freescale SoC SEC Security Engines + +Required properties: + +- compatible : Should contain entries for this and backward compatible + SEC versions, high to low, e.g., "fsl,sec2.1", "fsl,sec2.0" +- reg : Offset and length of the register set for the device +- interrupts : the SEC's interrupt number +- fsl,num-channels : An integer representing the number of channels + available. +- fsl,channel-fifo-len : An integer representing the number of + descriptor pointers each channel fetch fifo can hold. +- fsl,exec-units-mask : The bitmask representing what execution units + (EUs) are available. It's a single 32-bit cell. EU information + should be encoded following the SEC's Descriptor Header Dword + EU_SEL0 field documentation, i.e. as follows: + + bit 0 = reserved - should be 0 + bit 1 = set if SEC has the ARC4 EU (AFEU) + bit 2 = set if SEC has the DES/3DES EU (DEU) + bit 3 = set if SEC has the message digest EU (MDEU/MDEU-A) + bit 4 = set if SEC has the random number generator EU (RNG) + bit 5 = set if SEC has the public key EU (PKEU) + bit 6 = set if SEC has the AES EU (AESU) + bit 7 = set if SEC has the Kasumi EU (KEU) + bit 8 = set if SEC has the CRC EU (CRCU) + bit 11 = set if SEC has the message digest EU extended alg set (MDEU-B) + +remaining bits are reserved for future SEC EUs. + +- fsl,descriptor-types-mask : The bitmask representing what descriptors + are available. It's a single 32-bit cell. Descriptor type information + should be encoded following the SEC's Descriptor Header Dword DESC_TYPE + field documentation, i.e. as follows: + + bit 0 = set if SEC supports the aesu_ctr_nonsnoop desc. type + bit 1 = set if SEC supports the ipsec_esp descriptor type + bit 2 = set if SEC supports the common_nonsnoop desc. type + bit 3 = set if SEC supports the 802.11i AES ccmp desc. type + bit 4 = set if SEC supports the hmac_snoop_no_afeu desc. type + bit 5 = set if SEC supports the srtp descriptor type + bit 6 = set if SEC supports the non_hmac_snoop_no_afeu desc.type + bit 7 = set if SEC supports the pkeu_assemble descriptor type + bit 8 = set if SEC supports the aesu_key_expand_output desc.type + bit 9 = set if SEC supports the pkeu_ptmul descriptor type + bit 10 = set if SEC supports the common_nonsnoop_afeu desc. type + bit 11 = set if SEC supports the pkeu_ptadd_dbl descriptor type + + ..and so on and so forth. + +Optional properties: + +- interrupt-parent : the phandle for the interrupt controller that + services interrupts for this device. + +Example: + + /* MPC8548E */ + crypto@30000 { + compatible = "fsl,sec2.1", "fsl,sec2.0"; + reg = <0x30000 0x10000>; + interrupts = <29 2>; + interrupt-parent = <&mpic>; + fsl,num-channels = <4>; + fsl,channel-fifo-len = <24>; + fsl,exec-units-mask = <0xfe>; + fsl,descriptor-types-mask = <0x12b0ebf>; + }; diff --git a/Documentation/devicetree/bindings/powerpc/fsl/ssi.txt b/Documentation/devicetree/bindings/powerpc/fsl/ssi.txt new file mode 100644 index 000000000000..5ff76c9c57d2 --- /dev/null +++ b/Documentation/devicetree/bindings/powerpc/fsl/ssi.txt @@ -0,0 +1,73 @@ +Freescale Synchronous Serial Interface + +The SSI is a serial device that communicates with audio codecs. It can +be programmed in AC97, I2S, left-justified, or right-justified modes. + +Required properties: +- compatible: Compatible list, contains "fsl,ssi". +- cell-index: The SSI, <0> = SSI1, <1> = SSI2, and so on. +- reg: Offset and length of the register set for the device. +- interrupts: where a is the interrupt number and b is a + field that represents an encoding of the sense and + level information for the interrupt. This should be + encoded based on the information in section 2) + depending on the type of interrupt controller you + have. +- interrupt-parent: The phandle for the interrupt controller that + services interrupts for this device. +- fsl,mode: The operating mode for the SSI interface. + "i2s-slave" - I2S mode, SSI is clock slave + "i2s-master" - I2S mode, SSI is clock master + "lj-slave" - left-justified mode, SSI is clock slave + "lj-master" - l.j. mode, SSI is clock master + "rj-slave" - right-justified mode, SSI is clock slave + "rj-master" - r.j., SSI is clock master + "ac97-slave" - AC97 mode, SSI is clock slave + "ac97-master" - AC97 mode, SSI is clock master +- fsl,playback-dma: Phandle to a node for the DMA channel to use for + playback of audio. This is typically dictated by SOC + design. See the notes below. +- fsl,capture-dma: Phandle to a node for the DMA channel to use for + capture (recording) of audio. This is typically dictated + by SOC design. See the notes below. +- fsl,fifo-depth: The number of elements in the transmit and receive FIFOs. + This number is the maximum allowed value for SFCSR[TFWM0]. +- fsl,ssi-asynchronous: + If specified, the SSI is to be programmed in asynchronous + mode. In this mode, pins SRCK, STCK, SRFS, and STFS must + all be connected to valid signals. In synchronous mode, + SRCK and SRFS are ignored. Asynchronous mode allows + playback and capture to use different sample sizes and + sample rates. Some drivers may require that SRCK and STCK + be connected together, and SRFS and STFS be connected + together. This would still allow different sample sizes, + but not different sample rates. + +Optional properties: +- codec-handle: Phandle to a 'codec' node that defines an audio + codec connected to this SSI. This node is typically + a child of an I2C or other control node. + +Child 'codec' node required properties: +- compatible: Compatible list, contains the name of the codec + +Child 'codec' node optional properties: +- clock-frequency: The frequency of the input clock, which typically comes + from an on-board dedicated oscillator. + +Notes on fsl,playback-dma and fsl,capture-dma: + +On SOCs that have an SSI, specific DMA channels are hard-wired for playback +and capture. On the MPC8610, for example, SSI1 must use DMA channel 0 for +playback and DMA channel 1 for capture. SSI2 must use DMA channel 2 for +playback and DMA channel 3 for capture. The developer can choose which +DMA controller to use, but the channels themselves are hard-wired. The +purpose of these two properties is to represent this hardware design. + +The device tree nodes for the DMA channels that are referenced by +"fsl,playback-dma" and "fsl,capture-dma" must be marked as compatible with +"fsl,ssi-dma-channel". The SOC-specific compatible string (e.g. +"fsl,mpc8610-dma-channel") can remain. If these nodes are left as +"fsl,elo-dma-channel" or "fsl,eloplus-dma-channel", then the generic Elo DMA +drivers (fsldma) will attempt to use them, and it will conflict with the +sound drivers. diff --git a/Documentation/devicetree/bindings/powerpc/nintendo/gamecube.txt b/Documentation/devicetree/bindings/powerpc/nintendo/gamecube.txt new file mode 100644 index 000000000000..b558585b1aaf --- /dev/null +++ b/Documentation/devicetree/bindings/powerpc/nintendo/gamecube.txt @@ -0,0 +1,109 @@ + +Nintendo GameCube device tree +============================= + +1) The "flipper" node + + This node represents the multi-function "Flipper" chip, which packages + many of the devices found in the Nintendo GameCube. + + Required properties: + + - compatible : Should be "nintendo,flipper" + +1.a) The Video Interface (VI) node + + Represents the interface between the graphics processor and a external + video encoder. + + Required properties: + + - compatible : should be "nintendo,flipper-vi" + - reg : should contain the VI registers location and length + - interrupts : should contain the VI interrupt + +1.b) The Processor Interface (PI) node + + Represents the data and control interface between the main processor + and graphics and audio processor. + + Required properties: + + - compatible : should be "nintendo,flipper-pi" + - reg : should contain the PI registers location and length + +1.b.i) The "Flipper" interrupt controller node + + Represents the interrupt controller within the "Flipper" chip. + The node for the "Flipper" interrupt controller must be placed under + the PI node. + + Required properties: + + - compatible : should be "nintendo,flipper-pic" + +1.c) The Digital Signal Procesor (DSP) node + + Represents the digital signal processor interface, designed to offload + audio related tasks. + + Required properties: + + - compatible : should be "nintendo,flipper-dsp" + - reg : should contain the DSP registers location and length + - interrupts : should contain the DSP interrupt + +1.c.i) The Auxiliary RAM (ARAM) node + + Represents the non cpu-addressable ram designed mainly to store audio + related information. + The ARAM node must be placed under the DSP node. + + Required properties: + + - compatible : should be "nintendo,flipper-aram" + - reg : should contain the ARAM start (zero-based) and length + +1.d) The Disk Interface (DI) node + + Represents the interface used to communicate with mass storage devices. + + Required properties: + + - compatible : should be "nintendo,flipper-di" + - reg : should contain the DI registers location and length + - interrupts : should contain the DI interrupt + +1.e) The Audio Interface (AI) node + + Represents the interface to the external 16-bit stereo digital-to-analog + converter. + + Required properties: + + - compatible : should be "nintendo,flipper-ai" + - reg : should contain the AI registers location and length + - interrupts : should contain the AI interrupt + +1.f) The Serial Interface (SI) node + + Represents the interface to the four single bit serial interfaces. + The SI is a proprietary serial interface used normally to control gamepads. + It's NOT a RS232-type interface. + + Required properties: + + - compatible : should be "nintendo,flipper-si" + - reg : should contain the SI registers location and length + - interrupts : should contain the SI interrupt + +1.g) The External Interface (EXI) node + + Represents the multi-channel SPI-like interface. + + Required properties: + + - compatible : should be "nintendo,flipper-exi" + - reg : should contain the EXI registers location and length + - interrupts : should contain the EXI interrupt + diff --git a/Documentation/devicetree/bindings/powerpc/nintendo/wii.txt b/Documentation/devicetree/bindings/powerpc/nintendo/wii.txt new file mode 100644 index 000000000000..a7e155a023b8 --- /dev/null +++ b/Documentation/devicetree/bindings/powerpc/nintendo/wii.txt @@ -0,0 +1,184 @@ + +Nintendo Wii device tree +======================== + +0) The root node + + This node represents the Nintendo Wii video game console. + + Required properties: + + - model : Should be "nintendo,wii" + - compatible : Should be "nintendo,wii" + +1) The "hollywood" node + + This node represents the multi-function "Hollywood" chip, which packages + many of the devices found in the Nintendo Wii. + + Required properties: + + - compatible : Should be "nintendo,hollywood" + +1.a) The Video Interface (VI) node + + Represents the interface between the graphics processor and a external + video encoder. + + Required properties: + + - compatible : should be "nintendo,hollywood-vi","nintendo,flipper-vi" + - reg : should contain the VI registers location and length + - interrupts : should contain the VI interrupt + +1.b) The Processor Interface (PI) node + + Represents the data and control interface between the main processor + and graphics and audio processor. + + Required properties: + + - compatible : should be "nintendo,hollywood-pi","nintendo,flipper-pi" + - reg : should contain the PI registers location and length + +1.b.i) The "Flipper" interrupt controller node + + Represents the "Flipper" interrupt controller within the "Hollywood" chip. + The node for the "Flipper" interrupt controller must be placed under + the PI node. + + Required properties: + + - #interrupt-cells : <1> + - compatible : should be "nintendo,flipper-pic" + - interrupt-controller + +1.c) The Digital Signal Procesor (DSP) node + + Represents the digital signal processor interface, designed to offload + audio related tasks. + + Required properties: + + - compatible : should be "nintendo,hollywood-dsp","nintendo,flipper-dsp" + - reg : should contain the DSP registers location and length + - interrupts : should contain the DSP interrupt + +1.d) The Serial Interface (SI) node + + Represents the interface to the four single bit serial interfaces. + The SI is a proprietary serial interface used normally to control gamepads. + It's NOT a RS232-type interface. + + Required properties: + + - compatible : should be "nintendo,hollywood-si","nintendo,flipper-si" + - reg : should contain the SI registers location and length + - interrupts : should contain the SI interrupt + +1.e) The Audio Interface (AI) node + + Represents the interface to the external 16-bit stereo digital-to-analog + converter. + + Required properties: + + - compatible : should be "nintendo,hollywood-ai","nintendo,flipper-ai" + - reg : should contain the AI registers location and length + - interrupts : should contain the AI interrupt + +1.f) The External Interface (EXI) node + + Represents the multi-channel SPI-like interface. + + Required properties: + + - compatible : should be "nintendo,hollywood-exi","nintendo,flipper-exi" + - reg : should contain the EXI registers location and length + - interrupts : should contain the EXI interrupt + +1.g) The Open Host Controller Interface (OHCI) nodes + + Represent the USB 1.x Open Host Controller Interfaces. + + Required properties: + + - compatible : should be "nintendo,hollywood-usb-ohci","usb-ohci" + - reg : should contain the OHCI registers location and length + - interrupts : should contain the OHCI interrupt + +1.h) The Enhanced Host Controller Interface (EHCI) node + + Represents the USB 2.0 Enhanced Host Controller Interface. + + Required properties: + + - compatible : should be "nintendo,hollywood-usb-ehci","usb-ehci" + - reg : should contain the EHCI registers location and length + - interrupts : should contain the EHCI interrupt + +1.i) The Secure Digital Host Controller Interface (SDHCI) nodes + + Represent the Secure Digital Host Controller Interfaces. + + Required properties: + + - compatible : should be "nintendo,hollywood-sdhci","sdhci" + - reg : should contain the SDHCI registers location and length + - interrupts : should contain the SDHCI interrupt + +1.j) The Inter-Processsor Communication (IPC) node + + Represent the Inter-Processor Communication interface. This interface + enables communications between the Broadway and the Starlet processors. + + - compatible : should be "nintendo,hollywood-ipc" + - reg : should contain the IPC registers location and length + - interrupts : should contain the IPC interrupt + +1.k) The "Hollywood" interrupt controller node + + Represents the "Hollywood" interrupt controller within the + "Hollywood" chip. + + Required properties: + + - #interrupt-cells : <1> + - compatible : should be "nintendo,hollywood-pic" + - reg : should contain the controller registers location and length + - interrupt-controller + - interrupts : should contain the cascade interrupt of the "flipper" pic + - interrupt-parent: should contain the phandle of the "flipper" pic + +1.l) The General Purpose I/O (GPIO) controller node + + Represents the dual access 32 GPIO controller interface. + + Required properties: + + - #gpio-cells : <2> + - compatible : should be "nintendo,hollywood-gpio" + - reg : should contain the IPC registers location and length + - gpio-controller + +1.m) The control node + + Represents the control interface used to setup several miscellaneous + settings of the "Hollywood" chip like boot memory mappings, resets, + disk interface mode, etc. + + Required properties: + + - compatible : should be "nintendo,hollywood-control" + - reg : should contain the control registers location and length + +1.n) The Disk Interface (DI) node + + Represents the interface used to communicate with mass storage devices. + + Required properties: + + - compatible : should be "nintendo,hollywood-di" + - reg : should contain the DI registers location and length + - interrupts : should contain the DI interrupt + diff --git a/Documentation/devicetree/bindings/spi/fsl-spi.txt b/Documentation/devicetree/bindings/spi/fsl-spi.txt new file mode 100644 index 000000000000..777abd7399d5 --- /dev/null +++ b/Documentation/devicetree/bindings/spi/fsl-spi.txt @@ -0,0 +1,53 @@ +* SPI (Serial Peripheral Interface) + +Required properties: +- cell-index : QE SPI subblock index. + 0: QE subblock SPI1 + 1: QE subblock SPI2 +- compatible : should be "fsl,spi". +- mode : the SPI operation mode, it can be "cpu" or "cpu-qe". +- reg : Offset and length of the register set for the device +- interrupts : where a is the interrupt number and b is a + field that represents an encoding of the sense and level + information for the interrupt. This should be encoded based on + the information in section 2) depending on the type of interrupt + controller you have. +- interrupt-parent : the phandle for the interrupt controller that + services interrupts for this device. + +Optional properties: +- gpios : specifies the gpio pins to be used for chipselects. + The gpios will be referred to as reg = in the SPI child nodes. + If unspecified, a single SPI device without a chip select can be used. + +Example: + spi@4c0 { + cell-index = <0>; + compatible = "fsl,spi"; + reg = <4c0 40>; + interrupts = <82 0>; + interrupt-parent = <700>; + mode = "cpu"; + gpios = <&gpio 18 1 // device reg=<0> + &gpio 19 1>; // device reg=<1> + }; + + +* eSPI (Enhanced Serial Peripheral Interface) + +Required properties: +- compatible : should be "fsl,mpc8536-espi". +- reg : Offset and length of the register set for the device. +- interrupts : should contain eSPI interrupt, the device has one interrupt. +- fsl,espi-num-chipselects : the number of the chipselect signals. + +Example: + spi@110000 { + #address-cells = <1>; + #size-cells = <0>; + compatible = "fsl,mpc8536-espi"; + reg = <0x110000 0x1000>; + interrupts = <53 0x2>; + interrupt-parent = <&mpic>; + fsl,espi-num-chipselects = <4>; + }; diff --git a/Documentation/devicetree/bindings/spi/spi-bus.txt b/Documentation/devicetree/bindings/spi/spi-bus.txt new file mode 100644 index 000000000000..e782add2e457 --- /dev/null +++ b/Documentation/devicetree/bindings/spi/spi-bus.txt @@ -0,0 +1,57 @@ +SPI (Serial Peripheral Interface) busses + +SPI busses can be described with a node for the SPI master device +and a set of child nodes for each SPI slave on the bus. For this +discussion, it is assumed that the system's SPI controller is in +SPI master mode. This binding does not describe SPI controllers +in slave mode. + +The SPI master node requires the following properties: +- #address-cells - number of cells required to define a chip select + address on the SPI bus. +- #size-cells - should be zero. +- compatible - name of SPI bus controller following generic names + recommended practice. +No other properties are required in the SPI bus node. It is assumed +that a driver for an SPI bus device will understand that it is an SPI bus. +However, the binding does not attempt to define the specific method for +assigning chip select numbers. Since SPI chip select configuration is +flexible and non-standardized, it is left out of this binding with the +assumption that board specific platform code will be used to manage +chip selects. Individual drivers can define additional properties to +support describing the chip select layout. + +SPI slave nodes must be children of the SPI master node and can +contain the following properties. +- reg - (required) chip select address of device. +- compatible - (required) name of SPI device following generic names + recommended practice +- spi-max-frequency - (required) Maximum SPI clocking speed of device in Hz +- spi-cpol - (optional) Empty property indicating device requires + inverse clock polarity (CPOL) mode +- spi-cpha - (optional) Empty property indicating device requires + shifted clock phase (CPHA) mode +- spi-cs-high - (optional) Empty property indicating device requires + chip select active high + +SPI example for an MPC5200 SPI bus: + spi@f00 { + #address-cells = <1>; + #size-cells = <0>; + compatible = "fsl,mpc5200b-spi","fsl,mpc5200-spi"; + reg = <0xf00 0x20>; + interrupts = <2 13 0 2 14 0>; + interrupt-parent = <&mpc5200_pic>; + + ethernet-switch@0 { + compatible = "micrel,ks8995m"; + spi-max-frequency = <1000000>; + reg = <0>; + }; + + codec@1 { + compatible = "ti,tlv320aic26"; + spi-max-frequency = <100000>; + reg = <1>; + }; + }; diff --git a/Documentation/devicetree/bindings/usb/fsl-usb.txt b/Documentation/devicetree/bindings/usb/fsl-usb.txt new file mode 100644 index 000000000000..bd5723f0b67e --- /dev/null +++ b/Documentation/devicetree/bindings/usb/fsl-usb.txt @@ -0,0 +1,81 @@ +Freescale SOC USB controllers + +The device node for a USB controller that is part of a Freescale +SOC is as described in the document "Open Firmware Recommended +Practice : Universal Serial Bus" with the following modifications +and additions : + +Required properties : + - compatible : Should be "fsl-usb2-mph" for multi port host USB + controllers, or "fsl-usb2-dr" for dual role USB controllers + or "fsl,mpc5121-usb2-dr" for dual role USB controllers of MPC5121 + - phy_type : For multi port host USB controllers, should be one of + "ulpi", or "serial". For dual role USB controllers, should be + one of "ulpi", "utmi", "utmi_wide", or "serial". + - reg : Offset and length of the register set for the device + - port0 : boolean; if defined, indicates port0 is connected for + fsl-usb2-mph compatible controllers. Either this property or + "port1" (or both) must be defined for "fsl-usb2-mph" compatible + controllers. + - port1 : boolean; if defined, indicates port1 is connected for + fsl-usb2-mph compatible controllers. Either this property or + "port0" (or both) must be defined for "fsl-usb2-mph" compatible + controllers. + - dr_mode : indicates the working mode for "fsl-usb2-dr" compatible + controllers. Can be "host", "peripheral", or "otg". Default to + "host" if not defined for backward compatibility. + +Recommended properties : + - interrupts : where a is the interrupt number and b is a + field that represents an encoding of the sense and level + information for the interrupt. This should be encoded based on + the information in section 2) depending on the type of interrupt + controller you have. + - interrupt-parent : the phandle for the interrupt controller that + services interrupts for this device. + +Optional properties : + - fsl,invert-drvvbus : boolean; for MPC5121 USB0 only. Indicates the + port power polarity of internal PHY signal DRVVBUS is inverted. + - fsl,invert-pwr-fault : boolean; for MPC5121 USB0 only. Indicates + the PWR_FAULT signal polarity is inverted. + +Example multi port host USB controller device node : + usb@22000 { + compatible = "fsl-usb2-mph"; + reg = <22000 1000>; + #address-cells = <1>; + #size-cells = <0>; + interrupt-parent = <700>; + interrupts = <27 1>; + phy_type = "ulpi"; + port0; + port1; + }; + +Example dual role USB controller device node : + usb@23000 { + compatible = "fsl-usb2-dr"; + reg = <23000 1000>; + #address-cells = <1>; + #size-cells = <0>; + interrupt-parent = <700>; + interrupts = <26 1>; + dr_mode = "otg"; + phy = "ulpi"; + }; + +Example dual role USB controller device node for MPC5121ADS: + + usb@4000 { + compatible = "fsl,mpc5121-usb2-dr"; + reg = <0x4000 0x1000>; + #address-cells = <1>; + #size-cells = <0>; + interrupt-parent = < &ipic >; + interrupts = <44 0x8>; + dr_mode = "otg"; + phy_type = "utmi_wide"; + fsl,invert-drvvbus; + fsl,invert-pwr-fault; + }; diff --git a/Documentation/devicetree/bindings/usb/usb-ehci.txt b/Documentation/devicetree/bindings/usb/usb-ehci.txt new file mode 100644 index 000000000000..fa18612f757b --- /dev/null +++ b/Documentation/devicetree/bindings/usb/usb-ehci.txt @@ -0,0 +1,25 @@ +USB EHCI controllers + +Required properties: + - compatible : should be "usb-ehci". + - reg : should contain at least address and length of the standard EHCI + register set for the device. Optional platform-dependent registers + (debug-port or other) can be also specified here, but only after + definition of standard EHCI registers. + - interrupts : one EHCI interrupt should be described here. +If device registers are implemented in big endian mode, the device +node should have "big-endian-regs" property. +If controller implementation operates with big endian descriptors, +"big-endian-desc" property should be specified. +If both big endian registers and descriptors are used by the controller +implementation, "big-endian" property can be specified instead of having +both "big-endian-regs" and "big-endian-desc". + +Example (Sequoia 440EPx): + ehci@e0000300 { + compatible = "ibm,usb-ehci-440epx", "usb-ehci"; + interrupt-parent = <&UIC0>; + interrupts = <1a 4>; + reg = <0 e0000300 90 0 e0000390 70>; + big-endian; + }; diff --git a/Documentation/devicetree/bindings/xilinx.txt b/Documentation/devicetree/bindings/xilinx.txt new file mode 100644 index 000000000000..299d0923537b --- /dev/null +++ b/Documentation/devicetree/bindings/xilinx.txt @@ -0,0 +1,306 @@ + d) Xilinx IP cores + + The Xilinx EDK toolchain ships with a set of IP cores (devices) for use + in Xilinx Spartan and Virtex FPGAs. The devices cover the whole range + of standard device types (network, serial, etc.) and miscellaneous + devices (gpio, LCD, spi, etc). Also, since these devices are + implemented within the fpga fabric every instance of the device can be + synthesised with different options that change the behaviour. + + Each IP-core has a set of parameters which the FPGA designer can use to + control how the core is synthesized. Historically, the EDK tool would + extract the device parameters relevant to device drivers and copy them + into an 'xparameters.h' in the form of #define symbols. This tells the + device drivers how the IP cores are configured, but it requires the kernel + to be recompiled every time the FPGA bitstream is resynthesized. + + The new approach is to export the parameters into the device tree and + generate a new device tree each time the FPGA bitstream changes. The + parameters which used to be exported as #defines will now become + properties of the device node. In general, device nodes for IP-cores + will take the following form: + + (name): (generic-name)@(base-address) { + compatible = "xlnx,(ip-core-name)-(HW_VER)" + [, (list of compatible devices), ...]; + reg = <(baseaddr) (size)>; + interrupt-parent = <&interrupt-controller-phandle>; + interrupts = < ... >; + xlnx,(parameter1) = "(string-value)"; + xlnx,(parameter2) = <(int-value)>; + }; + + (generic-name): an open firmware-style name that describes the + generic class of device. Preferably, this is one word, such + as 'serial' or 'ethernet'. + (ip-core-name): the name of the ip block (given after the BEGIN + directive in system.mhs). Should be in lowercase + and all underscores '_' converted to dashes '-'. + (name): is derived from the "PARAMETER INSTANCE" value. + (parameter#): C_* parameters from system.mhs. The C_ prefix is + dropped from the parameter name, the name is converted + to lowercase and all underscore '_' characters are + converted to dashes '-'. + (baseaddr): the baseaddr parameter value (often named C_BASEADDR). + (HW_VER): from the HW_VER parameter. + (size): the address range size (often C_HIGHADDR - C_BASEADDR + 1). + + Typically, the compatible list will include the exact IP core version + followed by an older IP core version which implements the same + interface or any other device with the same interface. + + 'reg', 'interrupt-parent' and 'interrupts' are all optional properties. + + For example, the following block from system.mhs: + + BEGIN opb_uartlite + PARAMETER INSTANCE = opb_uartlite_0 + PARAMETER HW_VER = 1.00.b + PARAMETER C_BAUDRATE = 115200 + PARAMETER C_DATA_BITS = 8 + PARAMETER C_ODD_PARITY = 0 + PARAMETER C_USE_PARITY = 0 + PARAMETER C_CLK_FREQ = 50000000 + PARAMETER C_BASEADDR = 0xEC100000 + PARAMETER C_HIGHADDR = 0xEC10FFFF + BUS_INTERFACE SOPB = opb_7 + PORT OPB_Clk = CLK_50MHz + PORT Interrupt = opb_uartlite_0_Interrupt + PORT RX = opb_uartlite_0_RX + PORT TX = opb_uartlite_0_TX + PORT OPB_Rst = sys_bus_reset_0 + END + + becomes the following device tree node: + + opb_uartlite_0: serial@ec100000 { + device_type = "serial"; + compatible = "xlnx,opb-uartlite-1.00.b"; + reg = ; + interrupt-parent = <&opb_intc_0>; + interrupts = <1 0>; // got this from the opb_intc parameters + current-speed = ; // standard serial device prop + clock-frequency = ; // standard serial device prop + xlnx,data-bits = <8>; + xlnx,odd-parity = <0>; + xlnx,use-parity = <0>; + }; + + Some IP cores actually implement 2 or more logical devices. In + this case, the device should still describe the whole IP core with + a single node and add a child node for each logical device. The + ranges property can be used to translate from parent IP-core to the + registers of each device. In addition, the parent node should be + compatible with the bus type 'xlnx,compound', and should contain + #address-cells and #size-cells, as with any other bus. (Note: this + makes the assumption that both logical devices have the same bus + binding. If this is not true, then separate nodes should be used + for each logical device). The 'cell-index' property can be used to + enumerate logical devices within an IP core. For example, the + following is the system.mhs entry for the dual ps2 controller found + on the ml403 reference design. + + BEGIN opb_ps2_dual_ref + PARAMETER INSTANCE = opb_ps2_dual_ref_0 + PARAMETER HW_VER = 1.00.a + PARAMETER C_BASEADDR = 0xA9000000 + PARAMETER C_HIGHADDR = 0xA9001FFF + BUS_INTERFACE SOPB = opb_v20_0 + PORT Sys_Intr1 = ps2_1_intr + PORT Sys_Intr2 = ps2_2_intr + PORT Clkin1 = ps2_clk_rx_1 + PORT Clkin2 = ps2_clk_rx_2 + PORT Clkpd1 = ps2_clk_tx_1 + PORT Clkpd2 = ps2_clk_tx_2 + PORT Rx1 = ps2_d_rx_1 + PORT Rx2 = ps2_d_rx_2 + PORT Txpd1 = ps2_d_tx_1 + PORT Txpd2 = ps2_d_tx_2 + END + + It would result in the following device tree nodes: + + opb_ps2_dual_ref_0: opb-ps2-dual-ref@a9000000 { + #address-cells = <1>; + #size-cells = <1>; + compatible = "xlnx,compound"; + ranges = <0 a9000000 2000>; + // If this device had extra parameters, then they would + // go here. + ps2@0 { + compatible = "xlnx,opb-ps2-dual-ref-1.00.a"; + reg = <0 40>; + interrupt-parent = <&opb_intc_0>; + interrupts = <3 0>; + cell-index = <0>; + }; + ps2@1000 { + compatible = "xlnx,opb-ps2-dual-ref-1.00.a"; + reg = <1000 40>; + interrupt-parent = <&opb_intc_0>; + interrupts = <3 0>; + cell-index = <0>; + }; + }; + + Also, the system.mhs file defines bus attachments from the processor + to the devices. The device tree structure should reflect the bus + attachments. Again an example; this system.mhs fragment: + + BEGIN ppc405_virtex4 + PARAMETER INSTANCE = ppc405_0 + PARAMETER HW_VER = 1.01.a + BUS_INTERFACE DPLB = plb_v34_0 + BUS_INTERFACE IPLB = plb_v34_0 + END + + BEGIN opb_intc + PARAMETER INSTANCE = opb_intc_0 + PARAMETER HW_VER = 1.00.c + PARAMETER C_BASEADDR = 0xD1000FC0 + PARAMETER C_HIGHADDR = 0xD1000FDF + BUS_INTERFACE SOPB = opb_v20_0 + END + + BEGIN opb_uart16550 + PARAMETER INSTANCE = opb_uart16550_0 + PARAMETER HW_VER = 1.00.d + PARAMETER C_BASEADDR = 0xa0000000 + PARAMETER C_HIGHADDR = 0xa0001FFF + BUS_INTERFACE SOPB = opb_v20_0 + END + + BEGIN plb_v34 + PARAMETER INSTANCE = plb_v34_0 + PARAMETER HW_VER = 1.02.a + END + + BEGIN plb_bram_if_cntlr + PARAMETER INSTANCE = plb_bram_if_cntlr_0 + PARAMETER HW_VER = 1.00.b + PARAMETER C_BASEADDR = 0xFFFF0000 + PARAMETER C_HIGHADDR = 0xFFFFFFFF + BUS_INTERFACE SPLB = plb_v34_0 + END + + BEGIN plb2opb_bridge + PARAMETER INSTANCE = plb2opb_bridge_0 + PARAMETER HW_VER = 1.01.a + PARAMETER C_RNG0_BASEADDR = 0x20000000 + PARAMETER C_RNG0_HIGHADDR = 0x3FFFFFFF + PARAMETER C_RNG1_BASEADDR = 0x60000000 + PARAMETER C_RNG1_HIGHADDR = 0x7FFFFFFF + PARAMETER C_RNG2_BASEADDR = 0x80000000 + PARAMETER C_RNG2_HIGHADDR = 0xBFFFFFFF + PARAMETER C_RNG3_BASEADDR = 0xC0000000 + PARAMETER C_RNG3_HIGHADDR = 0xDFFFFFFF + BUS_INTERFACE SPLB = plb_v34_0 + BUS_INTERFACE MOPB = opb_v20_0 + END + + Gives this device tree (some properties removed for clarity): + + plb@0 { + #address-cells = <1>; + #size-cells = <1>; + compatible = "xlnx,plb-v34-1.02.a"; + device_type = "ibm,plb"; + ranges; // 1:1 translation + + plb_bram_if_cntrl_0: bram@ffff0000 { + reg = ; + } + + opb@20000000 { + #address-cells = <1>; + #size-cells = <1>; + ranges = <20000000 20000000 20000000 + 60000000 60000000 20000000 + 80000000 80000000 40000000 + c0000000 c0000000 20000000>; + + opb_uart16550_0: serial@a0000000 { + reg = ; + }; + + opb_intc_0: interrupt-controller@d1000fc0 { + reg = ; + }; + }; + }; + + That covers the general approach to binding xilinx IP cores into the + device tree. The following are bindings for specific devices: + + i) Xilinx ML300 Framebuffer + + Simple framebuffer device from the ML300 reference design (also on the + ML403 reference design as well as others). + + Optional properties: + - resolution = : pixel resolution of framebuffer. Some + implementations use a different resolution. + Default is + - virt-resolution = : Size of framebuffer in memory. + Default is . + - rotate-display (empty) : rotate display 180 degrees. + + ii) Xilinx SystemACE + + The Xilinx SystemACE device is used to program FPGAs from an FPGA + bitstream stored on a CF card. It can also be used as a generic CF + interface device. + + Optional properties: + - 8-bit (empty) : Set this property for SystemACE in 8 bit mode + + iii) Xilinx EMAC and Xilinx TEMAC + + Xilinx Ethernet devices. In addition to general xilinx properties + listed above, nodes for these devices should include a phy-handle + property, and may include other common network device properties + like local-mac-address. + + iv) Xilinx Uartlite + + Xilinx uartlite devices are simple fixed speed serial ports. + + Required properties: + - current-speed : Baud rate of uartlite + + v) Xilinx hwicap + + Xilinx hwicap devices provide access to the configuration logic + of the FPGA through the Internal Configuration Access Port + (ICAP). The ICAP enables partial reconfiguration of the FPGA, + readback of the configuration information, and some control over + 'warm boots' of the FPGA fabric. + + Required properties: + - xlnx,family : The family of the FPGA, necessary since the + capabilities of the underlying ICAP hardware + differ between different families. May be + 'virtex2p', 'virtex4', or 'virtex5'. + + vi) Xilinx Uart 16550 + + Xilinx UART 16550 devices are very similar to the NS16550 but with + different register spacing and an offset from the base address. + + Required properties: + - clock-frequency : Frequency of the clock input + - reg-offset : A value of 3 is required + - reg-shift : A value of 2 is required + + vii) Xilinx USB Host controller + + The Xilinx USB host controller is EHCI compatible but with a different + base address for the EHCI registers, and it is always a big-endian + USB Host controller. The hardware can be configured as high speed only, + or high speed/full speed hybrid. + + Required properties: + - xlnx,support-usb-fs: A value 0 means the core is built as high speed + only. A value 1 means the core also supports + full speed devices. + diff --git a/Documentation/devicetree/booting-without-of.txt b/Documentation/devicetree/booting-without-of.txt new file mode 100644 index 000000000000..7400d7555dc3 --- /dev/null +++ b/Documentation/devicetree/booting-without-of.txt @@ -0,0 +1,1447 @@ + Booting the Linux/ppc kernel without Open Firmware + -------------------------------------------------- + +(c) 2005 Benjamin Herrenschmidt , + IBM Corp. +(c) 2005 Becky Bruce , + Freescale Semiconductor, FSL SOC and 32-bit additions +(c) 2006 MontaVista Software, Inc. + Flash chip node definition + +Table of Contents +================= + + I - Introduction + 1) Entry point for arch/powerpc + 2) Board support + + II - The DT block format + 1) Header + 2) Device tree generalities + 3) Device tree "structure" block + 4) Device tree "strings" block + + III - Required content of the device tree + 1) Note about cells and address representation + 2) Note about "compatible" properties + 3) Note about "name" properties + 4) Note about node and property names and character set + 5) Required nodes and properties + a) The root node + b) The /cpus node + c) The /cpus/* nodes + d) the /memory node(s) + e) The /chosen node + f) the /soc node + + IV - "dtc", the device tree compiler + + V - Recommendations for a bootloader + + VI - System-on-a-chip devices and nodes + 1) Defining child nodes of an SOC + 2) Representing devices without a current OF specification + a) PHY nodes + b) Interrupt controllers + c) 4xx/Axon EMAC ethernet nodes + d) Xilinx IP cores + e) USB EHCI controllers + f) MDIO on GPIOs + g) SPI busses + + VII - Specifying interrupt information for devices + 1) interrupts property + 2) interrupt-parent property + 3) OpenPIC Interrupt Controllers + 4) ISA Interrupt Controllers + + VIII - Specifying device power management information (sleep property) + + Appendix A - Sample SOC node for MPC8540 + + +Revision Information +==================== + + May 18, 2005: Rev 0.1 - Initial draft, no chapter III yet. + + May 19, 2005: Rev 0.2 - Add chapter III and bits & pieces here or + clarifies the fact that a lot of things are + optional, the kernel only requires a very + small device tree, though it is encouraged + to provide an as complete one as possible. + + May 24, 2005: Rev 0.3 - Precise that DT block has to be in RAM + - Misc fixes + - Define version 3 and new format version 16 + for the DT block (version 16 needs kernel + patches, will be fwd separately). + String block now has a size, and full path + is replaced by unit name for more + compactness. + linux,phandle is made optional, only nodes + that are referenced by other nodes need it. + "name" property is now automatically + deduced from the unit name + + June 1, 2005: Rev 0.4 - Correct confusion between OF_DT_END and + OF_DT_END_NODE in structure definition. + - Change version 16 format to always align + property data to 4 bytes. Since tokens are + already aligned, that means no specific + required alignment between property size + and property data. The old style variable + alignment would make it impossible to do + "simple" insertion of properties using + memmove (thanks Milton for + noticing). Updated kernel patch as well + - Correct a few more alignment constraints + - Add a chapter about the device-tree + compiler and the textural representation of + the tree that can be "compiled" by dtc. + + November 21, 2005: Rev 0.5 + - Additions/generalizations for 32-bit + - Changed to reflect the new arch/powerpc + structure + - Added chapter VI + + + ToDo: + - Add some definitions of interrupt tree (simple/complex) + - Add some definitions for PCI host bridges + - Add some common address format examples + - Add definitions for standard properties and "compatible" + names for cells that are not already defined by the existing + OF spec. + - Compare FSL SOC use of PCI to standard and make sure no new + node definition required. + - Add more information about node definitions for SOC devices + that currently have no standard, like the FSL CPM. + + +I - Introduction +================ + +During the recent development of the Linux/ppc64 kernel, and more +specifically, the addition of new platform types outside of the old +IBM pSeries/iSeries pair, it was decided to enforce some strict rules +regarding the kernel entry and bootloader <-> kernel interfaces, in +order to avoid the degeneration that had become the ppc32 kernel entry +point and the way a new platform should be added to the kernel. The +legacy iSeries platform breaks those rules as it predates this scheme, +but no new board support will be accepted in the main tree that +doesn't follow them properly. In addition, since the advent of the +arch/powerpc merged architecture for ppc32 and ppc64, new 32-bit +platforms and 32-bit platforms which move into arch/powerpc will be +required to use these rules as well. + +The main requirement that will be defined in more detail below is +the presence of a device-tree whose format is defined after Open +Firmware specification. However, in order to make life easier +to embedded board vendors, the kernel doesn't require the device-tree +to represent every device in the system and only requires some nodes +and properties to be present. This will be described in detail in +section III, but, for example, the kernel does not require you to +create a node for every PCI device in the system. It is a requirement +to have a node for PCI host bridges in order to provide interrupt +routing informations and memory/IO ranges, among others. It is also +recommended to define nodes for on chip devices and other busses that +don't specifically fit in an existing OF specification. This creates a +great flexibility in the way the kernel can then probe those and match +drivers to device, without having to hard code all sorts of tables. It +also makes it more flexible for board vendors to do minor hardware +upgrades without significantly impacting the kernel code or cluttering +it with special cases. + + +1) Entry point for arch/powerpc +------------------------------- + + There is one and one single entry point to the kernel, at the start + of the kernel image. That entry point supports two calling + conventions: + + a) Boot from Open Firmware. If your firmware is compatible + with Open Firmware (IEEE 1275) or provides an OF compatible + client interface API (support for "interpret" callback of + forth words isn't required), you can enter the kernel with: + + r5 : OF callback pointer as defined by IEEE 1275 + bindings to powerpc. Only the 32-bit client interface + is currently supported + + r3, r4 : address & length of an initrd if any or 0 + + The MMU is either on or off; the kernel will run the + trampoline located in arch/powerpc/kernel/prom_init.c to + extract the device-tree and other information from open + firmware and build a flattened device-tree as described + in b). prom_init() will then re-enter the kernel using + the second method. This trampoline code runs in the + context of the firmware, which is supposed to handle all + exceptions during that time. + + b) Direct entry with a flattened device-tree block. This entry + point is called by a) after the OF trampoline and can also be + called directly by a bootloader that does not support the Open + Firmware client interface. It is also used by "kexec" to + implement "hot" booting of a new kernel from a previous + running one. This method is what I will describe in more + details in this document, as method a) is simply standard Open + Firmware, and thus should be implemented according to the + various standard documents defining it and its binding to the + PowerPC platform. The entry point definition then becomes: + + r3 : physical pointer to the device-tree block + (defined in chapter II) in RAM + + r4 : physical pointer to the kernel itself. This is + used by the assembly code to properly disable the MMU + in case you are entering the kernel with MMU enabled + and a non-1:1 mapping. + + r5 : NULL (as to differentiate with method a) + + Note about SMP entry: Either your firmware puts your other + CPUs in some sleep loop or spin loop in ROM where you can get + them out via a soft reset or some other means, in which case + you don't need to care, or you'll have to enter the kernel + with all CPUs. The way to do that with method b) will be + described in a later revision of this document. + + +2) Board support +---------------- + +64-bit kernels: + + Board supports (platforms) are not exclusive config options. An + arbitrary set of board supports can be built in a single kernel + image. The kernel will "know" what set of functions to use for a + given platform based on the content of the device-tree. Thus, you + should: + + a) add your platform support as a _boolean_ option in + arch/powerpc/Kconfig, following the example of PPC_PSERIES, + PPC_PMAC and PPC_MAPLE. The later is probably a good + example of a board support to start from. + + b) create your main platform file as + "arch/powerpc/platforms/myplatform/myboard_setup.c" and add it + to the Makefile under the condition of your CONFIG_ + option. This file will define a structure of type "ppc_md" + containing the various callbacks that the generic code will + use to get to your platform specific code + + c) Add a reference to your "ppc_md" structure in the + "machines" table in arch/powerpc/kernel/setup_64.c if you are + a 64-bit platform. + + d) request and get assigned a platform number (see PLATFORM_* + constants in arch/powerpc/include/asm/processor.h + +32-bit embedded kernels: + + Currently, board support is essentially an exclusive config option. + The kernel is configured for a single platform. Part of the reason + for this is to keep kernels on embedded systems small and efficient; + part of this is due to the fact the code is already that way. In the + future, a kernel may support multiple platforms, but only if the + platforms feature the same core architecture. A single kernel build + cannot support both configurations with Book E and configurations + with classic Powerpc architectures. + + 32-bit embedded platforms that are moved into arch/powerpc using a + flattened device tree should adopt the merged tree practice of + setting ppc_md up dynamically, even though the kernel is currently + built with support for only a single platform at a time. This allows + unification of the setup code, and will make it easier to go to a + multiple-platform-support model in the future. + +NOTE: I believe the above will be true once Ben's done with the merge +of the boot sequences.... someone speak up if this is wrong! + + To add a 32-bit embedded platform support, follow the instructions + for 64-bit platforms above, with the exception that the Kconfig + option should be set up such that the kernel builds exclusively for + the platform selected. The processor type for the platform should + enable another config option to select the specific board + supported. + +NOTE: If Ben doesn't merge the setup files, may need to change this to +point to setup_32.c + + + I will describe later the boot process and various callbacks that + your platform should implement. + + +II - The DT block format +======================== + + +This chapter defines the actual format of the flattened device-tree +passed to the kernel. The actual content of it and kernel requirements +are described later. You can find example of code manipulating that +format in various places, including arch/powerpc/kernel/prom_init.c +which will generate a flattened device-tree from the Open Firmware +representation, or the fs2dt utility which is part of the kexec tools +which will generate one from a filesystem representation. It is +expected that a bootloader like uboot provides a bit more support, +that will be discussed later as well. + +Note: The block has to be in main memory. It has to be accessible in +both real mode and virtual mode with no mapping other than main +memory. If you are writing a simple flash bootloader, it should copy +the block to RAM before passing it to the kernel. + + +1) Header +--------- + + The kernel is entered with r3 pointing to an area of memory that is + roughly described in arch/powerpc/include/asm/prom.h by the structure + boot_param_header: + +struct boot_param_header { + u32 magic; /* magic word OF_DT_HEADER */ + u32 totalsize; /* total size of DT block */ + u32 off_dt_struct; /* offset to structure */ + u32 off_dt_strings; /* offset to strings */ + u32 off_mem_rsvmap; /* offset to memory reserve map + */ + u32 version; /* format version */ + u32 last_comp_version; /* last compatible version */ + + /* version 2 fields below */ + u32 boot_cpuid_phys; /* Which physical CPU id we're + booting on */ + /* version 3 fields below */ + u32 size_dt_strings; /* size of the strings block */ + + /* version 17 fields below */ + u32 size_dt_struct; /* size of the DT structure block */ +}; + + Along with the constants: + +/* Definitions used by the flattened device tree */ +#define OF_DT_HEADER 0xd00dfeed /* 4: version, + 4: total size */ +#define OF_DT_BEGIN_NODE 0x1 /* Start node: full name + */ +#define OF_DT_END_NODE 0x2 /* End node */ +#define OF_DT_PROP 0x3 /* Property: name off, + size, content */ +#define OF_DT_END 0x9 + + All values in this header are in big endian format, the various + fields in this header are defined more precisely below. All + "offset" values are in bytes from the start of the header; that is + from the value of r3. + + - magic + + This is a magic value that "marks" the beginning of the + device-tree block header. It contains the value 0xd00dfeed and is + defined by the constant OF_DT_HEADER + + - totalsize + + This is the total size of the DT block including the header. The + "DT" block should enclose all data structures defined in this + chapter (who are pointed to by offsets in this header). That is, + the device-tree structure, strings, and the memory reserve map. + + - off_dt_struct + + This is an offset from the beginning of the header to the start + of the "structure" part the device tree. (see 2) device tree) + + - off_dt_strings + + This is an offset from the beginning of the header to the start + of the "strings" part of the device-tree + + - off_mem_rsvmap + + This is an offset from the beginning of the header to the start + of the reserved memory map. This map is a list of pairs of 64- + bit integers. Each pair is a physical address and a size. The + list is terminated by an entry of size 0. This map provides the + kernel with a list of physical memory areas that are "reserved" + and thus not to be used for memory allocations, especially during + early initialization. The kernel needs to allocate memory during + boot for things like un-flattening the device-tree, allocating an + MMU hash table, etc... Those allocations must be done in such a + way to avoid overriding critical things like, on Open Firmware + capable machines, the RTAS instance, or on some pSeries, the TCE + tables used for the iommu. Typically, the reserve map should + contain _at least_ this DT block itself (header,total_size). If + you are passing an initrd to the kernel, you should reserve it as + well. You do not need to reserve the kernel image itself. The map + should be 64-bit aligned. + + - version + + This is the version of this structure. Version 1 stops + here. Version 2 adds an additional field boot_cpuid_phys. + Version 3 adds the size of the strings block, allowing the kernel + to reallocate it easily at boot and free up the unused flattened + structure after expansion. Version 16 introduces a new more + "compact" format for the tree itself that is however not backward + compatible. Version 17 adds an additional field, size_dt_struct, + allowing it to be reallocated or moved more easily (this is + particularly useful for bootloaders which need to make + adjustments to a device tree based on probed information). You + should always generate a structure of the highest version defined + at the time of your implementation. Currently that is version 17, + unless you explicitly aim at being backward compatible. + + - last_comp_version + + Last compatible version. This indicates down to what version of + the DT block you are backward compatible. For example, version 2 + is backward compatible with version 1 (that is, a kernel build + for version 1 will be able to boot with a version 2 format). You + should put a 1 in this field if you generate a device tree of + version 1 to 3, or 16 if you generate a tree of version 16 or 17 + using the new unit name format. + + - boot_cpuid_phys + + This field only exist on version 2 headers. It indicate which + physical CPU ID is calling the kernel entry point. This is used, + among others, by kexec. If you are on an SMP system, this value + should match the content of the "reg" property of the CPU node in + the device-tree corresponding to the CPU calling the kernel entry + point (see further chapters for more informations on the required + device-tree contents) + + - size_dt_strings + + This field only exists on version 3 and later headers. It + gives the size of the "strings" section of the device tree (which + starts at the offset given by off_dt_strings). + + - size_dt_struct + + This field only exists on version 17 and later headers. It gives + the size of the "structure" section of the device tree (which + starts at the offset given by off_dt_struct). + + So the typical layout of a DT block (though the various parts don't + need to be in that order) looks like this (addresses go from top to + bottom): + + + ------------------------------ + r3 -> | struct boot_param_header | + ------------------------------ + | (alignment gap) (*) | + ------------------------------ + | memory reserve map | + ------------------------------ + | (alignment gap) | + ------------------------------ + | | + | device-tree structure | + | | + ------------------------------ + | (alignment gap) | + ------------------------------ + | | + | device-tree strings | + | | + -----> ------------------------------ + | + | + --- (r3 + totalsize) + + (*) The alignment gaps are not necessarily present; their presence + and size are dependent on the various alignment requirements of + the individual data blocks. + + +2) Device tree generalities +--------------------------- + +This device-tree itself is separated in two different blocks, a +structure block and a strings block. Both need to be aligned to a 4 +byte boundary. + +First, let's quickly describe the device-tree concept before detailing +the storage format. This chapter does _not_ describe the detail of the +required types of nodes & properties for the kernel, this is done +later in chapter III. + +The device-tree layout is strongly inherited from the definition of +the Open Firmware IEEE 1275 device-tree. It's basically a tree of +nodes, each node having two or more named properties. A property can +have a value or not. + +It is a tree, so each node has one and only one parent except for the +root node who has no parent. + +A node has 2 names. The actual node name is generally contained in a +property of type "name" in the node property list whose value is a +zero terminated string and is mandatory for version 1 to 3 of the +format definition (as it is in Open Firmware). Version 16 makes it +optional as it can generate it from the unit name defined below. + +There is also a "unit name" that is used to differentiate nodes with +the same name at the same level, it is usually made of the node +names, the "@" sign, and a "unit address", which definition is +specific to the bus type the node sits on. + +The unit name doesn't exist as a property per-se but is included in +the device-tree structure. It is typically used to represent "path" in +the device-tree. More details about the actual format of these will be +below. + +The kernel powerpc generic code does not make any formal use of the +unit address (though some board support code may do) so the only real +requirement here for the unit address is to ensure uniqueness of +the node unit name at a given level of the tree. Nodes with no notion +of address and no possible sibling of the same name (like /memory or +/cpus) may omit the unit address in the context of this specification, +or use the "@0" default unit address. The unit name is used to define +a node "full path", which is the concatenation of all parent node +unit names separated with "/". + +The root node doesn't have a defined name, and isn't required to have +a name property either if you are using version 3 or earlier of the +format. It also has no unit address (no @ symbol followed by a unit +address). The root node unit name is thus an empty string. The full +path to the root node is "/". + +Every node which actually represents an actual device (that is, a node +which isn't only a virtual "container" for more nodes, like "/cpus" +is) is also required to have a "device_type" property indicating the +type of node . + +Finally, every node that can be referenced from a property in another +node is required to have a "linux,phandle" property. Real open +firmware implementations provide a unique "phandle" value for every +node that the "prom_init()" trampoline code turns into +"linux,phandle" properties. However, this is made optional if the +flattened device tree is used directly. An example of a node +referencing another node via "phandle" is when laying out the +interrupt tree which will be described in a further version of this +document. + +This "linux, phandle" property is a 32-bit value that uniquely +identifies a node. You are free to use whatever values or system of +values, internal pointers, or whatever to generate these, the only +requirement is that every node for which you provide that property has +a unique value for it. + +Here is an example of a simple device-tree. In this example, an "o" +designates a node followed by the node unit name. Properties are +presented with their name followed by their content. "content" +represents an ASCII string (zero terminated) value, while +represents a 32-bit hexadecimal value. The various nodes in this +example will be discussed in a later chapter. At this point, it is +only meant to give you a idea of what a device-tree looks like. I have +purposefully kept the "name" and "linux,phandle" properties which +aren't necessary in order to give you a better idea of what the tree +looks like in practice. + + / o device-tree + |- name = "device-tree" + |- model = "MyBoardName" + |- compatible = "MyBoardFamilyName" + |- #address-cells = <2> + |- #size-cells = <2> + |- linux,phandle = <0> + | + o cpus + | | - name = "cpus" + | | - linux,phandle = <1> + | | - #address-cells = <1> + | | - #size-cells = <0> + | | + | o PowerPC,970@0 + | |- name = "PowerPC,970" + | |- device_type = "cpu" + | |- reg = <0> + | |- clock-frequency = <5f5e1000> + | |- 64-bit + | |- linux,phandle = <2> + | + o memory@0 + | |- name = "memory" + | |- device_type = "memory" + | |- reg = <00000000 00000000 00000000 20000000> + | |- linux,phandle = <3> + | + o chosen + |- name = "chosen" + |- bootargs = "root=/dev/sda2" + |- linux,phandle = <4> + +This tree is almost a minimal tree. It pretty much contains the +minimal set of required nodes and properties to boot a linux kernel; +that is, some basic model informations at the root, the CPUs, and the +physical memory layout. It also includes misc information passed +through /chosen, like in this example, the platform type (mandatory) +and the kernel command line arguments (optional). + +The /cpus/PowerPC,970@0/64-bit property is an example of a +property without a value. All other properties have a value. The +significance of the #address-cells and #size-cells properties will be +explained in chapter IV which defines precisely the required nodes and +properties and their content. + + +3) Device tree "structure" block + +The structure of the device tree is a linearized tree structure. The +"OF_DT_BEGIN_NODE" token starts a new node, and the "OF_DT_END_NODE" +ends that node definition. Child nodes are simply defined before +"OF_DT_END_NODE" (that is nodes within the node). A 'token' is a 32 +bit value. The tree has to be "finished" with a OF_DT_END token + +Here's the basic structure of a single node: + + * token OF_DT_BEGIN_NODE (that is 0x00000001) + * for version 1 to 3, this is the node full path as a zero + terminated string, starting with "/". For version 16 and later, + this is the node unit name only (or an empty string for the + root node) + * [align gap to next 4 bytes boundary] + * for each property: + * token OF_DT_PROP (that is 0x00000003) + * 32-bit value of property value size in bytes (or 0 if no + value) + * 32-bit value of offset in string block of property name + * property value data if any + * [align gap to next 4 bytes boundary] + * [child nodes if any] + * token OF_DT_END_NODE (that is 0x00000002) + +So the node content can be summarized as a start token, a full path, +a list of properties, a list of child nodes, and an end token. Every +child node is a full node structure itself as defined above. + +NOTE: The above definition requires that all property definitions for +a particular node MUST precede any subnode definitions for that node. +Although the structure would not be ambiguous if properties and +subnodes were intermingled, the kernel parser requires that the +properties come first (up until at least 2.6.22). Any tools +manipulating a flattened tree must take care to preserve this +constraint. + +4) Device tree "strings" block + +In order to save space, property names, which are generally redundant, +are stored separately in the "strings" block. This block is simply the +whole bunch of zero terminated strings for all property names +concatenated together. The device-tree property definitions in the +structure block will contain offset values from the beginning of the +strings block. + + +III - Required content of the device tree +========================================= + +WARNING: All "linux,*" properties defined in this document apply only +to a flattened device-tree. If your platform uses a real +implementation of Open Firmware or an implementation compatible with +the Open Firmware client interface, those properties will be created +by the trampoline code in the kernel's prom_init() file. For example, +that's where you'll have to add code to detect your board model and +set the platform number. However, when using the flattened device-tree +entry point, there is no prom_init() pass, and thus you have to +provide those properties yourself. + + +1) Note about cells and address representation +---------------------------------------------- + +The general rule is documented in the various Open Firmware +documentations. If you choose to describe a bus with the device-tree +and there exist an OF bus binding, then you should follow the +specification. However, the kernel does not require every single +device or bus to be described by the device tree. + +In general, the format of an address for a device is defined by the +parent bus type, based on the #address-cells and #size-cells +properties. Note that the parent's parent definitions of #address-cells +and #size-cells are not inherited so every node with children must specify +them. The kernel requires the root node to have those properties defining +addresses format for devices directly mapped on the processor bus. + +Those 2 properties define 'cells' for representing an address and a +size. A "cell" is a 32-bit number. For example, if both contain 2 +like the example tree given above, then an address and a size are both +composed of 2 cells, and each is a 64-bit number (cells are +concatenated and expected to be in big endian format). Another example +is the way Apple firmware defines them, with 2 cells for an address +and one cell for a size. Most 32-bit implementations should define +#address-cells and #size-cells to 1, which represents a 32-bit value. +Some 32-bit processors allow for physical addresses greater than 32 +bits; these processors should define #address-cells as 2. + +"reg" properties are always a tuple of the type "address size" where +the number of cells of address and size is specified by the bus +#address-cells and #size-cells. When a bus supports various address +spaces and other flags relative to a given address allocation (like +prefetchable, etc...) those flags are usually added to the top level +bits of the physical address. For example, a PCI physical address is +made of 3 cells, the bottom two containing the actual address itself +while the top cell contains address space indication, flags, and pci +bus & device numbers. + +For busses that support dynamic allocation, it's the accepted practice +to then not provide the address in "reg" (keep it 0) though while +providing a flag indicating the address is dynamically allocated, and +then, to provide a separate "assigned-addresses" property that +contains the fully allocated addresses. See the PCI OF bindings for +details. + +In general, a simple bus with no address space bits and no dynamic +allocation is preferred if it reflects your hardware, as the existing +kernel address parsing functions will work out of the box. If you +define a bus type with a more complex address format, including things +like address space bits, you'll have to add a bus translator to the +prom_parse.c file of the recent kernels for your bus type. + +The "reg" property only defines addresses and sizes (if #size-cells is +non-0) within a given bus. In order to translate addresses upward +(that is into parent bus addresses, and possibly into CPU physical +addresses), all busses must contain a "ranges" property. If the +"ranges" property is missing at a given level, it's assumed that +translation isn't possible, i.e., the registers are not visible on the +parent bus. The format of the "ranges" property for a bus is a list +of: + + bus address, parent bus address, size + +"bus address" is in the format of the bus this bus node is defining, +that is, for a PCI bridge, it would be a PCI address. Thus, (bus +address, size) defines a range of addresses for child devices. "parent +bus address" is in the format of the parent bus of this bus. For +example, for a PCI host controller, that would be a CPU address. For a +PCI<->ISA bridge, that would be a PCI address. It defines the base +address in the parent bus where the beginning of that range is mapped. + +For a new 64-bit powerpc board, I recommend either the 2/2 format or +Apple's 2/1 format which is slightly more compact since sizes usually +fit in a single 32-bit word. New 32-bit powerpc boards should use a +1/1 format, unless the processor supports physical addresses greater +than 32-bits, in which case a 2/1 format is recommended. + +Alternatively, the "ranges" property may be empty, indicating that the +registers are visible on the parent bus using an identity mapping +translation. In other words, the parent bus address space is the same +as the child bus address space. + +2) Note about "compatible" properties +------------------------------------- + +These properties are optional, but recommended in devices and the root +node. The format of a "compatible" property is a list of concatenated +zero terminated strings. They allow a device to express its +compatibility with a family of similar devices, in some cases, +allowing a single driver to match against several devices regardless +of their actual names. + +3) Note about "name" properties +------------------------------- + +While earlier users of Open Firmware like OldWorld macintoshes tended +to use the actual device name for the "name" property, it's nowadays +considered a good practice to use a name that is closer to the device +class (often equal to device_type). For example, nowadays, ethernet +controllers are named "ethernet", an additional "model" property +defining precisely the chip type/model, and "compatible" property +defining the family in case a single driver can driver more than one +of these chips. However, the kernel doesn't generally put any +restriction on the "name" property; it is simply considered good +practice to follow the standard and its evolutions as closely as +possible. + +Note also that the new format version 16 makes the "name" property +optional. If it's absent for a node, then the node's unit name is then +used to reconstruct the name. That is, the part of the unit name +before the "@" sign is used (or the entire unit name if no "@" sign +is present). + +4) Note about node and property names and character set +------------------------------------------------------- + +While open firmware provides more flexible usage of 8859-1, this +specification enforces more strict rules. Nodes and properties should +be comprised only of ASCII characters 'a' to 'z', '0' to +'9', ',', '.', '_', '+', '#', '?', and '-'. Node names additionally +allow uppercase characters 'A' to 'Z' (property names should be +lowercase. The fact that vendors like Apple don't respect this rule is +irrelevant here). Additionally, node and property names should always +begin with a character in the range 'a' to 'z' (or 'A' to 'Z' for node +names). + +The maximum number of characters for both nodes and property names +is 31. In the case of node names, this is only the leftmost part of +a unit name (the pure "name" property), it doesn't include the unit +address which can extend beyond that limit. + + +5) Required nodes and properties +-------------------------------- + These are all that are currently required. However, it is strongly + recommended that you expose PCI host bridges as documented in the + PCI binding to open firmware, and your interrupt tree as documented + in OF interrupt tree specification. + + a) The root node + + The root node requires some properties to be present: + + - model : this is your board name/model + - #address-cells : address representation for "root" devices + - #size-cells: the size representation for "root" devices + - device_type : This property shouldn't be necessary. However, if + you decide to create a device_type for your root node, make sure it + is _not_ "chrp" unless your platform is a pSeries or PAPR compliant + one for 64-bit, or a CHRP-type machine for 32-bit as this will + matched by the kernel this way. + + Additionally, some recommended properties are: + + - compatible : the board "family" generally finds its way here, + for example, if you have 2 board models with a similar layout, + that typically get driven by the same platform code in the + kernel, you would use a different "model" property but put a + value in "compatible". The kernel doesn't directly use that + value but it is generally useful. + + The root node is also generally where you add additional properties + specific to your board like the serial number if any, that sort of + thing. It is recommended that if you add any "custom" property whose + name may clash with standard defined ones, you prefix them with your + vendor name and a comma. + + b) The /cpus node + + This node is the parent of all individual CPU nodes. It doesn't + have any specific requirements, though it's generally good practice + to have at least: + + #address-cells = <00000001> + #size-cells = <00000000> + + This defines that the "address" for a CPU is a single cell, and has + no meaningful size. This is not necessary but the kernel will assume + that format when reading the "reg" properties of a CPU node, see + below + + c) The /cpus/* nodes + + So under /cpus, you are supposed to create a node for every CPU on + the machine. There is no specific restriction on the name of the + CPU, though It's common practice to call it PowerPC,. For + example, Apple uses PowerPC,G5 while IBM uses PowerPC,970FX. + + Required properties: + + - device_type : has to be "cpu" + - reg : This is the physical CPU number, it's a single 32-bit cell + and is also used as-is as the unit number for constructing the + unit name in the full path. For example, with 2 CPUs, you would + have the full path: + /cpus/PowerPC,970FX@0 + /cpus/PowerPC,970FX@1 + (unit addresses do not require leading zeroes) + - d-cache-block-size : one cell, L1 data cache block size in bytes (*) + - i-cache-block-size : one cell, L1 instruction cache block size in + bytes + - d-cache-size : one cell, size of L1 data cache in bytes + - i-cache-size : one cell, size of L1 instruction cache in bytes + +(*) The cache "block" size is the size on which the cache management +instructions operate. Historically, this document used the cache +"line" size here which is incorrect. The kernel will prefer the cache +block size and will fallback to cache line size for backward +compatibility. + + Recommended properties: + + - timebase-frequency : a cell indicating the frequency of the + timebase in Hz. This is not directly used by the generic code, + but you are welcome to copy/paste the pSeries code for setting + the kernel timebase/decrementer calibration based on this + value. + - clock-frequency : a cell indicating the CPU core clock frequency + in Hz. A new property will be defined for 64-bit values, but if + your frequency is < 4Ghz, one cell is enough. Here as well as + for the above, the common code doesn't use that property, but + you are welcome to re-use the pSeries or Maple one. A future + kernel version might provide a common function for this. + - d-cache-line-size : one cell, L1 data cache line size in bytes + if different from the block size + - i-cache-line-size : one cell, L1 instruction cache line size in + bytes if different from the block size + + You are welcome to add any property you find relevant to your board, + like some information about the mechanism used to soft-reset the + CPUs. For example, Apple puts the GPIO number for CPU soft reset + lines in there as a "soft-reset" property since they start secondary + CPUs by soft-resetting them. + + + d) the /memory node(s) + + To define the physical memory layout of your board, you should + create one or more memory node(s). You can either create a single + node with all memory ranges in its reg property, or you can create + several nodes, as you wish. The unit address (@ part) used for the + full path is the address of the first range of memory defined by a + given node. If you use a single memory node, this will typically be + @0. + + Required properties: + + - device_type : has to be "memory" + - reg : This property contains all the physical memory ranges of + your board. It's a list of addresses/sizes concatenated + together, with the number of cells of each defined by the + #address-cells and #size-cells of the root node. For example, + with both of these properties being 2 like in the example given + earlier, a 970 based machine with 6Gb of RAM could typically + have a "reg" property here that looks like: + + 00000000 00000000 00000000 80000000 + 00000001 00000000 00000001 00000000 + + That is a range starting at 0 of 0x80000000 bytes and a range + starting at 0x100000000 and of 0x100000000 bytes. You can see + that there is no memory covering the IO hole between 2Gb and + 4Gb. Some vendors prefer splitting those ranges into smaller + segments, but the kernel doesn't care. + + e) The /chosen node + + This node is a bit "special". Normally, that's where open firmware + puts some variable environment information, like the arguments, or + the default input/output devices. + + This specification makes a few of these mandatory, but also defines + some linux-specific properties that would be normally constructed by + the prom_init() trampoline when booting with an OF client interface, + but that you have to provide yourself when using the flattened format. + + Recommended properties: + + - bootargs : This zero-terminated string is passed as the kernel + command line + - linux,stdout-path : This is the full path to your standard + console device if any. Typically, if you have serial devices on + your board, you may want to put the full path to the one set as + the default console in the firmware here, for the kernel to pick + it up as its own default console. If you look at the function + set_preferred_console() in arch/ppc64/kernel/setup.c, you'll see + that the kernel tries to find out the default console and has + knowledge of various types like 8250 serial ports. You may want + to extend this function to add your own. + + Note that u-boot creates and fills in the chosen node for platforms + that use it. + + (Note: a practice that is now obsolete was to include a property + under /chosen called interrupt-controller which had a phandle value + that pointed to the main interrupt controller) + + f) the /soc node + + This node is used to represent a system-on-a-chip (SOC) and must be + present if the processor is a SOC. The top-level soc node contains + information that is global to all devices on the SOC. The node name + should contain a unit address for the SOC, which is the base address + of the memory-mapped register set for the SOC. The name of an soc + node should start with "soc", and the remainder of the name should + represent the part number for the soc. For example, the MPC8540's + soc node would be called "soc8540". + + Required properties: + + - device_type : Should be "soc" + - ranges : Should be defined as specified in 1) to describe the + translation of SOC addresses for memory mapped SOC registers. + - bus-frequency: Contains the bus frequency for the SOC node. + Typically, the value of this field is filled in by the boot + loader. + + + Recommended properties: + + - reg : This property defines the address and size of the + memory-mapped registers that are used for the SOC node itself. + It does not include the child device registers - these will be + defined inside each child node. The address specified in the + "reg" property should match the unit address of the SOC node. + - #address-cells : Address representation for "soc" devices. The + format of this field may vary depending on whether or not the + device registers are memory mapped. For memory mapped + registers, this field represents the number of cells needed to + represent the address of the registers. For SOCs that do not + use MMIO, a special address format should be defined that + contains enough cells to represent the required information. + See 1) above for more details on defining #address-cells. + - #size-cells : Size representation for "soc" devices + - #interrupt-cells : Defines the width of cells used to represent + interrupts. Typically this value is <2>, which includes a + 32-bit number that represents the interrupt number, and a + 32-bit number that represents the interrupt sense and level. + This field is only needed if the SOC contains an interrupt + controller. + + The SOC node may contain child nodes for each SOC device that the + platform uses. Nodes should not be created for devices which exist + on the SOC but are not used by a particular platform. See chapter VI + for more information on how to specify devices that are part of a SOC. + + Example SOC node for the MPC8540: + + soc8540@e0000000 { + #address-cells = <1>; + #size-cells = <1>; + #interrupt-cells = <2>; + device_type = "soc"; + ranges = <00000000 e0000000 00100000> + reg = ; + bus-frequency = <0>; + } + + + +IV - "dtc", the device tree compiler +==================================== + + +dtc source code can be found at + + +WARNING: This version is still in early development stage; the +resulting device-tree "blobs" have not yet been validated with the +kernel. The current generated block lacks a useful reserve map (it will +be fixed to generate an empty one, it's up to the bootloader to fill +it up) among others. The error handling needs work, bugs are lurking, +etc... + +dtc basically takes a device-tree in a given format and outputs a +device-tree in another format. The currently supported formats are: + + Input formats: + ------------- + + - "dtb": "blob" format, that is a flattened device-tree block + with + header all in a binary blob. + - "dts": "source" format. This is a text file containing a + "source" for a device-tree. The format is defined later in this + chapter. + - "fs" format. This is a representation equivalent to the + output of /proc/device-tree, that is nodes are directories and + properties are files + + Output formats: + --------------- + + - "dtb": "blob" format + - "dts": "source" format + - "asm": assembly language file. This is a file that can be + sourced by gas to generate a device-tree "blob". That file can + then simply be added to your Makefile. Additionally, the + assembly file exports some symbols that can be used. + + +The syntax of the dtc tool is + + dtc [-I ] [-O ] + [-o output-filename] [-V output_version] input_filename + + +The "output_version" defines what version of the "blob" format will be +generated. Supported versions are 1,2,3 and 16. The default is +currently version 3 but that may change in the future to version 16. + +Additionally, dtc performs various sanity checks on the tree, like the +uniqueness of linux, phandle properties, validity of strings, etc... + +The format of the .dts "source" file is "C" like, supports C and C++ +style comments. + +/ { +} + +The above is the "device-tree" definition. It's the only statement +supported currently at the toplevel. + +/ { + property1 = "string_value"; /* define a property containing a 0 + * terminated string + */ + + property2 = <1234abcd>; /* define a property containing a + * numerical 32-bit value (hexadecimal) + */ + + property3 = <12345678 12345678 deadbeef>; + /* define a property containing 3 + * numerical 32-bit values (cells) in + * hexadecimal + */ + property4 = [0a 0b 0c 0d de ea ad be ef]; + /* define a property whose content is + * an arbitrary array of bytes + */ + + childnode@address { /* define a child node named "childnode" + * whose unit name is "childnode at + * address" + */ + + childprop = "hello\n"; /* define a property "childprop" of + * childnode (in this case, a string) + */ + }; +}; + +Nodes can contain other nodes etc... thus defining the hierarchical +structure of the tree. + +Strings support common escape sequences from C: "\n", "\t", "\r", +"\(octal value)", "\x(hex value)". + +It is also suggested that you pipe your source file through cpp (gcc +preprocessor) so you can use #include's, #define for constants, etc... + +Finally, various options are planned but not yet implemented, like +automatic generation of phandles, labels (exported to the asm file so +you can point to a property content and change it easily from whatever +you link the device-tree with), label or path instead of numeric value +in some cells to "point" to a node (replaced by a phandle at compile +time), export of reserve map address to the asm file, ability to +specify reserve map content at compile time, etc... + +We may provide a .h include file with common definitions of that +proves useful for some properties (like building PCI properties or +interrupt maps) though it may be better to add a notion of struct +definitions to the compiler... + + +V - Recommendations for a bootloader +==================================== + + +Here are some various ideas/recommendations that have been proposed +while all this has been defined and implemented. + + - The bootloader may want to be able to use the device-tree itself + and may want to manipulate it (to add/edit some properties, + like physical memory size or kernel arguments). At this point, 2 + choices can be made. Either the bootloader works directly on the + flattened format, or the bootloader has its own internal tree + representation with pointers (similar to the kernel one) and + re-flattens the tree when booting the kernel. The former is a bit + more difficult to edit/modify, the later requires probably a bit + more code to handle the tree structure. Note that the structure + format has been designed so it's relatively easy to "insert" + properties or nodes or delete them by just memmoving things + around. It contains no internal offsets or pointers for this + purpose. + + - An example of code for iterating nodes & retrieving properties + directly from the flattened tree format can be found in the kernel + file arch/ppc64/kernel/prom.c, look at scan_flat_dt() function, + its usage in early_init_devtree(), and the corresponding various + early_init_dt_scan_*() callbacks. That code can be re-used in a + GPL bootloader, and as the author of that code, I would be happy + to discuss possible free licensing to any vendor who wishes to + integrate all or part of this code into a non-GPL bootloader. + + + +VI - System-on-a-chip devices and nodes +======================================= + +Many companies are now starting to develop system-on-a-chip +processors, where the processor core (CPU) and many peripheral devices +exist on a single piece of silicon. For these SOCs, an SOC node +should be used that defines child nodes for the devices that make +up the SOC. While platforms are not required to use this model in +order to boot the kernel, it is highly encouraged that all SOC +implementations define as complete a flat-device-tree as possible to +describe the devices on the SOC. This will allow for the +genericization of much of the kernel code. + + +1) Defining child nodes of an SOC +--------------------------------- + +Each device that is part of an SOC may have its own node entry inside +the SOC node. For each device that is included in the SOC, the unit +address property represents the address offset for this device's +memory-mapped registers in the parent's address space. The parent's +address space is defined by the "ranges" property in the top-level soc +node. The "reg" property for each node that exists directly under the +SOC node should contain the address mapping from the child address space +to the parent SOC address space and the size of the device's +memory-mapped register file. + +For many devices that may exist inside an SOC, there are predefined +specifications for the format of the device tree node. All SOC child +nodes should follow these specifications, except where noted in this +document. + +See appendix A for an example partial SOC node definition for the +MPC8540. + + +2) Representing devices without a current OF specification +---------------------------------------------------------- + +Currently, there are many devices on SOCs that do not have a standard +representation pre-defined as part of the open firmware +specifications, mainly because the boards that contain these SOCs are +not currently booted using open firmware. This section contains +descriptions for the SOC devices for which new nodes have been +defined; this list will expand as more and more SOC-containing +platforms are moved over to use the flattened-device-tree model. + +VII - Specifying interrupt information for devices +=================================================== + +The device tree represents the busses and devices of a hardware +system in a form similar to the physical bus topology of the +hardware. + +In addition, a logical 'interrupt tree' exists which represents the +hierarchy and routing of interrupts in the hardware. + +The interrupt tree model is fully described in the +document "Open Firmware Recommended Practice: Interrupt +Mapping Version 0.9". The document is available at: +. + +1) interrupts property +---------------------- + +Devices that generate interrupts to a single interrupt controller +should use the conventional OF representation described in the +OF interrupt mapping documentation. + +Each device which generates interrupts must have an 'interrupt' +property. The interrupt property value is an arbitrary number of +of 'interrupt specifier' values which describe the interrupt or +interrupts for the device. + +The encoding of an interrupt specifier is determined by the +interrupt domain in which the device is located in the +interrupt tree. The root of an interrupt domain specifies in +its #interrupt-cells property the number of 32-bit cells +required to encode an interrupt specifier. See the OF interrupt +mapping documentation for a detailed description of domains. + +For example, the binding for the OpenPIC interrupt controller +specifies an #interrupt-cells value of 2 to encode the interrupt +number and level/sense information. All interrupt children in an +OpenPIC interrupt domain use 2 cells per interrupt in their interrupts +property. + +The PCI bus binding specifies a #interrupt-cell value of 1 to encode +which interrupt pin (INTA,INTB,INTC,INTD) is used. + +2) interrupt-parent property +---------------------------- + +The interrupt-parent property is specified to define an explicit +link between a device node and its interrupt parent in +the interrupt tree. The value of interrupt-parent is the +phandle of the parent node. + +If the interrupt-parent property is not defined for a node, its +interrupt parent is assumed to be an ancestor in the node's +_device tree_ hierarchy. + +3) OpenPIC Interrupt Controllers +-------------------------------- + +OpenPIC interrupt controllers require 2 cells to encode +interrupt information. The first cell defines the interrupt +number. The second cell defines the sense and level +information. + +Sense and level information should be encoded as follows: + + 0 = low to high edge sensitive type enabled + 1 = active low level sensitive type enabled + 2 = active high level sensitive type enabled + 3 = high to low edge sensitive type enabled + +4) ISA Interrupt Controllers +---------------------------- + +ISA PIC interrupt controllers require 2 cells to encode +interrupt information. The first cell defines the interrupt +number. The second cell defines the sense and level +information. + +ISA PIC interrupt controllers should adhere to the ISA PIC +encodings listed below: + + 0 = active low level sensitive type enabled + 1 = active high level sensitive type enabled + 2 = high to low edge sensitive type enabled + 3 = low to high edge sensitive type enabled + +VIII - Specifying Device Power Management Information (sleep property) +=================================================================== + +Devices on SOCs often have mechanisms for placing devices into low-power +states that are decoupled from the devices' own register blocks. Sometimes, +this information is more complicated than a cell-index property can +reasonably describe. Thus, each device controlled in such a manner +may contain a "sleep" property which describes these connections. + +The sleep property consists of one or more sleep resources, each of +which consists of a phandle to a sleep controller, followed by a +controller-specific sleep specifier of zero or more cells. + +The semantics of what type of low power modes are possible are defined +by the sleep controller. Some examples of the types of low power modes +that may be supported are: + + - Dynamic: The device may be disabled or enabled at any time. + - System Suspend: The device may request to be disabled or remain + awake during system suspend, but will not be disabled until then. + - Permanent: The device is disabled permanently (until the next hard + reset). + +Some devices may share a clock domain with each other, such that they should +only be suspended when none of the devices are in use. Where reasonable, +such nodes should be placed on a virtual bus, where the bus has the sleep +property. If the clock domain is shared among devices that cannot be +reasonably grouped in this manner, then create a virtual sleep controller +(similar to an interrupt nexus, except that defining a standardized +sleep-map should wait until its necessity is demonstrated). + +Appendix A - Sample SOC node for MPC8540 +======================================== + + soc@e0000000 { + #address-cells = <1>; + #size-cells = <1>; + compatible = "fsl,mpc8540-ccsr", "simple-bus"; + device_type = "soc"; + ranges = <0x00000000 0xe0000000 0x00100000> + bus-frequency = <0>; + interrupt-parent = <&pic>; + + ethernet@24000 { + #address-cells = <1>; + #size-cells = <1>; + device_type = "network"; + model = "TSEC"; + compatible = "gianfar", "simple-bus"; + reg = <0x24000 0x1000>; + local-mac-address = [ 00 E0 0C 00 73 00 ]; + interrupts = <29 2 30 2 34 2>; + phy-handle = <&phy0>; + sleep = <&pmc 00000080>; + ranges; + + mdio@24520 { + reg = <0x24520 0x20>; + compatible = "fsl,gianfar-mdio"; + + phy0: ethernet-phy@0 { + interrupts = <5 1>; + reg = <0>; + device_type = "ethernet-phy"; + }; + + phy1: ethernet-phy@1 { + interrupts = <5 1>; + reg = <1>; + device_type = "ethernet-phy"; + }; + + phy3: ethernet-phy@3 { + interrupts = <7 1>; + reg = <3>; + device_type = "ethernet-phy"; + }; + }; + }; + + ethernet@25000 { + device_type = "network"; + model = "TSEC"; + compatible = "gianfar"; + reg = <0x25000 0x1000>; + local-mac-address = [ 00 E0 0C 00 73 01 ]; + interrupts = <13 2 14 2 18 2>; + phy-handle = <&phy1>; + sleep = <&pmc 00000040>; + }; + + ethernet@26000 { + device_type = "network"; + model = "FEC"; + compatible = "gianfar"; + reg = <0x26000 0x1000>; + local-mac-address = [ 00 E0 0C 00 73 02 ]; + interrupts = <41 2>; + phy-handle = <&phy3>; + sleep = <&pmc 00000020>; + }; + + serial@4500 { + #address-cells = <1>; + #size-cells = <1>; + compatible = "fsl,mpc8540-duart", "simple-bus"; + sleep = <&pmc 00000002>; + ranges; + + serial@4500 { + device_type = "serial"; + compatible = "ns16550"; + reg = <0x4500 0x100>; + clock-frequency = <0>; + interrupts = <42 2>; + }; + + serial@4600 { + device_type = "serial"; + compatible = "ns16550"; + reg = <0x4600 0x100>; + clock-frequency = <0>; + interrupts = <42 2>; + }; + }; + + pic: pic@40000 { + interrupt-controller; + #address-cells = <0>; + #interrupt-cells = <2>; + reg = <0x40000 0x40000>; + compatible = "chrp,open-pic"; + device_type = "open-pic"; + }; + + i2c@3000 { + interrupts = <43 2>; + reg = <0x3000 0x100>; + compatible = "fsl-i2c"; + dfsrr; + sleep = <&pmc 00000004>; + }; + + pmc: power@e0070 { + compatible = "fsl,mpc8540-pmc", "fsl,mpc8548-pmc"; + reg = <0xe0070 0x20>; + }; + }; diff --git a/Documentation/powerpc/booting-without-of.txt b/Documentation/powerpc/booting-without-of.txt deleted file mode 100644 index 7400d7555dc3..000000000000 --- a/Documentation/powerpc/booting-without-of.txt +++ /dev/null @@ -1,1447 +0,0 @@ - Booting the Linux/ppc kernel without Open Firmware - -------------------------------------------------- - -(c) 2005 Benjamin Herrenschmidt , - IBM Corp. -(c) 2005 Becky Bruce , - Freescale Semiconductor, FSL SOC and 32-bit additions -(c) 2006 MontaVista Software, Inc. - Flash chip node definition - -Table of Contents -================= - - I - Introduction - 1) Entry point for arch/powerpc - 2) Board support - - II - The DT block format - 1) Header - 2) Device tree generalities - 3) Device tree "structure" block - 4) Device tree "strings" block - - III - Required content of the device tree - 1) Note about cells and address representation - 2) Note about "compatible" properties - 3) Note about "name" properties - 4) Note about node and property names and character set - 5) Required nodes and properties - a) The root node - b) The /cpus node - c) The /cpus/* nodes - d) the /memory node(s) - e) The /chosen node - f) the /soc node - - IV - "dtc", the device tree compiler - - V - Recommendations for a bootloader - - VI - System-on-a-chip devices and nodes - 1) Defining child nodes of an SOC - 2) Representing devices without a current OF specification - a) PHY nodes - b) Interrupt controllers - c) 4xx/Axon EMAC ethernet nodes - d) Xilinx IP cores - e) USB EHCI controllers - f) MDIO on GPIOs - g) SPI busses - - VII - Specifying interrupt information for devices - 1) interrupts property - 2) interrupt-parent property - 3) OpenPIC Interrupt Controllers - 4) ISA Interrupt Controllers - - VIII - Specifying device power management information (sleep property) - - Appendix A - Sample SOC node for MPC8540 - - -Revision Information -==================== - - May 18, 2005: Rev 0.1 - Initial draft, no chapter III yet. - - May 19, 2005: Rev 0.2 - Add chapter III and bits & pieces here or - clarifies the fact that a lot of things are - optional, the kernel only requires a very - small device tree, though it is encouraged - to provide an as complete one as possible. - - May 24, 2005: Rev 0.3 - Precise that DT block has to be in RAM - - Misc fixes - - Define version 3 and new format version 16 - for the DT block (version 16 needs kernel - patches, will be fwd separately). - String block now has a size, and full path - is replaced by unit name for more - compactness. - linux,phandle is made optional, only nodes - that are referenced by other nodes need it. - "name" property is now automatically - deduced from the unit name - - June 1, 2005: Rev 0.4 - Correct confusion between OF_DT_END and - OF_DT_END_NODE in structure definition. - - Change version 16 format to always align - property data to 4 bytes. Since tokens are - already aligned, that means no specific - required alignment between property size - and property data. The old style variable - alignment would make it impossible to do - "simple" insertion of properties using - memmove (thanks Milton for - noticing). Updated kernel patch as well - - Correct a few more alignment constraints - - Add a chapter about the device-tree - compiler and the textural representation of - the tree that can be "compiled" by dtc. - - November 21, 2005: Rev 0.5 - - Additions/generalizations for 32-bit - - Changed to reflect the new arch/powerpc - structure - - Added chapter VI - - - ToDo: - - Add some definitions of interrupt tree (simple/complex) - - Add some definitions for PCI host bridges - - Add some common address format examples - - Add definitions for standard properties and "compatible" - names for cells that are not already defined by the existing - OF spec. - - Compare FSL SOC use of PCI to standard and make sure no new - node definition required. - - Add more information about node definitions for SOC devices - that currently have no standard, like the FSL CPM. - - -I - Introduction -================ - -During the recent development of the Linux/ppc64 kernel, and more -specifically, the addition of new platform types outside of the old -IBM pSeries/iSeries pair, it was decided to enforce some strict rules -regarding the kernel entry and bootloader <-> kernel interfaces, in -order to avoid the degeneration that had become the ppc32 kernel entry -point and the way a new platform should be added to the kernel. The -legacy iSeries platform breaks those rules as it predates this scheme, -but no new board support will be accepted in the main tree that -doesn't follow them properly. In addition, since the advent of the -arch/powerpc merged architecture for ppc32 and ppc64, new 32-bit -platforms and 32-bit platforms which move into arch/powerpc will be -required to use these rules as well. - -The main requirement that will be defined in more detail below is -the presence of a device-tree whose format is defined after Open -Firmware specification. However, in order to make life easier -to embedded board vendors, the kernel doesn't require the device-tree -to represent every device in the system and only requires some nodes -and properties to be present. This will be described in detail in -section III, but, for example, the kernel does not require you to -create a node for every PCI device in the system. It is a requirement -to have a node for PCI host bridges in order to provide interrupt -routing informations and memory/IO ranges, among others. It is also -recommended to define nodes for on chip devices and other busses that -don't specifically fit in an existing OF specification. This creates a -great flexibility in the way the kernel can then probe those and match -drivers to device, without having to hard code all sorts of tables. It -also makes it more flexible for board vendors to do minor hardware -upgrades without significantly impacting the kernel code or cluttering -it with special cases. - - -1) Entry point for arch/powerpc -------------------------------- - - There is one and one single entry point to the kernel, at the start - of the kernel image. That entry point supports two calling - conventions: - - a) Boot from Open Firmware. If your firmware is compatible - with Open Firmware (IEEE 1275) or provides an OF compatible - client interface API (support for "interpret" callback of - forth words isn't required), you can enter the kernel with: - - r5 : OF callback pointer as defined by IEEE 1275 - bindings to powerpc. Only the 32-bit client interface - is currently supported - - r3, r4 : address & length of an initrd if any or 0 - - The MMU is either on or off; the kernel will run the - trampoline located in arch/powerpc/kernel/prom_init.c to - extract the device-tree and other information from open - firmware and build a flattened device-tree as described - in b). prom_init() will then re-enter the kernel using - the second method. This trampoline code runs in the - context of the firmware, which is supposed to handle all - exceptions during that time. - - b) Direct entry with a flattened device-tree block. This entry - point is called by a) after the OF trampoline and can also be - called directly by a bootloader that does not support the Open - Firmware client interface. It is also used by "kexec" to - implement "hot" booting of a new kernel from a previous - running one. This method is what I will describe in more - details in this document, as method a) is simply standard Open - Firmware, and thus should be implemented according to the - various standard documents defining it and its binding to the - PowerPC platform. The entry point definition then becomes: - - r3 : physical pointer to the device-tree block - (defined in chapter II) in RAM - - r4 : physical pointer to the kernel itself. This is - used by the assembly code to properly disable the MMU - in case you are entering the kernel with MMU enabled - and a non-1:1 mapping. - - r5 : NULL (as to differentiate with method a) - - Note about SMP entry: Either your firmware puts your other - CPUs in some sleep loop or spin loop in ROM where you can get - them out via a soft reset or some other means, in which case - you don't need to care, or you'll have to enter the kernel - with all CPUs. The way to do that with method b) will be - described in a later revision of this document. - - -2) Board support ----------------- - -64-bit kernels: - - Board supports (platforms) are not exclusive config options. An - arbitrary set of board supports can be built in a single kernel - image. The kernel will "know" what set of functions to use for a - given platform based on the content of the device-tree. Thus, you - should: - - a) add your platform support as a _boolean_ option in - arch/powerpc/Kconfig, following the example of PPC_PSERIES, - PPC_PMAC and PPC_MAPLE. The later is probably a good - example of a board support to start from. - - b) create your main platform file as - "arch/powerpc/platforms/myplatform/myboard_setup.c" and add it - to the Makefile under the condition of your CONFIG_ - option. This file will define a structure of type "ppc_md" - containing the various callbacks that the generic code will - use to get to your platform specific code - - c) Add a reference to your "ppc_md" structure in the - "machines" table in arch/powerpc/kernel/setup_64.c if you are - a 64-bit platform. - - d) request and get assigned a platform number (see PLATFORM_* - constants in arch/powerpc/include/asm/processor.h - -32-bit embedded kernels: - - Currently, board support is essentially an exclusive config option. - The kernel is configured for a single platform. Part of the reason - for this is to keep kernels on embedded systems small and efficient; - part of this is due to the fact the code is already that way. In the - future, a kernel may support multiple platforms, but only if the - platforms feature the same core architecture. A single kernel build - cannot support both configurations with Book E and configurations - with classic Powerpc architectures. - - 32-bit embedded platforms that are moved into arch/powerpc using a - flattened device tree should adopt the merged tree practice of - setting ppc_md up dynamically, even though the kernel is currently - built with support for only a single platform at a time. This allows - unification of the setup code, and will make it easier to go to a - multiple-platform-support model in the future. - -NOTE: I believe the above will be true once Ben's done with the merge -of the boot sequences.... someone speak up if this is wrong! - - To add a 32-bit embedded platform support, follow the instructions - for 64-bit platforms above, with the exception that the Kconfig - option should be set up such that the kernel builds exclusively for - the platform selected. The processor type for the platform should - enable another config option to select the specific board - supported. - -NOTE: If Ben doesn't merge the setup files, may need to change this to -point to setup_32.c - - - I will describe later the boot process and various callbacks that - your platform should implement. - - -II - The DT block format -======================== - - -This chapter defines the actual format of the flattened device-tree -passed to the kernel. The actual content of it and kernel requirements -are described later. You can find example of code manipulating that -format in various places, including arch/powerpc/kernel/prom_init.c -which will generate a flattened device-tree from the Open Firmware -representation, or the fs2dt utility which is part of the kexec tools -which will generate one from a filesystem representation. It is -expected that a bootloader like uboot provides a bit more support, -that will be discussed later as well. - -Note: The block has to be in main memory. It has to be accessible in -both real mode and virtual mode with no mapping other than main -memory. If you are writing a simple flash bootloader, it should copy -the block to RAM before passing it to the kernel. - - -1) Header ---------- - - The kernel is entered with r3 pointing to an area of memory that is - roughly described in arch/powerpc/include/asm/prom.h by the structure - boot_param_header: - -struct boot_param_header { - u32 magic; /* magic word OF_DT_HEADER */ - u32 totalsize; /* total size of DT block */ - u32 off_dt_struct; /* offset to structure */ - u32 off_dt_strings; /* offset to strings */ - u32 off_mem_rsvmap; /* offset to memory reserve map - */ - u32 version; /* format version */ - u32 last_comp_version; /* last compatible version */ - - /* version 2 fields below */ - u32 boot_cpuid_phys; /* Which physical CPU id we're - booting on */ - /* version 3 fields below */ - u32 size_dt_strings; /* size of the strings block */ - - /* version 17 fields below */ - u32 size_dt_struct; /* size of the DT structure block */ -}; - - Along with the constants: - -/* Definitions used by the flattened device tree */ -#define OF_DT_HEADER 0xd00dfeed /* 4: version, - 4: total size */ -#define OF_DT_BEGIN_NODE 0x1 /* Start node: full name - */ -#define OF_DT_END_NODE 0x2 /* End node */ -#define OF_DT_PROP 0x3 /* Property: name off, - size, content */ -#define OF_DT_END 0x9 - - All values in this header are in big endian format, the various - fields in this header are defined more precisely below. All - "offset" values are in bytes from the start of the header; that is - from the value of r3. - - - magic - - This is a magic value that "marks" the beginning of the - device-tree block header. It contains the value 0xd00dfeed and is - defined by the constant OF_DT_HEADER - - - totalsize - - This is the total size of the DT block including the header. The - "DT" block should enclose all data structures defined in this - chapter (who are pointed to by offsets in this header). That is, - the device-tree structure, strings, and the memory reserve map. - - - off_dt_struct - - This is an offset from the beginning of the header to the start - of the "structure" part the device tree. (see 2) device tree) - - - off_dt_strings - - This is an offset from the beginning of the header to the start - of the "strings" part of the device-tree - - - off_mem_rsvmap - - This is an offset from the beginning of the header to the start - of the reserved memory map. This map is a list of pairs of 64- - bit integers. Each pair is a physical address and a size. The - list is terminated by an entry of size 0. This map provides the - kernel with a list of physical memory areas that are "reserved" - and thus not to be used for memory allocations, especially during - early initialization. The kernel needs to allocate memory during - boot for things like un-flattening the device-tree, allocating an - MMU hash table, etc... Those allocations must be done in such a - way to avoid overriding critical things like, on Open Firmware - capable machines, the RTAS instance, or on some pSeries, the TCE - tables used for the iommu. Typically, the reserve map should - contain _at least_ this DT block itself (header,total_size). If - you are passing an initrd to the kernel, you should reserve it as - well. You do not need to reserve the kernel image itself. The map - should be 64-bit aligned. - - - version - - This is the version of this structure. Version 1 stops - here. Version 2 adds an additional field boot_cpuid_phys. - Version 3 adds the size of the strings block, allowing the kernel - to reallocate it easily at boot and free up the unused flattened - structure after expansion. Version 16 introduces a new more - "compact" format for the tree itself that is however not backward - compatible. Version 17 adds an additional field, size_dt_struct, - allowing it to be reallocated or moved more easily (this is - particularly useful for bootloaders which need to make - adjustments to a device tree based on probed information). You - should always generate a structure of the highest version defined - at the time of your implementation. Currently that is version 17, - unless you explicitly aim at being backward compatible. - - - last_comp_version - - Last compatible version. This indicates down to what version of - the DT block you are backward compatible. For example, version 2 - is backward compatible with version 1 (that is, a kernel build - for version 1 will be able to boot with a version 2 format). You - should put a 1 in this field if you generate a device tree of - version 1 to 3, or 16 if you generate a tree of version 16 or 17 - using the new unit name format. - - - boot_cpuid_phys - - This field only exist on version 2 headers. It indicate which - physical CPU ID is calling the kernel entry point. This is used, - among others, by kexec. If you are on an SMP system, this value - should match the content of the "reg" property of the CPU node in - the device-tree corresponding to the CPU calling the kernel entry - point (see further chapters for more informations on the required - device-tree contents) - - - size_dt_strings - - This field only exists on version 3 and later headers. It - gives the size of the "strings" section of the device tree (which - starts at the offset given by off_dt_strings). - - - size_dt_struct - - This field only exists on version 17 and later headers. It gives - the size of the "structure" section of the device tree (which - starts at the offset given by off_dt_struct). - - So the typical layout of a DT block (though the various parts don't - need to be in that order) looks like this (addresses go from top to - bottom): - - - ------------------------------ - r3 -> | struct boot_param_header | - ------------------------------ - | (alignment gap) (*) | - ------------------------------ - | memory reserve map | - ------------------------------ - | (alignment gap) | - ------------------------------ - | | - | device-tree structure | - | | - ------------------------------ - | (alignment gap) | - ------------------------------ - | | - | device-tree strings | - | | - -----> ------------------------------ - | - | - --- (r3 + totalsize) - - (*) The alignment gaps are not necessarily present; their presence - and size are dependent on the various alignment requirements of - the individual data blocks. - - -2) Device tree generalities ---------------------------- - -This device-tree itself is separated in two different blocks, a -structure block and a strings block. Both need to be aligned to a 4 -byte boundary. - -First, let's quickly describe the device-tree concept before detailing -the storage format. This chapter does _not_ describe the detail of the -required types of nodes & properties for the kernel, this is done -later in chapter III. - -The device-tree layout is strongly inherited from the definition of -the Open Firmware IEEE 1275 device-tree. It's basically a tree of -nodes, each node having two or more named properties. A property can -have a value or not. - -It is a tree, so each node has one and only one parent except for the -root node who has no parent. - -A node has 2 names. The actual node name is generally contained in a -property of type "name" in the node property list whose value is a -zero terminated string and is mandatory for version 1 to 3 of the -format definition (as it is in Open Firmware). Version 16 makes it -optional as it can generate it from the unit name defined below. - -There is also a "unit name" that is used to differentiate nodes with -the same name at the same level, it is usually made of the node -names, the "@" sign, and a "unit address", which definition is -specific to the bus type the node sits on. - -The unit name doesn't exist as a property per-se but is included in -the device-tree structure. It is typically used to represent "path" in -the device-tree. More details about the actual format of these will be -below. - -The kernel powerpc generic code does not make any formal use of the -unit address (though some board support code may do) so the only real -requirement here for the unit address is to ensure uniqueness of -the node unit name at a given level of the tree. Nodes with no notion -of address and no possible sibling of the same name (like /memory or -/cpus) may omit the unit address in the context of this specification, -or use the "@0" default unit address. The unit name is used to define -a node "full path", which is the concatenation of all parent node -unit names separated with "/". - -The root node doesn't have a defined name, and isn't required to have -a name property either if you are using version 3 or earlier of the -format. It also has no unit address (no @ symbol followed by a unit -address). The root node unit name is thus an empty string. The full -path to the root node is "/". - -Every node which actually represents an actual device (that is, a node -which isn't only a virtual "container" for more nodes, like "/cpus" -is) is also required to have a "device_type" property indicating the -type of node . - -Finally, every node that can be referenced from a property in another -node is required to have a "linux,phandle" property. Real open -firmware implementations provide a unique "phandle" value for every -node that the "prom_init()" trampoline code turns into -"linux,phandle" properties. However, this is made optional if the -flattened device tree is used directly. An example of a node -referencing another node via "phandle" is when laying out the -interrupt tree which will be described in a further version of this -document. - -This "linux, phandle" property is a 32-bit value that uniquely -identifies a node. You are free to use whatever values or system of -values, internal pointers, or whatever to generate these, the only -requirement is that every node for which you provide that property has -a unique value for it. - -Here is an example of a simple device-tree. In this example, an "o" -designates a node followed by the node unit name. Properties are -presented with their name followed by their content. "content" -represents an ASCII string (zero terminated) value, while -represents a 32-bit hexadecimal value. The various nodes in this -example will be discussed in a later chapter. At this point, it is -only meant to give you a idea of what a device-tree looks like. I have -purposefully kept the "name" and "linux,phandle" properties which -aren't necessary in order to give you a better idea of what the tree -looks like in practice. - - / o device-tree - |- name = "device-tree" - |- model = "MyBoardName" - |- compatible = "MyBoardFamilyName" - |- #address-cells = <2> - |- #size-cells = <2> - |- linux,phandle = <0> - | - o cpus - | | - name = "cpus" - | | - linux,phandle = <1> - | | - #address-cells = <1> - | | - #size-cells = <0> - | | - | o PowerPC,970@0 - | |- name = "PowerPC,970" - | |- device_type = "cpu" - | |- reg = <0> - | |- clock-frequency = <5f5e1000> - | |- 64-bit - | |- linux,phandle = <2> - | - o memory@0 - | |- name = "memory" - | |- device_type = "memory" - | |- reg = <00000000 00000000 00000000 20000000> - | |- linux,phandle = <3> - | - o chosen - |- name = "chosen" - |- bootargs = "root=/dev/sda2" - |- linux,phandle = <4> - -This tree is almost a minimal tree. It pretty much contains the -minimal set of required nodes and properties to boot a linux kernel; -that is, some basic model informations at the root, the CPUs, and the -physical memory layout. It also includes misc information passed -through /chosen, like in this example, the platform type (mandatory) -and the kernel command line arguments (optional). - -The /cpus/PowerPC,970@0/64-bit property is an example of a -property without a value. All other properties have a value. The -significance of the #address-cells and #size-cells properties will be -explained in chapter IV which defines precisely the required nodes and -properties and their content. - - -3) Device tree "structure" block - -The structure of the device tree is a linearized tree structure. The -"OF_DT_BEGIN_NODE" token starts a new node, and the "OF_DT_END_NODE" -ends that node definition. Child nodes are simply defined before -"OF_DT_END_NODE" (that is nodes within the node). A 'token' is a 32 -bit value. The tree has to be "finished" with a OF_DT_END token - -Here's the basic structure of a single node: - - * token OF_DT_BEGIN_NODE (that is 0x00000001) - * for version 1 to 3, this is the node full path as a zero - terminated string, starting with "/". For version 16 and later, - this is the node unit name only (or an empty string for the - root node) - * [align gap to next 4 bytes boundary] - * for each property: - * token OF_DT_PROP (that is 0x00000003) - * 32-bit value of property value size in bytes (or 0 if no - value) - * 32-bit value of offset in string block of property name - * property value data if any - * [align gap to next 4 bytes boundary] - * [child nodes if any] - * token OF_DT_END_NODE (that is 0x00000002) - -So the node content can be summarized as a start token, a full path, -a list of properties, a list of child nodes, and an end token. Every -child node is a full node structure itself as defined above. - -NOTE: The above definition requires that all property definitions for -a particular node MUST precede any subnode definitions for that node. -Although the structure would not be ambiguous if properties and -subnodes were intermingled, the kernel parser requires that the -properties come first (up until at least 2.6.22). Any tools -manipulating a flattened tree must take care to preserve this -constraint. - -4) Device tree "strings" block - -In order to save space, property names, which are generally redundant, -are stored separately in the "strings" block. This block is simply the -whole bunch of zero terminated strings for all property names -concatenated together. The device-tree property definitions in the -structure block will contain offset values from the beginning of the -strings block. - - -III - Required content of the device tree -========================================= - -WARNING: All "linux,*" properties defined in this document apply only -to a flattened device-tree. If your platform uses a real -implementation of Open Firmware or an implementation compatible with -the Open Firmware client interface, those properties will be created -by the trampoline code in the kernel's prom_init() file. For example, -that's where you'll have to add code to detect your board model and -set the platform number. However, when using the flattened device-tree -entry point, there is no prom_init() pass, and thus you have to -provide those properties yourself. - - -1) Note about cells and address representation ----------------------------------------------- - -The general rule is documented in the various Open Firmware -documentations. If you choose to describe a bus with the device-tree -and there exist an OF bus binding, then you should follow the -specification. However, the kernel does not require every single -device or bus to be described by the device tree. - -In general, the format of an address for a device is defined by the -parent bus type, based on the #address-cells and #size-cells -properties. Note that the parent's parent definitions of #address-cells -and #size-cells are not inherited so every node with children must specify -them. The kernel requires the root node to have those properties defining -addresses format for devices directly mapped on the processor bus. - -Those 2 properties define 'cells' for representing an address and a -size. A "cell" is a 32-bit number. For example, if both contain 2 -like the example tree given above, then an address and a size are both -composed of 2 cells, and each is a 64-bit number (cells are -concatenated and expected to be in big endian format). Another example -is the way Apple firmware defines them, with 2 cells for an address -and one cell for a size. Most 32-bit implementations should define -#address-cells and #size-cells to 1, which represents a 32-bit value. -Some 32-bit processors allow for physical addresses greater than 32 -bits; these processors should define #address-cells as 2. - -"reg" properties are always a tuple of the type "address size" where -the number of cells of address and size is specified by the bus -#address-cells and #size-cells. When a bus supports various address -spaces and other flags relative to a given address allocation (like -prefetchable, etc...) those flags are usually added to the top level -bits of the physical address. For example, a PCI physical address is -made of 3 cells, the bottom two containing the actual address itself -while the top cell contains address space indication, flags, and pci -bus & device numbers. - -For busses that support dynamic allocation, it's the accepted practice -to then not provide the address in "reg" (keep it 0) though while -providing a flag indicating the address is dynamically allocated, and -then, to provide a separate "assigned-addresses" property that -contains the fully allocated addresses. See the PCI OF bindings for -details. - -In general, a simple bus with no address space bits and no dynamic -allocation is preferred if it reflects your hardware, as the existing -kernel address parsing functions will work out of the box. If you -define a bus type with a more complex address format, including things -like address space bits, you'll have to add a bus translator to the -prom_parse.c file of the recent kernels for your bus type. - -The "reg" property only defines addresses and sizes (if #size-cells is -non-0) within a given bus. In order to translate addresses upward -(that is into parent bus addresses, and possibly into CPU physical -addresses), all busses must contain a "ranges" property. If the -"ranges" property is missing at a given level, it's assumed that -translation isn't possible, i.e., the registers are not visible on the -parent bus. The format of the "ranges" property for a bus is a list -of: - - bus address, parent bus address, size - -"bus address" is in the format of the bus this bus node is defining, -that is, for a PCI bridge, it would be a PCI address. Thus, (bus -address, size) defines a range of addresses for child devices. "parent -bus address" is in the format of the parent bus of this bus. For -example, for a PCI host controller, that would be a CPU address. For a -PCI<->ISA bridge, that would be a PCI address. It defines the base -address in the parent bus where the beginning of that range is mapped. - -For a new 64-bit powerpc board, I recommend either the 2/2 format or -Apple's 2/1 format which is slightly more compact since sizes usually -fit in a single 32-bit word. New 32-bit powerpc boards should use a -1/1 format, unless the processor supports physical addresses greater -than 32-bits, in which case a 2/1 format is recommended. - -Alternatively, the "ranges" property may be empty, indicating that the -registers are visible on the parent bus using an identity mapping -translation. In other words, the parent bus address space is the same -as the child bus address space. - -2) Note about "compatible" properties -------------------------------------- - -These properties are optional, but recommended in devices and the root -node. The format of a "compatible" property is a list of concatenated -zero terminated strings. They allow a device to express its -compatibility with a family of similar devices, in some cases, -allowing a single driver to match against several devices regardless -of their actual names. - -3) Note about "name" properties -------------------------------- - -While earlier users of Open Firmware like OldWorld macintoshes tended -to use the actual device name for the "name" property, it's nowadays -considered a good practice to use a name that is closer to the device -class (often equal to device_type). For example, nowadays, ethernet -controllers are named "ethernet", an additional "model" property -defining precisely the chip type/model, and "compatible" property -defining the family in case a single driver can driver more than one -of these chips. However, the kernel doesn't generally put any -restriction on the "name" property; it is simply considered good -practice to follow the standard and its evolutions as closely as -possible. - -Note also that the new format version 16 makes the "name" property -optional. If it's absent for a node, then the node's unit name is then -used to reconstruct the name. That is, the part of the unit name -before the "@" sign is used (or the entire unit name if no "@" sign -is present). - -4) Note about node and property names and character set -------------------------------------------------------- - -While open firmware provides more flexible usage of 8859-1, this -specification enforces more strict rules. Nodes and properties should -be comprised only of ASCII characters 'a' to 'z', '0' to -'9', ',', '.', '_', '+', '#', '?', and '-'. Node names additionally -allow uppercase characters 'A' to 'Z' (property names should be -lowercase. The fact that vendors like Apple don't respect this rule is -irrelevant here). Additionally, node and property names should always -begin with a character in the range 'a' to 'z' (or 'A' to 'Z' for node -names). - -The maximum number of characters for both nodes and property names -is 31. In the case of node names, this is only the leftmost part of -a unit name (the pure "name" property), it doesn't include the unit -address which can extend beyond that limit. - - -5) Required nodes and properties --------------------------------- - These are all that are currently required. However, it is strongly - recommended that you expose PCI host bridges as documented in the - PCI binding to open firmware, and your interrupt tree as documented - in OF interrupt tree specification. - - a) The root node - - The root node requires some properties to be present: - - - model : this is your board name/model - - #address-cells : address representation for "root" devices - - #size-cells: the size representation for "root" devices - - device_type : This property shouldn't be necessary. However, if - you decide to create a device_type for your root node, make sure it - is _not_ "chrp" unless your platform is a pSeries or PAPR compliant - one for 64-bit, or a CHRP-type machine for 32-bit as this will - matched by the kernel this way. - - Additionally, some recommended properties are: - - - compatible : the board "family" generally finds its way here, - for example, if you have 2 board models with a similar layout, - that typically get driven by the same platform code in the - kernel, you would use a different "model" property but put a - value in "compatible". The kernel doesn't directly use that - value but it is generally useful. - - The root node is also generally where you add additional properties - specific to your board like the serial number if any, that sort of - thing. It is recommended that if you add any "custom" property whose - name may clash with standard defined ones, you prefix them with your - vendor name and a comma. - - b) The /cpus node - - This node is the parent of all individual CPU nodes. It doesn't - have any specific requirements, though it's generally good practice - to have at least: - - #address-cells = <00000001> - #size-cells = <00000000> - - This defines that the "address" for a CPU is a single cell, and has - no meaningful size. This is not necessary but the kernel will assume - that format when reading the "reg" properties of a CPU node, see - below - - c) The /cpus/* nodes - - So under /cpus, you are supposed to create a node for every CPU on - the machine. There is no specific restriction on the name of the - CPU, though It's common practice to call it PowerPC,. For - example, Apple uses PowerPC,G5 while IBM uses PowerPC,970FX. - - Required properties: - - - device_type : has to be "cpu" - - reg : This is the physical CPU number, it's a single 32-bit cell - and is also used as-is as the unit number for constructing the - unit name in the full path. For example, with 2 CPUs, you would - have the full path: - /cpus/PowerPC,970FX@0 - /cpus/PowerPC,970FX@1 - (unit addresses do not require leading zeroes) - - d-cache-block-size : one cell, L1 data cache block size in bytes (*) - - i-cache-block-size : one cell, L1 instruction cache block size in - bytes - - d-cache-size : one cell, size of L1 data cache in bytes - - i-cache-size : one cell, size of L1 instruction cache in bytes - -(*) The cache "block" size is the size on which the cache management -instructions operate. Historically, this document used the cache -"line" size here which is incorrect. The kernel will prefer the cache -block size and will fallback to cache line size for backward -compatibility. - - Recommended properties: - - - timebase-frequency : a cell indicating the frequency of the - timebase in Hz. This is not directly used by the generic code, - but you are welcome to copy/paste the pSeries code for setting - the kernel timebase/decrementer calibration based on this - value. - - clock-frequency : a cell indicating the CPU core clock frequency - in Hz. A new property will be defined for 64-bit values, but if - your frequency is < 4Ghz, one cell is enough. Here as well as - for the above, the common code doesn't use that property, but - you are welcome to re-use the pSeries or Maple one. A future - kernel version might provide a common function for this. - - d-cache-line-size : one cell, L1 data cache line size in bytes - if different from the block size - - i-cache-line-size : one cell, L1 instruction cache line size in - bytes if different from the block size - - You are welcome to add any property you find relevant to your board, - like some information about the mechanism used to soft-reset the - CPUs. For example, Apple puts the GPIO number for CPU soft reset - lines in there as a "soft-reset" property since they start secondary - CPUs by soft-resetting them. - - - d) the /memory node(s) - - To define the physical memory layout of your board, you should - create one or more memory node(s). You can either create a single - node with all memory ranges in its reg property, or you can create - several nodes, as you wish. The unit address (@ part) used for the - full path is the address of the first range of memory defined by a - given node. If you use a single memory node, this will typically be - @0. - - Required properties: - - - device_type : has to be "memory" - - reg : This property contains all the physical memory ranges of - your board. It's a list of addresses/sizes concatenated - together, with the number of cells of each defined by the - #address-cells and #size-cells of the root node. For example, - with both of these properties being 2 like in the example given - earlier, a 970 based machine with 6Gb of RAM could typically - have a "reg" property here that looks like: - - 00000000 00000000 00000000 80000000 - 00000001 00000000 00000001 00000000 - - That is a range starting at 0 of 0x80000000 bytes and a range - starting at 0x100000000 and of 0x100000000 bytes. You can see - that there is no memory covering the IO hole between 2Gb and - 4Gb. Some vendors prefer splitting those ranges into smaller - segments, but the kernel doesn't care. - - e) The /chosen node - - This node is a bit "special". Normally, that's where open firmware - puts some variable environment information, like the arguments, or - the default input/output devices. - - This specification makes a few of these mandatory, but also defines - some linux-specific properties that would be normally constructed by - the prom_init() trampoline when booting with an OF client interface, - but that you have to provide yourself when using the flattened format. - - Recommended properties: - - - bootargs : This zero-terminated string is passed as the kernel - command line - - linux,stdout-path : This is the full path to your standard - console device if any. Typically, if you have serial devices on - your board, you may want to put the full path to the one set as - the default console in the firmware here, for the kernel to pick - it up as its own default console. If you look at the function - set_preferred_console() in arch/ppc64/kernel/setup.c, you'll see - that the kernel tries to find out the default console and has - knowledge of various types like 8250 serial ports. You may want - to extend this function to add your own. - - Note that u-boot creates and fills in the chosen node for platforms - that use it. - - (Note: a practice that is now obsolete was to include a property - under /chosen called interrupt-controller which had a phandle value - that pointed to the main interrupt controller) - - f) the /soc node - - This node is used to represent a system-on-a-chip (SOC) and must be - present if the processor is a SOC. The top-level soc node contains - information that is global to all devices on the SOC. The node name - should contain a unit address for the SOC, which is the base address - of the memory-mapped register set for the SOC. The name of an soc - node should start with "soc", and the remainder of the name should - represent the part number for the soc. For example, the MPC8540's - soc node would be called "soc8540". - - Required properties: - - - device_type : Should be "soc" - - ranges : Should be defined as specified in 1) to describe the - translation of SOC addresses for memory mapped SOC registers. - - bus-frequency: Contains the bus frequency for the SOC node. - Typically, the value of this field is filled in by the boot - loader. - - - Recommended properties: - - - reg : This property defines the address and size of the - memory-mapped registers that are used for the SOC node itself. - It does not include the child device registers - these will be - defined inside each child node. The address specified in the - "reg" property should match the unit address of the SOC node. - - #address-cells : Address representation for "soc" devices. The - format of this field may vary depending on whether or not the - device registers are memory mapped. For memory mapped - registers, this field represents the number of cells needed to - represent the address of the registers. For SOCs that do not - use MMIO, a special address format should be defined that - contains enough cells to represent the required information. - See 1) above for more details on defining #address-cells. - - #size-cells : Size representation for "soc" devices - - #interrupt-cells : Defines the width of cells used to represent - interrupts. Typically this value is <2>, which includes a - 32-bit number that represents the interrupt number, and a - 32-bit number that represents the interrupt sense and level. - This field is only needed if the SOC contains an interrupt - controller. - - The SOC node may contain child nodes for each SOC device that the - platform uses. Nodes should not be created for devices which exist - on the SOC but are not used by a particular platform. See chapter VI - for more information on how to specify devices that are part of a SOC. - - Example SOC node for the MPC8540: - - soc8540@e0000000 { - #address-cells = <1>; - #size-cells = <1>; - #interrupt-cells = <2>; - device_type = "soc"; - ranges = <00000000 e0000000 00100000> - reg = ; - bus-frequency = <0>; - } - - - -IV - "dtc", the device tree compiler -==================================== - - -dtc source code can be found at - - -WARNING: This version is still in early development stage; the -resulting device-tree "blobs" have not yet been validated with the -kernel. The current generated block lacks a useful reserve map (it will -be fixed to generate an empty one, it's up to the bootloader to fill -it up) among others. The error handling needs work, bugs are lurking, -etc... - -dtc basically takes a device-tree in a given format and outputs a -device-tree in another format. The currently supported formats are: - - Input formats: - ------------- - - - "dtb": "blob" format, that is a flattened device-tree block - with - header all in a binary blob. - - "dts": "source" format. This is a text file containing a - "source" for a device-tree. The format is defined later in this - chapter. - - "fs" format. This is a representation equivalent to the - output of /proc/device-tree, that is nodes are directories and - properties are files - - Output formats: - --------------- - - - "dtb": "blob" format - - "dts": "source" format - - "asm": assembly language file. This is a file that can be - sourced by gas to generate a device-tree "blob". That file can - then simply be added to your Makefile. Additionally, the - assembly file exports some symbols that can be used. - - -The syntax of the dtc tool is - - dtc [-I ] [-O ] - [-o output-filename] [-V output_version] input_filename - - -The "output_version" defines what version of the "blob" format will be -generated. Supported versions are 1,2,3 and 16. The default is -currently version 3 but that may change in the future to version 16. - -Additionally, dtc performs various sanity checks on the tree, like the -uniqueness of linux, phandle properties, validity of strings, etc... - -The format of the .dts "source" file is "C" like, supports C and C++ -style comments. - -/ { -} - -The above is the "device-tree" definition. It's the only statement -supported currently at the toplevel. - -/ { - property1 = "string_value"; /* define a property containing a 0 - * terminated string - */ - - property2 = <1234abcd>; /* define a property containing a - * numerical 32-bit value (hexadecimal) - */ - - property3 = <12345678 12345678 deadbeef>; - /* define a property containing 3 - * numerical 32-bit values (cells) in - * hexadecimal - */ - property4 = [0a 0b 0c 0d de ea ad be ef]; - /* define a property whose content is - * an arbitrary array of bytes - */ - - childnode@address { /* define a child node named "childnode" - * whose unit name is "childnode at - * address" - */ - - childprop = "hello\n"; /* define a property "childprop" of - * childnode (in this case, a string) - */ - }; -}; - -Nodes can contain other nodes etc... thus defining the hierarchical -structure of the tree. - -Strings support common escape sequences from C: "\n", "\t", "\r", -"\(octal value)", "\x(hex value)". - -It is also suggested that you pipe your source file through cpp (gcc -preprocessor) so you can use #include's, #define for constants, etc... - -Finally, various options are planned but not yet implemented, like -automatic generation of phandles, labels (exported to the asm file so -you can point to a property content and change it easily from whatever -you link the device-tree with), label or path instead of numeric value -in some cells to "point" to a node (replaced by a phandle at compile -time), export of reserve map address to the asm file, ability to -specify reserve map content at compile time, etc... - -We may provide a .h include file with common definitions of that -proves useful for some properties (like building PCI properties or -interrupt maps) though it may be better to add a notion of struct -definitions to the compiler... - - -V - Recommendations for a bootloader -==================================== - - -Here are some various ideas/recommendations that have been proposed -while all this has been defined and implemented. - - - The bootloader may want to be able to use the device-tree itself - and may want to manipulate it (to add/edit some properties, - like physical memory size or kernel arguments). At this point, 2 - choices can be made. Either the bootloader works directly on the - flattened format, or the bootloader has its own internal tree - representation with pointers (similar to the kernel one) and - re-flattens the tree when booting the kernel. The former is a bit - more difficult to edit/modify, the later requires probably a bit - more code to handle the tree structure. Note that the structure - format has been designed so it's relatively easy to "insert" - properties or nodes or delete them by just memmoving things - around. It contains no internal offsets or pointers for this - purpose. - - - An example of code for iterating nodes & retrieving properties - directly from the flattened tree format can be found in the kernel - file arch/ppc64/kernel/prom.c, look at scan_flat_dt() function, - its usage in early_init_devtree(), and the corresponding various - early_init_dt_scan_*() callbacks. That code can be re-used in a - GPL bootloader, and as the author of that code, I would be happy - to discuss possible free licensing to any vendor who wishes to - integrate all or part of this code into a non-GPL bootloader. - - - -VI - System-on-a-chip devices and nodes -======================================= - -Many companies are now starting to develop system-on-a-chip -processors, where the processor core (CPU) and many peripheral devices -exist on a single piece of silicon. For these SOCs, an SOC node -should be used that defines child nodes for the devices that make -up the SOC. While platforms are not required to use this model in -order to boot the kernel, it is highly encouraged that all SOC -implementations define as complete a flat-device-tree as possible to -describe the devices on the SOC. This will allow for the -genericization of much of the kernel code. - - -1) Defining child nodes of an SOC ---------------------------------- - -Each device that is part of an SOC may have its own node entry inside -the SOC node. For each device that is included in the SOC, the unit -address property represents the address offset for this device's -memory-mapped registers in the parent's address space. The parent's -address space is defined by the "ranges" property in the top-level soc -node. The "reg" property for each node that exists directly under the -SOC node should contain the address mapping from the child address space -to the parent SOC address space and the size of the device's -memory-mapped register file. - -For many devices that may exist inside an SOC, there are predefined -specifications for the format of the device tree node. All SOC child -nodes should follow these specifications, except where noted in this -document. - -See appendix A for an example partial SOC node definition for the -MPC8540. - - -2) Representing devices without a current OF specification ----------------------------------------------------------- - -Currently, there are many devices on SOCs that do not have a standard -representation pre-defined as part of the open firmware -specifications, mainly because the boards that contain these SOCs are -not currently booted using open firmware. This section contains -descriptions for the SOC devices for which new nodes have been -defined; this list will expand as more and more SOC-containing -platforms are moved over to use the flattened-device-tree model. - -VII - Specifying interrupt information for devices -=================================================== - -The device tree represents the busses and devices of a hardware -system in a form similar to the physical bus topology of the -hardware. - -In addition, a logical 'interrupt tree' exists which represents the -hierarchy and routing of interrupts in the hardware. - -The interrupt tree model is fully described in the -document "Open Firmware Recommended Practice: Interrupt -Mapping Version 0.9". The document is available at: -. - -1) interrupts property ----------------------- - -Devices that generate interrupts to a single interrupt controller -should use the conventional OF representation described in the -OF interrupt mapping documentation. - -Each device which generates interrupts must have an 'interrupt' -property. The interrupt property value is an arbitrary number of -of 'interrupt specifier' values which describe the interrupt or -interrupts for the device. - -The encoding of an interrupt specifier is determined by the -interrupt domain in which the device is located in the -interrupt tree. The root of an interrupt domain specifies in -its #interrupt-cells property the number of 32-bit cells -required to encode an interrupt specifier. See the OF interrupt -mapping documentation for a detailed description of domains. - -For example, the binding for the OpenPIC interrupt controller -specifies an #interrupt-cells value of 2 to encode the interrupt -number and level/sense information. All interrupt children in an -OpenPIC interrupt domain use 2 cells per interrupt in their interrupts -property. - -The PCI bus binding specifies a #interrupt-cell value of 1 to encode -which interrupt pin (INTA,INTB,INTC,INTD) is used. - -2) interrupt-parent property ----------------------------- - -The interrupt-parent property is specified to define an explicit -link between a device node and its interrupt parent in -the interrupt tree. The value of interrupt-parent is the -phandle of the parent node. - -If the interrupt-parent property is not defined for a node, its -interrupt parent is assumed to be an ancestor in the node's -_device tree_ hierarchy. - -3) OpenPIC Interrupt Controllers --------------------------------- - -OpenPIC interrupt controllers require 2 cells to encode -interrupt information. The first cell defines the interrupt -number. The second cell defines the sense and level -information. - -Sense and level information should be encoded as follows: - - 0 = low to high edge sensitive type enabled - 1 = active low level sensitive type enabled - 2 = active high level sensitive type enabled - 3 = high to low edge sensitive type enabled - -4) ISA Interrupt Controllers ----------------------------- - -ISA PIC interrupt controllers require 2 cells to encode -interrupt information. The first cell defines the interrupt -number. The second cell defines the sense and level -information. - -ISA PIC interrupt controllers should adhere to the ISA PIC -encodings listed below: - - 0 = active low level sensitive type enabled - 1 = active high level sensitive type enabled - 2 = high to low edge sensitive type enabled - 3 = low to high edge sensitive type enabled - -VIII - Specifying Device Power Management Information (sleep property) -=================================================================== - -Devices on SOCs often have mechanisms for placing devices into low-power -states that are decoupled from the devices' own register blocks. Sometimes, -this information is more complicated than a cell-index property can -reasonably describe. Thus, each device controlled in such a manner -may contain a "sleep" property which describes these connections. - -The sleep property consists of one or more sleep resources, each of -which consists of a phandle to a sleep controller, followed by a -controller-specific sleep specifier of zero or more cells. - -The semantics of what type of low power modes are possible are defined -by the sleep controller. Some examples of the types of low power modes -that may be supported are: - - - Dynamic: The device may be disabled or enabled at any time. - - System Suspend: The device may request to be disabled or remain - awake during system suspend, but will not be disabled until then. - - Permanent: The device is disabled permanently (until the next hard - reset). - -Some devices may share a clock domain with each other, such that they should -only be suspended when none of the devices are in use. Where reasonable, -such nodes should be placed on a virtual bus, where the bus has the sleep -property. If the clock domain is shared among devices that cannot be -reasonably grouped in this manner, then create a virtual sleep controller -(similar to an interrupt nexus, except that defining a standardized -sleep-map should wait until its necessity is demonstrated). - -Appendix A - Sample SOC node for MPC8540 -======================================== - - soc@e0000000 { - #address-cells = <1>; - #size-cells = <1>; - compatible = "fsl,mpc8540-ccsr", "simple-bus"; - device_type = "soc"; - ranges = <0x00000000 0xe0000000 0x00100000> - bus-frequency = <0>; - interrupt-parent = <&pic>; - - ethernet@24000 { - #address-cells = <1>; - #size-cells = <1>; - device_type = "network"; - model = "TSEC"; - compatible = "gianfar", "simple-bus"; - reg = <0x24000 0x1000>; - local-mac-address = [ 00 E0 0C 00 73 00 ]; - interrupts = <29 2 30 2 34 2>; - phy-handle = <&phy0>; - sleep = <&pmc 00000080>; - ranges; - - mdio@24520 { - reg = <0x24520 0x20>; - compatible = "fsl,gianfar-mdio"; - - phy0: ethernet-phy@0 { - interrupts = <5 1>; - reg = <0>; - device_type = "ethernet-phy"; - }; - - phy1: ethernet-phy@1 { - interrupts = <5 1>; - reg = <1>; - device_type = "ethernet-phy"; - }; - - phy3: ethernet-phy@3 { - interrupts = <7 1>; - reg = <3>; - device_type = "ethernet-phy"; - }; - }; - }; - - ethernet@25000 { - device_type = "network"; - model = "TSEC"; - compatible = "gianfar"; - reg = <0x25000 0x1000>; - local-mac-address = [ 00 E0 0C 00 73 01 ]; - interrupts = <13 2 14 2 18 2>; - phy-handle = <&phy1>; - sleep = <&pmc 00000040>; - }; - - ethernet@26000 { - device_type = "network"; - model = "FEC"; - compatible = "gianfar"; - reg = <0x26000 0x1000>; - local-mac-address = [ 00 E0 0C 00 73 02 ]; - interrupts = <41 2>; - phy-handle = <&phy3>; - sleep = <&pmc 00000020>; - }; - - serial@4500 { - #address-cells = <1>; - #size-cells = <1>; - compatible = "fsl,mpc8540-duart", "simple-bus"; - sleep = <&pmc 00000002>; - ranges; - - serial@4500 { - device_type = "serial"; - compatible = "ns16550"; - reg = <0x4500 0x100>; - clock-frequency = <0>; - interrupts = <42 2>; - }; - - serial@4600 { - device_type = "serial"; - compatible = "ns16550"; - reg = <0x4600 0x100>; - clock-frequency = <0>; - interrupts = <42 2>; - }; - }; - - pic: pic@40000 { - interrupt-controller; - #address-cells = <0>; - #interrupt-cells = <2>; - reg = <0x40000 0x40000>; - compatible = "chrp,open-pic"; - device_type = "open-pic"; - }; - - i2c@3000 { - interrupts = <43 2>; - reg = <0x3000 0x100>; - compatible = "fsl-i2c"; - dfsrr; - sleep = <&pmc 00000004>; - }; - - pmc: power@e0070 { - compatible = "fsl,mpc8540-pmc", "fsl,mpc8548-pmc"; - reg = <0xe0070 0x20>; - }; - }; diff --git a/Documentation/powerpc/dts-bindings/4xx/cpm.txt b/Documentation/powerpc/dts-bindings/4xx/cpm.txt deleted file mode 100644 index ee459806d35e..000000000000 --- a/Documentation/powerpc/dts-bindings/4xx/cpm.txt +++ /dev/null @@ -1,52 +0,0 @@ -PPC4xx Clock Power Management (CPM) node - -Required properties: - - compatible : compatible list, currently only "ibm,cpm" - - dcr-access-method : "native" - - dcr-reg : < DCR register range > - -Optional properties: - - er-offset : All 4xx SoCs with a CPM controller have - one of two different order for the CPM - registers. Some have the CPM registers - in the following order (ER,FR,SR). The - others have them in the following order - (SR,ER,FR). For the second case set - er-offset = <1>. - - unused-units : specifier consist of one cell. For each - bit in the cell, the corresponding bit - in CPM will be set to turn off unused - devices. - - idle-doze : specifier consist of one cell. For each - bit in the cell, the corresponding bit - in CPM will be set to turn off unused - devices. This is usually just CPM[CPU]. - - standby : specifier consist of one cell. For each - bit in the cell, the corresponding bit - in CPM will be set on standby and - restored on resume. - - suspend : specifier consist of one cell. For each - bit in the cell, the corresponding bit - in CPM will be set on suspend (mem) and - restored on resume. Note, for standby - and suspend the corresponding bits can - be different or the same. Usually for - standby only class 2 and 3 units are set. - However, the interface does not care. - If they are the same, the additional - power saving will be seeing if support - is available to put the DDR in self - refresh mode and any additional power - saving techniques for the specific SoC. - -Example: - CPM0: cpm { - compatible = "ibm,cpm"; - dcr-access-method = "native"; - dcr-reg = <0x160 0x003>; - er-offset = <0>; - unused-units = <0x00000100>; - idle-doze = <0x02000000>; - standby = <0xfeff0000>; - suspend = <0xfeff791d>; -}; diff --git a/Documentation/powerpc/dts-bindings/4xx/emac.txt b/Documentation/powerpc/dts-bindings/4xx/emac.txt deleted file mode 100644 index 2161334a7ca5..000000000000 --- a/Documentation/powerpc/dts-bindings/4xx/emac.txt +++ /dev/null @@ -1,148 +0,0 @@ - 4xx/Axon EMAC ethernet nodes - - The EMAC ethernet controller in IBM and AMCC 4xx chips, and also - the Axon bridge. To operate this needs to interact with a ths - special McMAL DMA controller, and sometimes an RGMII or ZMII - interface. In addition to the nodes and properties described - below, the node for the OPB bus on which the EMAC sits must have a - correct clock-frequency property. - - i) The EMAC node itself - - Required properties: - - device_type : "network" - - - compatible : compatible list, contains 2 entries, first is - "ibm,emac-CHIP" where CHIP is the host ASIC (440gx, - 405gp, Axon) and second is either "ibm,emac" or - "ibm,emac4". For Axon, thus, we have: "ibm,emac-axon", - "ibm,emac4" - - interrupts : - - interrupt-parent : optional, if needed for interrupt mapping - - reg : - - local-mac-address : 6 bytes, MAC address - - mal-device : phandle of the associated McMAL node - - mal-tx-channel : 1 cell, index of the tx channel on McMAL associated - with this EMAC - - mal-rx-channel : 1 cell, index of the rx channel on McMAL associated - with this EMAC - - cell-index : 1 cell, hardware index of the EMAC cell on a given - ASIC (typically 0x0 and 0x1 for EMAC0 and EMAC1 on - each Axon chip) - - max-frame-size : 1 cell, maximum frame size supported in bytes - - rx-fifo-size : 1 cell, Rx fifo size in bytes for 10 and 100 Mb/sec - operations. - For Axon, 2048 - - tx-fifo-size : 1 cell, Tx fifo size in bytes for 10 and 100 Mb/sec - operations. - For Axon, 2048. - - fifo-entry-size : 1 cell, size of a fifo entry (used to calculate - thresholds). - For Axon, 0x00000010 - - mal-burst-size : 1 cell, MAL burst size (used to calculate thresholds) - in bytes. - For Axon, 0x00000100 (I think ...) - - phy-mode : string, mode of operations of the PHY interface. - Supported values are: "mii", "rmii", "smii", "rgmii", - "tbi", "gmii", rtbi", "sgmii". - For Axon on CAB, it is "rgmii" - - mdio-device : 1 cell, required iff using shared MDIO registers - (440EP). phandle of the EMAC to use to drive the - MDIO lines for the PHY used by this EMAC. - - zmii-device : 1 cell, required iff connected to a ZMII. phandle of - the ZMII device node - - zmii-channel : 1 cell, required iff connected to a ZMII. Which ZMII - channel or 0xffffffff if ZMII is only used for MDIO. - - rgmii-device : 1 cell, required iff connected to an RGMII. phandle - of the RGMII device node. - For Axon: phandle of plb5/plb4/opb/rgmii - - rgmii-channel : 1 cell, required iff connected to an RGMII. Which - RGMII channel is used by this EMAC. - Fox Axon: present, whatever value is appropriate for each - EMAC, that is the content of the current (bogus) "phy-port" - property. - - Optional properties: - - phy-address : 1 cell, optional, MDIO address of the PHY. If absent, - a search is performed. - - phy-map : 1 cell, optional, bitmap of addresses to probe the PHY - for, used if phy-address is absent. bit 0x00000001 is - MDIO address 0. - For Axon it can be absent, though my current driver - doesn't handle phy-address yet so for now, keep - 0x00ffffff in it. - - rx-fifo-size-gige : 1 cell, Rx fifo size in bytes for 1000 Mb/sec - operations (if absent the value is the same as - rx-fifo-size). For Axon, either absent or 2048. - - tx-fifo-size-gige : 1 cell, Tx fifo size in bytes for 1000 Mb/sec - operations (if absent the value is the same as - tx-fifo-size). For Axon, either absent or 2048. - - tah-device : 1 cell, optional. If connected to a TAH engine for - offload, phandle of the TAH device node. - - tah-channel : 1 cell, optional. If appropriate, channel used on the - TAH engine. - - Example: - - EMAC0: ethernet@40000800 { - device_type = "network"; - compatible = "ibm,emac-440gp", "ibm,emac"; - interrupt-parent = <&UIC1>; - interrupts = <1c 4 1d 4>; - reg = <40000800 70>; - local-mac-address = [00 04 AC E3 1B 1E]; - mal-device = <&MAL0>; - mal-tx-channel = <0 1>; - mal-rx-channel = <0>; - cell-index = <0>; - max-frame-size = <5dc>; - rx-fifo-size = <1000>; - tx-fifo-size = <800>; - phy-mode = "rmii"; - phy-map = <00000001>; - zmii-device = <&ZMII0>; - zmii-channel = <0>; - }; - - ii) McMAL node - - Required properties: - - device_type : "dma-controller" - - compatible : compatible list, containing 2 entries, first is - "ibm,mcmal-CHIP" where CHIP is the host ASIC (like - emac) and the second is either "ibm,mcmal" or - "ibm,mcmal2". - For Axon, "ibm,mcmal-axon","ibm,mcmal2" - - interrupts : . - For Axon: This is _different_ from the current - firmware. We use the "delayed" interrupts for txeob - and rxeob. Thus we end up with mapping those 5 MPIC - interrupts, all level positive sensitive: 10, 11, 32, - 33, 34 (in decimal) - - dcr-reg : < DCR registers range > - - dcr-parent : if needed for dcr-reg - - num-tx-chans : 1 cell, number of Tx channels - - num-rx-chans : 1 cell, number of Rx channels - - iii) ZMII node - - Required properties: - - compatible : compatible list, containing 2 entries, first is - "ibm,zmii-CHIP" where CHIP is the host ASIC (like - EMAC) and the second is "ibm,zmii". - For Axon, there is no ZMII node. - - reg : - - iv) RGMII node - - Required properties: - - compatible : compatible list, containing 2 entries, first is - "ibm,rgmii-CHIP" where CHIP is the host ASIC (like - EMAC) and the second is "ibm,rgmii". - For Axon, "ibm,rgmii-axon","ibm,rgmii" - - reg : - - revision : as provided by the RGMII new version register if - available. - For Axon: 0x0000012a - diff --git a/Documentation/powerpc/dts-bindings/4xx/ndfc.txt b/Documentation/powerpc/dts-bindings/4xx/ndfc.txt deleted file mode 100644 index 869f0b5f16e8..000000000000 --- a/Documentation/powerpc/dts-bindings/4xx/ndfc.txt +++ /dev/null @@ -1,39 +0,0 @@ -AMCC NDFC (NanD Flash Controller) - -Required properties: -- compatible : "ibm,ndfc". -- reg : should specify chip select and size used for the chip (0x2000). - -Optional properties: -- ccr : NDFC config and control register value (default 0). -- bank-settings : NDFC bank configuration register value (default 0). - -Notes: -- partition(s) - follows the OF MTD standard for partitions - -Example: - -ndfc@1,0 { - compatible = "ibm,ndfc"; - reg = <0x00000001 0x00000000 0x00002000>; - ccr = <0x00001000>; - bank-settings = <0x80002222>; - #address-cells = <1>; - #size-cells = <1>; - - nand { - #address-cells = <1>; - #size-cells = <1>; - - partition@0 { - label = "kernel"; - reg = <0x00000000 0x00200000>; - }; - partition@200000 { - label = "root"; - reg = <0x00200000 0x03E00000>; - }; - }; -}; - - diff --git a/Documentation/powerpc/dts-bindings/4xx/ppc440spe-adma.txt b/Documentation/powerpc/dts-bindings/4xx/ppc440spe-adma.txt deleted file mode 100644 index 515ebcf1b97d..000000000000 --- a/Documentation/powerpc/dts-bindings/4xx/ppc440spe-adma.txt +++ /dev/null @@ -1,93 +0,0 @@ -PPC440SPe DMA/XOR (DMA Controller and XOR Accelerator) - -Device nodes needed for operation of the ppc440spe-adma driver -are specified hereby. These are I2O/DMA, DMA and XOR nodes -for DMA engines and Memory Queue Module node. The latter is used -by ADMA driver for configuration of RAID-6 H/W capabilities of -the PPC440SPe. In addition to the nodes and properties described -below, the ranges property of PLB node must specify ranges for -DMA devices. - - i) The I2O node - - Required properties: - - - compatible : "ibm,i2o-440spe"; - - reg : - - dcr-reg : - - Example: - - I2O: i2o@400100000 { - compatible = "ibm,i2o-440spe"; - reg = <0x00000004 0x00100000 0x100>; - dcr-reg = <0x060 0x020>; - }; - - - ii) The DMA node - - Required properties: - - - compatible : "ibm,dma-440spe"; - - cell-index : 1 cell, hardware index of the DMA engine - (typically 0x0 and 0x1 for DMA0 and DMA1) - - reg : - - dcr-reg : - - interrupts : . - - interrupt-parent : needed for interrupt mapping - - Example: - - DMA0: dma0@400100100 { - compatible = "ibm,dma-440spe"; - cell-index = <0>; - reg = <0x00000004 0x00100100 0x100>; - dcr-reg = <0x060 0x020>; - interrupt-parent = <&DMA0>; - interrupts = <0 1>; - #interrupt-cells = <1>; - #address-cells = <0>; - #size-cells = <0>; - interrupt-map = < - 0 &UIC0 0x14 4 - 1 &UIC1 0x16 4>; - }; - - - iii) XOR Accelerator node - - Required properties: - - - compatible : "amcc,xor-accelerator"; - - reg : - - interrupts : - - interrupt-parent : for interrupt mapping - - Example: - - xor-accel@400200000 { - compatible = "amcc,xor-accelerator"; - reg = <0x00000004 0x00200000 0x400>; - interrupt-parent = <&UIC1>; - interrupts = <0x1f 4>; - }; - - - iv) Memory Queue Module node - - Required properties: - - - compatible : "ibm,mq-440spe"; - - dcr-reg : - - Example: - - MQ0: mq { - compatible = "ibm,mq-440spe"; - dcr-reg = <0x040 0x020>; - }; - diff --git a/Documentation/powerpc/dts-bindings/4xx/reboot.txt b/Documentation/powerpc/dts-bindings/4xx/reboot.txt deleted file mode 100644 index d7217260589c..000000000000 --- a/Documentation/powerpc/dts-bindings/4xx/reboot.txt +++ /dev/null @@ -1,18 +0,0 @@ -Reboot property to control system reboot on PPC4xx systems: - -By setting "reset_type" to one of the following values, the default -software reset mechanism may be overidden. Here the possible values of -"reset_type": - - 1 - PPC4xx core reset - 2 - PPC4xx chip reset - 3 - PPC4xx system reset (default) - -Example: - - cpu@0 { - device_type = "cpu"; - model = "PowerPC,440SPe"; - ... - reset-type = <2>; /* Use chip-reset */ - }; diff --git a/Documentation/powerpc/dts-bindings/can/sja1000.txt b/Documentation/powerpc/dts-bindings/can/sja1000.txt deleted file mode 100644 index d6d209ded937..000000000000 --- a/Documentation/powerpc/dts-bindings/can/sja1000.txt +++ /dev/null @@ -1,53 +0,0 @@ -Memory mapped SJA1000 CAN controller from NXP (formerly Philips) - -Required properties: - -- compatible : should be "nxp,sja1000". - -- reg : should specify the chip select, address offset and size required - to map the registers of the SJA1000. The size is usually 0x80. - -- interrupts: property with a value describing the interrupt source - (number and sensitivity) required for the SJA1000. - -Optional properties: - -- nxp,external-clock-frequency : Frequency of the external oscillator - clock in Hz. Note that the internal clock frequency used by the - SJA1000 is half of that value. If not specified, a default value - of 16000000 (16 MHz) is used. - -- nxp,tx-output-mode : operation mode of the TX output control logic: - <0x0> : bi-phase output mode - <0x1> : normal output mode (default) - <0x2> : test output mode - <0x3> : clock output mode - -- nxp,tx-output-config : TX output pin configuration: - <0x01> : TX0 invert - <0x02> : TX0 pull-down (default) - <0x04> : TX0 pull-up - <0x06> : TX0 push-pull - <0x08> : TX1 invert - <0x10> : TX1 pull-down - <0x20> : TX1 pull-up - <0x30> : TX1 push-pull - -- nxp,clock-out-frequency : clock frequency in Hz on the CLKOUT pin. - If not specified or if the specified value is 0, the CLKOUT pin - will be disabled. - -- nxp,no-comparator-bypass : Allows to disable the CAN input comperator. - -For futher information, please have a look to the SJA1000 data sheet. - -Examples: - -can@3,100 { - compatible = "nxp,sja1000"; - reg = <3 0x100 0x80>; - interrupts = <2 0>; - interrupt-parent = <&mpic>; - nxp,external-clock-frequency = <16000000>; -}; - diff --git a/Documentation/powerpc/dts-bindings/ecm.txt b/Documentation/powerpc/dts-bindings/ecm.txt deleted file mode 100644 index f514f29c67d6..000000000000 --- a/Documentation/powerpc/dts-bindings/ecm.txt +++ /dev/null @@ -1,64 +0,0 @@ -===================================================================== -E500 LAW & Coherency Module Device Tree Binding -Copyright (C) 2009 Freescale Semiconductor Inc. -===================================================================== - -Local Access Window (LAW) Node - -The LAW node represents the region of CCSR space where local access -windows are configured. For ECM based devices this is the first 4k -of CCSR space that includes CCSRBAR, ALTCBAR, ALTCAR, BPTR, and some -number of local access windows as specified by fsl,num-laws. - -PROPERTIES - - - compatible - Usage: required - Value type: - Definition: Must include "fsl,ecm-law" - - - reg - Usage: required - Value type: - Definition: A standard property. The value specifies the - physical address offset and length of the CCSR space - registers. - - - fsl,num-laws - Usage: required - Value type: - Definition: The value specifies the number of local access - windows for this device. - -===================================================================== - -E500 Coherency Module Node - -The E500 LAW node represents the region of CCSR space where ECM config -and error reporting registers exist, this is the second 4k (0x1000) -of CCSR space. - -PROPERTIES - - - compatible - Usage: required - Value type: - Definition: Must include "fsl,CHIP-ecm", "fsl,ecm" where - CHIP is the processor (mpc8572, mpc8544, etc.) - - - reg - Usage: required - Value type: - Definition: A standard property. The value specifies the - physical address offset and length of the CCSR space - registers. - - - interrupts - Usage: required - Value type: - - - interrupt-parent - Usage: required - Value type: - -===================================================================== diff --git a/Documentation/powerpc/dts-bindings/eeprom.txt b/Documentation/powerpc/dts-bindings/eeprom.txt deleted file mode 100644 index 4342c10de1bf..000000000000 --- a/Documentation/powerpc/dts-bindings/eeprom.txt +++ /dev/null @@ -1,28 +0,0 @@ -EEPROMs (I2C) - -Required properties: - - - compatible : should be "," - If there is no specific driver for , a generic - driver based on is selected. Possible types are: - 24c00, 24c01, 24c02, 24c04, 24c08, 24c16, 24c32, 24c64, - 24c128, 24c256, 24c512, 24c1024, spd - - - reg : the I2C address of the EEPROM - -Optional properties: - - - pagesize : the length of the pagesize for writing. Please consult the - manual of your device, that value varies a lot. A wrong value - may result in data loss! If not specified, a safety value of - '1' is used which will be very slow. - - - read-only: this parameterless property disables writes to the eeprom - -Example: - -eeprom@52 { - compatible = "atmel,24c32"; - reg = <0x52>; - pagesize = <32>; -}; diff --git a/Documentation/powerpc/dts-bindings/fsl/83xx-512x-pci.txt b/Documentation/powerpc/dts-bindings/fsl/83xx-512x-pci.txt deleted file mode 100644 index 35a465362408..000000000000 --- a/Documentation/powerpc/dts-bindings/fsl/83xx-512x-pci.txt +++ /dev/null @@ -1,40 +0,0 @@ -* Freescale 83xx and 512x PCI bridges - -Freescale 83xx and 512x SOCs include the same pci bridge core. - -83xx/512x specific notes: -- reg: should contain two address length tuples - The first is for the internal pci bridge registers - The second is for the pci config space access registers - -Example (MPC8313ERDB) - pci0: pci@e0008500 { - cell-index = <1>; - interrupt-map-mask = <0xf800 0x0 0x0 0x7>; - interrupt-map = < - /* IDSEL 0x0E -mini PCI */ - 0x7000 0x0 0x0 0x1 &ipic 18 0x8 - 0x7000 0x0 0x0 0x2 &ipic 18 0x8 - 0x7000 0x0 0x0 0x3 &ipic 18 0x8 - 0x7000 0x0 0x0 0x4 &ipic 18 0x8 - - /* IDSEL 0x0F - PCI slot */ - 0x7800 0x0 0x0 0x1 &ipic 17 0x8 - 0x7800 0x0 0x0 0x2 &ipic 18 0x8 - 0x7800 0x0 0x0 0x3 &ipic 17 0x8 - 0x7800 0x0 0x0 0x4 &ipic 18 0x8>; - interrupt-parent = <&ipic>; - interrupts = <66 0x8>; - bus-range = <0x0 0x0>; - ranges = <0x02000000 0x0 0x90000000 0x90000000 0x0 0x10000000 - 0x42000000 0x0 0x80000000 0x80000000 0x0 0x10000000 - 0x01000000 0x0 0x00000000 0xe2000000 0x0 0x00100000>; - clock-frequency = <66666666>; - #interrupt-cells = <1>; - #size-cells = <2>; - #address-cells = <3>; - reg = <0xe0008500 0x100 /* internal registers */ - 0xe0008300 0x8>; /* config space access registers */ - compatible = "fsl,mpc8349-pci"; - device_type = "pci"; - }; diff --git a/Documentation/powerpc/dts-bindings/fsl/8xxx_gpio.txt b/Documentation/powerpc/dts-bindings/fsl/8xxx_gpio.txt deleted file mode 100644 index b0019eb5330e..000000000000 --- a/Documentation/powerpc/dts-bindings/fsl/8xxx_gpio.txt +++ /dev/null @@ -1,60 +0,0 @@ -GPIO controllers on MPC8xxx SoCs - -This is for the non-QE/CPM/GUTs GPIO controllers as found on -8349, 8572, 8610 and compatible. - -Every GPIO controller node must have #gpio-cells property defined, -this information will be used to translate gpio-specifiers. - -Required properties: -- compatible : "fsl,-gpio" followed by "fsl,mpc8349-gpio" for - 83xx, "fsl,mpc8572-gpio" for 85xx and "fsl,mpc8610-gpio" for 86xx. -- #gpio-cells : Should be two. The first cell is the pin number and the - second cell is used to specify optional parameters (currently unused). - - interrupts : Interrupt mapping for GPIO IRQ. - - interrupt-parent : Phandle for the interrupt controller that - services interrupts for this device. -- gpio-controller : Marks the port as GPIO controller. - -Example of gpio-controller nodes for a MPC8347 SoC: - - gpio1: gpio-controller@c00 { - #gpio-cells = <2>; - compatible = "fsl,mpc8347-gpio", "fsl,mpc8349-gpio"; - reg = <0xc00 0x100>; - interrupts = <74 0x8>; - interrupt-parent = <&ipic>; - gpio-controller; - }; - - gpio2: gpio-controller@d00 { - #gpio-cells = <2>; - compatible = "fsl,mpc8347-gpio", "fsl,mpc8349-gpio"; - reg = <0xd00 0x100>; - interrupts = <75 0x8>; - interrupt-parent = <&ipic>; - gpio-controller; - }; - -See booting-without-of.txt for details of how to specify GPIO -information for devices. - -To use GPIO pins as interrupt sources for peripherals, specify the -GPIO controller as the interrupt parent and define GPIO number + -trigger mode using the interrupts property, which is defined like -this: - -interrupts = , where: - - number: GPIO pin (0..31) - - trigger: trigger mode: - 2 = trigger on falling edge - 3 = trigger on both edges - -Example of device using this is: - - funkyfpga@0 { - compatible = "funky-fpga"; - ... - interrupts = <4 3>; - interrupt-parent = <&gpio1>; - }; diff --git a/Documentation/powerpc/dts-bindings/fsl/board.txt b/Documentation/powerpc/dts-bindings/fsl/board.txt deleted file mode 100644 index 39e941515a36..000000000000 --- a/Documentation/powerpc/dts-bindings/fsl/board.txt +++ /dev/null @@ -1,63 +0,0 @@ -* Board Control and Status (BCSR) - -Required properties: - - - compatible : Should be "fsl,-bcsr" - - reg : Offset and length of the register set for the device - -Example: - - bcsr@f8000000 { - compatible = "fsl,mpc8360mds-bcsr"; - reg = ; - }; - -* Freescale on board FPGA - -This is the memory-mapped registers for on board FPGA. - -Required properities: -- compatible : should be "fsl,fpga-pixis". -- reg : should contain the address and the length of the FPPGA register - set. -- interrupt-parent: should specify phandle for the interrupt controller. -- interrupts : should specify event (wakeup) IRQ. - -Example (MPC8610HPCD): - - board-control@e8000000 { - compatible = "fsl,fpga-pixis"; - reg = <0xe8000000 32>; - interrupt-parent = <&mpic>; - interrupts = <8 8>; - }; - -* Freescale BCSR GPIO banks - -Some BCSR registers act as simple GPIO controllers, each such -register can be represented by the gpio-controller node. - -Required properities: -- compatible : Should be "fsl,-bcsr-gpio". -- reg : Should contain the address and the length of the GPIO bank - register. -- #gpio-cells : Should be two. The first cell is the pin number and the - second cell is used to specify optional parameters (currently unused). -- gpio-controller : Marks the port as GPIO controller. - -Example: - - bcsr@1,0 { - #address-cells = <1>; - #size-cells = <1>; - compatible = "fsl,mpc8360mds-bcsr"; - reg = <1 0 0x8000>; - ranges = <0 1 0 0x8000>; - - bcsr13: gpio-controller@d { - #gpio-cells = <2>; - compatible = "fsl,mpc8360mds-bcsr-gpio"; - reg = <0xd 1>; - gpio-controller; - }; - }; diff --git a/Documentation/powerpc/dts-bindings/fsl/can.txt b/Documentation/powerpc/dts-bindings/fsl/can.txt deleted file mode 100644 index 2fa4fcd38fd6..000000000000 --- a/Documentation/powerpc/dts-bindings/fsl/can.txt +++ /dev/null @@ -1,53 +0,0 @@ -CAN Device Tree Bindings ------------------------- - -(c) 2006-2009 Secret Lab Technologies Ltd -Grant Likely - -fsl,mpc5200-mscan nodes ------------------------ -In addition to the required compatible-, reg- and interrupt-properties, you can -also specify which clock source shall be used for the controller: - -- fsl,mscan-clock-source : a string describing the clock source. Valid values - are: "ip" for ip bus clock - "ref" for reference clock (XTAL) - "ref" is default in case this property is not - present. - -fsl,mpc5121-mscan nodes ------------------------ -In addition to the required compatible-, reg- and interrupt-properties, you can -also specify which clock source and divider shall be used for the controller: - -- fsl,mscan-clock-source : a string describing the clock source. Valid values - are: "ip" for ip bus clock - "ref" for reference clock - "sys" for system clock - If this property is not present, an optimal CAN - clock source and frequency based on the system - clock will be selected. If this is not possible, - the reference clock will be used. - -- fsl,mscan-clock-divider: for the reference and system clock, an additional - clock divider can be specified. By default, a - value of 1 is used. - -Note that the MPC5121 Rev. 1 processor is not supported. - -Examples: - can@1300 { - compatible = "fsl,mpc5121-mscan"; - interrupts = <12 0x8>; - interrupt-parent = <&ipic>; - reg = <0x1300 0x80>; - }; - - can@1380 { - compatible = "fsl,mpc5121-mscan"; - interrupts = <13 0x8>; - interrupt-parent = <&ipic>; - reg = <0x1380 0x80>; - fsl,mscan-clock-source = "ref"; - fsl,mscan-clock-divider = <3>; - }; diff --git a/Documentation/powerpc/dts-bindings/fsl/cpm_qe/cpm.txt b/Documentation/powerpc/dts-bindings/fsl/cpm_qe/cpm.txt deleted file mode 100644 index 160c752484b4..000000000000 --- a/Documentation/powerpc/dts-bindings/fsl/cpm_qe/cpm.txt +++ /dev/null @@ -1,67 +0,0 @@ -* Freescale Communications Processor Module - -NOTE: This is an interim binding, and will likely change slightly, -as more devices are supported. The QE bindings especially are -incomplete. - -* Root CPM node - -Properties: -- compatible : "fsl,cpm1", "fsl,cpm2", or "fsl,qe". -- reg : A 48-byte region beginning with CPCR. - -Example: - cpm@119c0 { - #address-cells = <1>; - #size-cells = <1>; - #interrupt-cells = <2>; - compatible = "fsl,mpc8272-cpm", "fsl,cpm2"; - reg = <119c0 30>; - } - -* Properties common to multiple CPM/QE devices - -- fsl,cpm-command : This value is ORed with the opcode and command flag - to specify the device on which a CPM command operates. - -- fsl,cpm-brg : Indicates which baud rate generator the device - is associated with. If absent, an unused BRG - should be dynamically allocated. If zero, the - device uses an external clock rather than a BRG. - -- reg : Unless otherwise specified, the first resource represents the - scc/fcc/ucc registers, and the second represents the device's - parameter RAM region (if it has one). - -* Multi-User RAM (MURAM) - -The multi-user/dual-ported RAM is expressed as a bus under the CPM node. - -Ranges must be set up subject to the following restrictions: - -- Children's reg nodes must be offsets from the start of all muram, even - if the user-data area does not begin at zero. -- If multiple range entries are used, the difference between the parent - address and the child address must be the same in all, so that a single - mapping can cover them all while maintaining the ability to determine - CPM-side offsets with pointer subtraction. It is recommended that - multiple range entries not be used. -- A child address of zero must be translatable, even if no reg resources - contain it. - -A child "data" node must exist, compatible with "fsl,cpm-muram-data", to -indicate the portion of muram that is usable by the OS for arbitrary -purposes. The data node may have an arbitrary number of reg resources, -all of which contribute to the allocatable muram pool. - -Example, based on mpc8272: - muram@0 { - #address-cells = <1>; - #size-cells = <1>; - ranges = <0 0 10000>; - - data@0 { - compatible = "fsl,cpm-muram-data"; - reg = <0 2000 9800 800>; - }; - }; diff --git a/Documentation/powerpc/dts-bindings/fsl/cpm_qe/cpm/brg.txt b/Documentation/powerpc/dts-bindings/fsl/cpm_qe/cpm/brg.txt deleted file mode 100644 index 4c7d45eaf025..000000000000 --- a/Documentation/powerpc/dts-bindings/fsl/cpm_qe/cpm/brg.txt +++ /dev/null @@ -1,21 +0,0 @@ -* Baud Rate Generators - -Currently defined compatibles: -fsl,cpm-brg -fsl,cpm1-brg -fsl,cpm2-brg - -Properties: -- reg : There may be an arbitrary number of reg resources; BRG - numbers are assigned to these in order. -- clock-frequency : Specifies the base frequency driving - the BRG. - -Example: - brg@119f0 { - compatible = "fsl,mpc8272-brg", - "fsl,cpm2-brg", - "fsl,cpm-brg"; - reg = <119f0 10 115f0 10>; - clock-frequency = ; - }; diff --git a/Documentation/powerpc/dts-bindings/fsl/cpm_qe/cpm/i2c.txt b/Documentation/powerpc/dts-bindings/fsl/cpm_qe/cpm/i2c.txt deleted file mode 100644 index 87bc6048667e..000000000000 --- a/Documentation/powerpc/dts-bindings/fsl/cpm_qe/cpm/i2c.txt +++ /dev/null @@ -1,41 +0,0 @@ -* I2C - -The I2C controller is expressed as a bus under the CPM node. - -Properties: -- compatible : "fsl,cpm1-i2c", "fsl,cpm2-i2c" -- reg : On CPM2 devices, the second resource doesn't specify the I2C - Parameter RAM itself, but the I2C_BASE field of the CPM2 Parameter RAM - (typically 0x8afc 0x2). -- #address-cells : Should be one. The cell is the i2c device address with - the r/w bit set to zero. -- #size-cells : Should be zero. -- clock-frequency : Can be used to set the i2c clock frequency. If - unspecified, a default frequency of 60kHz is being used. -The following two properties are deprecated. They are only used by legacy -i2c drivers to find the bus to probe: -- linux,i2c-index : Can be used to hard code an i2c bus number. By default, - the bus number is dynamically assigned by the i2c core. -- linux,i2c-class : Can be used to override the i2c class. The class is used - by legacy i2c device drivers to find a bus in a specific context like - system management, video or sound. By default, I2C_CLASS_HWMON (1) is - being used. The definition of the classes can be found in - include/i2c/i2c.h - -Example, based on mpc823: - - i2c@860 { - compatible = "fsl,mpc823-i2c", - "fsl,cpm1-i2c"; - reg = <0x860 0x20 0x3c80 0x30>; - interrupts = <16>; - interrupt-parent = <&CPM_PIC>; - fsl,cpm-command = <0x10>; - #address-cells = <1>; - #size-cells = <0>; - - rtc@68 { - compatible = "dallas,ds1307"; - reg = <0x68>; - }; - }; diff --git a/Documentation/powerpc/dts-bindings/fsl/cpm_qe/cpm/pic.txt b/Documentation/powerpc/dts-bindings/fsl/cpm_qe/cpm/pic.txt deleted file mode 100644 index 8e3ee1681618..000000000000 --- a/Documentation/powerpc/dts-bindings/fsl/cpm_qe/cpm/pic.txt +++ /dev/null @@ -1,18 +0,0 @@ -* Interrupt Controllers - -Currently defined compatibles: -- fsl,cpm1-pic - - only one interrupt cell -- fsl,pq1-pic -- fsl,cpm2-pic - - second interrupt cell is level/sense: - - 2 is falling edge - - 8 is active low - -Example: - interrupt-controller@10c00 { - #interrupt-cells = <2>; - interrupt-controller; - reg = <10c00 80>; - compatible = "mpc8272-pic", "fsl,cpm2-pic"; - }; diff --git a/Documentation/powerpc/dts-bindings/fsl/cpm_qe/cpm/usb.txt b/Documentation/powerpc/dts-bindings/fsl/cpm_qe/cpm/usb.txt deleted file mode 100644 index 74bfda4bb824..000000000000 --- a/Documentation/powerpc/dts-bindings/fsl/cpm_qe/cpm/usb.txt +++ /dev/null @@ -1,15 +0,0 @@ -* USB (Universal Serial Bus Controller) - -Properties: -- compatible : "fsl,cpm1-usb", "fsl,cpm2-usb", "fsl,qe-usb" - -Example: - usb@11bc0 { - #address-cells = <1>; - #size-cells = <0>; - compatible = "fsl,cpm2-usb"; - reg = <11b60 18 8b00 100>; - interrupts = ; - interrupt-parent = <&PIC>; - fsl,cpm-command = <2e600000>; - }; diff --git a/Documentation/powerpc/dts-bindings/fsl/cpm_qe/gpio.txt b/Documentation/powerpc/dts-bindings/fsl/cpm_qe/gpio.txt deleted file mode 100644 index 349f79fd7076..000000000000 --- a/Documentation/powerpc/dts-bindings/fsl/cpm_qe/gpio.txt +++ /dev/null @@ -1,38 +0,0 @@ -Every GPIO controller node must have #gpio-cells property defined, -this information will be used to translate gpio-specifiers. - -On CPM1 devices, all ports are using slightly different register layouts. -Ports A, C and D are 16bit ports and Ports B and E are 32bit ports. - -On CPM2 devices, all ports are 32bit ports and use a common register layout. - -Required properties: -- compatible : "fsl,cpm1-pario-bank-a", "fsl,cpm1-pario-bank-b", - "fsl,cpm1-pario-bank-c", "fsl,cpm1-pario-bank-d", - "fsl,cpm1-pario-bank-e", "fsl,cpm2-pario-bank" -- #gpio-cells : Should be two. The first cell is the pin number and the - second cell is used to specify optional parameters (currently unused). -- gpio-controller : Marks the port as GPIO controller. - -Example of three SOC GPIO banks defined as gpio-controller nodes: - - CPM1_PIO_A: gpio-controller@950 { - #gpio-cells = <2>; - compatible = "fsl,cpm1-pario-bank-a"; - reg = <0x950 0x10>; - gpio-controller; - }; - - CPM1_PIO_B: gpio-controller@ab8 { - #gpio-cells = <2>; - compatible = "fsl,cpm1-pario-bank-b"; - reg = <0xab8 0x10>; - gpio-controller; - }; - - CPM1_PIO_E: gpio-controller@ac8 { - #gpio-cells = <2>; - compatible = "fsl,cpm1-pario-bank-e"; - reg = <0xac8 0x18>; - gpio-controller; - }; diff --git a/Documentation/powerpc/dts-bindings/fsl/cpm_qe/network.txt b/Documentation/powerpc/dts-bindings/fsl/cpm_qe/network.txt deleted file mode 100644 index 0e4269446580..000000000000 --- a/Documentation/powerpc/dts-bindings/fsl/cpm_qe/network.txt +++ /dev/null @@ -1,45 +0,0 @@ -* Network - -Currently defined compatibles: -- fsl,cpm1-scc-enet -- fsl,cpm2-scc-enet -- fsl,cpm1-fec-enet -- fsl,cpm2-fcc-enet (third resource is GFEMR) -- fsl,qe-enet - -Example: - - ethernet@11300 { - device_type = "network"; - compatible = "fsl,mpc8272-fcc-enet", - "fsl,cpm2-fcc-enet"; - reg = <11300 20 8400 100 11390 1>; - local-mac-address = [ 00 00 00 00 00 00 ]; - interrupts = <20 8>; - interrupt-parent = <&PIC>; - phy-handle = <&PHY0>; - fsl,cpm-command = <12000300>; - }; - -* MDIO - -Currently defined compatibles: -fsl,pq1-fec-mdio (reg is same as first resource of FEC device) -fsl,cpm2-mdio-bitbang (reg is port C registers) - -Properties for fsl,cpm2-mdio-bitbang: -fsl,mdio-pin : pin of port C controlling mdio data -fsl,mdc-pin : pin of port C controlling mdio clock - -Example: - mdio@10d40 { - device_type = "mdio"; - compatible = "fsl,mpc8272ads-mdio-bitbang", - "fsl,mpc8272-mdio-bitbang", - "fsl,cpm2-mdio-bitbang"; - reg = <10d40 14>; - #address-cells = <1>; - #size-cells = <0>; - fsl,mdio-pin = <12>; - fsl,mdc-pin = <13>; - }; diff --git a/Documentation/powerpc/dts-bindings/fsl/cpm_qe/qe.txt b/Documentation/powerpc/dts-bindings/fsl/cpm_qe/qe.txt deleted file mode 100644 index 4f8930263dd9..000000000000 --- a/Documentation/powerpc/dts-bindings/fsl/cpm_qe/qe.txt +++ /dev/null @@ -1,115 +0,0 @@ -* Freescale QUICC Engine module (QE) -This represents qe module that is installed on PowerQUICC II Pro. - -NOTE: This is an interim binding; it should be updated to fit -in with the CPM binding later in this document. - -Basically, it is a bus of devices, that could act more or less -as a complete entity (UCC, USB etc ). All of them should be siblings on -the "root" qe node, using the common properties from there. -The description below applies to the qe of MPC8360 and -more nodes and properties would be extended in the future. - -i) Root QE device - -Required properties: -- compatible : should be "fsl,qe"; -- model : precise model of the QE, Can be "QE", "CPM", or "CPM2" -- reg : offset and length of the device registers. -- bus-frequency : the clock frequency for QUICC Engine. -- fsl,qe-num-riscs: define how many RISC engines the QE has. -- fsl,qe-num-snums: define how many serial number(SNUM) the QE can use for the - threads. - -Optional properties: -- fsl,firmware-phandle: - Usage: required only if there is no fsl,qe-firmware child node - Value type: - Definition: Points to a firmware node (see "QE Firmware Node" below) - that contains the firmware that should be uploaded for this QE. - The compatible property for the firmware node should say, - "fsl,qe-firmware". - -Recommended properties -- brg-frequency : the internal clock source frequency for baud-rate - generators in Hz. - -Example: - qe@e0100000 { - #address-cells = <1>; - #size-cells = <1>; - #interrupt-cells = <2>; - compatible = "fsl,qe"; - ranges = <0 e0100000 00100000>; - reg = ; - brg-frequency = <0>; - bus-frequency = <179A7B00>; - } - -* Multi-User RAM (MURAM) - -Required properties: -- compatible : should be "fsl,qe-muram", "fsl,cpm-muram". -- mode : the could be "host" or "slave". -- ranges : Should be defined as specified in 1) to describe the - translation of MURAM addresses. -- data-only : sub-node which defines the address area under MURAM - bus that can be allocated as data/parameter - -Example: - - muram@10000 { - compatible = "fsl,qe-muram", "fsl,cpm-muram"; - ranges = <0 00010000 0000c000>; - - data-only@0{ - compatible = "fsl,qe-muram-data", - "fsl,cpm-muram-data"; - reg = <0 c000>; - }; - }; - -* QE Firmware Node - -This node defines a firmware binary that is embedded in the device tree, for -the purpose of passing the firmware from bootloader to the kernel, or from -the hypervisor to the guest. - -The firmware node itself contains the firmware binary contents, a compatible -property, and any firmware-specific properties. The node should be placed -inside a QE node that needs it. Doing so eliminates the need for a -fsl,firmware-phandle property. Other QE nodes that need the same firmware -should define an fsl,firmware-phandle property that points to the firmware node -in the first QE node. - -The fsl,firmware property can be specified in the DTS (possibly using incbin) -or can be inserted by the boot loader at boot time. - -Required properties: - - compatible - Usage: required - Value type: - Definition: A standard property. Specify a string that indicates what - kind of firmware it is. For QE, this should be "fsl,qe-firmware". - - - fsl,firmware - Usage: required - Value type: , encoded as an array of bytes - Definition: A standard property. This property contains the firmware - binary "blob". - -Example: - qe1@e0080000 { - compatible = "fsl,qe"; - qe_firmware:qe-firmware { - compatible = "fsl,qe-firmware"; - fsl,firmware = [0x70 0xcd 0x00 0x00 0x01 0x46 0x45 ...]; - }; - ... - }; - - qe2@e0090000 { - compatible = "fsl,qe"; - fsl,firmware-phandle = <&qe_firmware>; - ... - }; diff --git a/Documentation/powerpc/dts-bindings/fsl/cpm_qe/qe/firmware.txt b/Documentation/powerpc/dts-bindings/fsl/cpm_qe/qe/firmware.txt deleted file mode 100644 index 249db3a15d15..000000000000 --- a/Documentation/powerpc/dts-bindings/fsl/cpm_qe/qe/firmware.txt +++ /dev/null @@ -1,24 +0,0 @@ -* Uploaded QE firmware - - If a new firmware has been uploaded to the QE (usually by the - boot loader), then a 'firmware' child node should be added to the QE - node. This node provides information on the uploaded firmware that - device drivers may need. - - Required properties: - - id: The string name of the firmware. This is taken from the 'id' - member of the qe_firmware structure of the uploaded firmware. - Device drivers can search this string to determine if the - firmware they want is already present. - - extended-modes: The Extended Modes bitfield, taken from the - firmware binary. It is a 64-bit number represented - as an array of two 32-bit numbers. - - virtual-traps: The virtual traps, taken from the firmware binary. - It is an array of 8 32-bit numbers. - -Example: - firmware { - id = "Soft-UART"; - extended-modes = <0 0>; - virtual-traps = <0 0 0 0 0 0 0 0>; - }; diff --git a/Documentation/powerpc/dts-bindings/fsl/cpm_qe/qe/par_io.txt b/Documentation/powerpc/dts-bindings/fsl/cpm_qe/qe/par_io.txt deleted file mode 100644 index 60984260207b..000000000000 --- a/Documentation/powerpc/dts-bindings/fsl/cpm_qe/qe/par_io.txt +++ /dev/null @@ -1,51 +0,0 @@ -* Parallel I/O Ports - -This node configures Parallel I/O ports for CPUs with QE support. -The node should reside in the "soc" node of the tree. For each -device that using parallel I/O ports, a child node should be created. -See the definition of the Pin configuration nodes below for more -information. - -Required properties: -- device_type : should be "par_io". -- reg : offset to the register set and its length. -- num-ports : number of Parallel I/O ports - -Example: -par_io@1400 { - reg = <1400 100>; - #address-cells = <1>; - #size-cells = <0>; - device_type = "par_io"; - num-ports = <7>; - ucc_pin@01 { - ...... - }; - -Note that "par_io" nodes are obsolete, and should not be used for -the new device trees. Instead, each Par I/O bank should be represented -via its own gpio-controller node: - -Required properties: -- #gpio-cells : should be "2". -- compatible : should be "fsl,-qe-pario-bank", - "fsl,mpc8323-qe-pario-bank". -- reg : offset to the register set and its length. -- gpio-controller : node to identify gpio controllers. - -Example: - qe_pio_a: gpio-controller@1400 { - #gpio-cells = <2>; - compatible = "fsl,mpc8360-qe-pario-bank", - "fsl,mpc8323-qe-pario-bank"; - reg = <0x1400 0x18>; - gpio-controller; - }; - - qe_pio_e: gpio-controller@1460 { - #gpio-cells = <2>; - compatible = "fsl,mpc8360-qe-pario-bank", - "fsl,mpc8323-qe-pario-bank"; - reg = <0x1460 0x18>; - gpio-controller; - }; diff --git a/Documentation/powerpc/dts-bindings/fsl/cpm_qe/qe/pincfg.txt b/Documentation/powerpc/dts-bindings/fsl/cpm_qe/qe/pincfg.txt deleted file mode 100644 index c5b43061db3a..000000000000 --- a/Documentation/powerpc/dts-bindings/fsl/cpm_qe/qe/pincfg.txt +++ /dev/null @@ -1,60 +0,0 @@ -* Pin configuration nodes - -Required properties: -- linux,phandle : phandle of this node; likely referenced by a QE - device. -- pio-map : array of pin configurations. Each pin is defined by 6 - integers. The six numbers are respectively: port, pin, dir, - open_drain, assignment, has_irq. - - port : port number of the pin; 0-6 represent port A-G in UM. - - pin : pin number in the port. - - dir : direction of the pin, should encode as follows: - - 0 = The pin is disabled - 1 = The pin is an output - 2 = The pin is an input - 3 = The pin is I/O - - - open_drain : indicates the pin is normal or wired-OR: - - 0 = The pin is actively driven as an output - 1 = The pin is an open-drain driver. As an output, the pin is - driven active-low, otherwise it is three-stated. - - - assignment : function number of the pin according to the Pin Assignment - tables in User Manual. Each pin can have up to 4 possible functions in - QE and two options for CPM. - - has_irq : indicates if the pin is used as source of external - interrupts. - -Example: - ucc_pin@01 { - linux,phandle = <140001>; - pio-map = < - /* port pin dir open_drain assignment has_irq */ - 0 3 1 0 1 0 /* TxD0 */ - 0 4 1 0 1 0 /* TxD1 */ - 0 5 1 0 1 0 /* TxD2 */ - 0 6 1 0 1 0 /* TxD3 */ - 1 6 1 0 3 0 /* TxD4 */ - 1 7 1 0 1 0 /* TxD5 */ - 1 9 1 0 2 0 /* TxD6 */ - 1 a 1 0 2 0 /* TxD7 */ - 0 9 2 0 1 0 /* RxD0 */ - 0 a 2 0 1 0 /* RxD1 */ - 0 b 2 0 1 0 /* RxD2 */ - 0 c 2 0 1 0 /* RxD3 */ - 0 d 2 0 1 0 /* RxD4 */ - 1 1 2 0 2 0 /* RxD5 */ - 1 0 2 0 2 0 /* RxD6 */ - 1 4 2 0 2 0 /* RxD7 */ - 0 7 1 0 1 0 /* TX_EN */ - 0 8 1 0 1 0 /* TX_ER */ - 0 f 2 0 1 0 /* RX_DV */ - 0 10 2 0 1 0 /* RX_ER */ - 0 0 2 0 1 0 /* RX_CLK */ - 2 9 1 0 3 0 /* GTX_CLK - CLK10 */ - 2 8 2 0 1 0>; /* GTX125 - CLK9 */ - }; - - diff --git a/Documentation/powerpc/dts-bindings/fsl/cpm_qe/qe/ucc.txt b/Documentation/powerpc/dts-bindings/fsl/cpm_qe/qe/ucc.txt deleted file mode 100644 index e47734bee3f0..000000000000 --- a/Documentation/powerpc/dts-bindings/fsl/cpm_qe/qe/ucc.txt +++ /dev/null @@ -1,70 +0,0 @@ -* UCC (Unified Communications Controllers) - -Required properties: -- device_type : should be "network", "hldc", "uart", "transparent" - "bisync", "atm", or "serial". -- compatible : could be "ucc_geth" or "fsl_atm" and so on. -- cell-index : the ucc number(1-8), corresponding to UCCx in UM. -- reg : Offset and length of the register set for the device -- interrupts : where a is the interrupt number and b is a - field that represents an encoding of the sense and level - information for the interrupt. This should be encoded based on - the information in section 2) depending on the type of interrupt - controller you have. -- interrupt-parent : the phandle for the interrupt controller that - services interrupts for this device. -- pio-handle : The phandle for the Parallel I/O port configuration. -- port-number : for UART drivers, the port number to use, between 0 and 3. - This usually corresponds to the /dev/ttyQE device, e.g. <0> = /dev/ttyQE0. - The port number is added to the minor number of the device. Unlike the - CPM UART driver, the port-number is required for the QE UART driver. -- soft-uart : for UART drivers, if specified this means the QE UART device - driver should use "Soft-UART" mode, which is needed on some SOCs that have - broken UART hardware. Soft-UART is provided via a microcode upload. -- rx-clock-name: the UCC receive clock source - "none": clock source is disabled - "brg1" through "brg16": clock source is BRG1-BRG16, respectively - "clk1" through "clk24": clock source is CLK1-CLK24, respectively -- tx-clock-name: the UCC transmit clock source - "none": clock source is disabled - "brg1" through "brg16": clock source is BRG1-BRG16, respectively - "clk1" through "clk24": clock source is CLK1-CLK24, respectively -The following two properties are deprecated. rx-clock has been replaced -with rx-clock-name, and tx-clock has been replaced with tx-clock-name. -Drivers that currently use the deprecated properties should continue to -do so, in order to support older device trees, but they should be updated -to check for the new properties first. -- rx-clock : represents the UCC receive clock source. - 0x00 : clock source is disabled; - 0x1~0x10 : clock source is BRG1~BRG16 respectively; - 0x11~0x28: clock source is QE_CLK1~QE_CLK24 respectively. -- tx-clock: represents the UCC transmit clock source; - 0x00 : clock source is disabled; - 0x1~0x10 : clock source is BRG1~BRG16 respectively; - 0x11~0x28: clock source is QE_CLK1~QE_CLK24 respectively. - -Required properties for network device_type: -- mac-address : list of bytes representing the ethernet address. -- phy-handle : The phandle for the PHY connected to this controller. - -Recommended properties: -- phy-connection-type : a string naming the controller/PHY interface type, - i.e., "mii" (default), "rmii", "gmii", "rgmii", "rgmii-id" (Internal - Delay), "rgmii-txid" (delay on TX only), "rgmii-rxid" (delay on RX only), - "tbi", or "rtbi". - -Example: - ucc@2000 { - device_type = "network"; - compatible = "ucc_geth"; - cell-index = <1>; - reg = <2000 200>; - interrupts = ; - interrupt-parent = <700>; - mac-address = [ 00 04 9f 00 23 23 ]; - rx-clock = "none"; - tx-clock = "clk9"; - phy-handle = <212000>; - phy-connection-type = "gmii"; - pio-handle = <140001>; - }; diff --git a/Documentation/powerpc/dts-bindings/fsl/cpm_qe/qe/usb.txt b/Documentation/powerpc/dts-bindings/fsl/cpm_qe/qe/usb.txt deleted file mode 100644 index 9ccd5f30405b..000000000000 --- a/Documentation/powerpc/dts-bindings/fsl/cpm_qe/qe/usb.txt +++ /dev/null @@ -1,37 +0,0 @@ -Freescale QUICC Engine USB Controller - -Required properties: -- compatible : should be "fsl,-qe-usb", "fsl,mpc8323-qe-usb". -- reg : the first two cells should contain usb registers location and - length, the next two two cells should contain PRAM location and - length. -- interrupts : should contain USB interrupt. -- interrupt-parent : interrupt source phandle. -- fsl,fullspeed-clock : specifies the full speed USB clock source: - "none": clock source is disabled - "brg1" through "brg16": clock source is BRG1-BRG16, respectively - "clk1" through "clk24": clock source is CLK1-CLK24, respectively -- fsl,lowspeed-clock : specifies the low speed USB clock source: - "none": clock source is disabled - "brg1" through "brg16": clock source is BRG1-BRG16, respectively - "clk1" through "clk24": clock source is CLK1-CLK24, respectively -- hub-power-budget : USB power budget for the root hub, in mA. -- gpios : should specify GPIOs in this order: USBOE, USBTP, USBTN, USBRP, - USBRN, SPEED (optional), and POWER (optional). - -Example: - -usb@6c0 { - compatible = "fsl,mpc8360-qe-usb", "fsl,mpc8323-qe-usb"; - reg = <0x6c0 0x40 0x8b00 0x100>; - interrupts = <11>; - interrupt-parent = <&qeic>; - fsl,fullspeed-clock = "clk21"; - gpios = <&qe_pio_b 2 0 /* USBOE */ - &qe_pio_b 3 0 /* USBTP */ - &qe_pio_b 8 0 /* USBTN */ - &qe_pio_b 9 0 /* USBRP */ - &qe_pio_b 11 0 /* USBRN */ - &qe_pio_e 20 0 /* SPEED */ - &qe_pio_e 21 0 /* POWER */>; -}; diff --git a/Documentation/powerpc/dts-bindings/fsl/cpm_qe/serial.txt b/Documentation/powerpc/dts-bindings/fsl/cpm_qe/serial.txt deleted file mode 100644 index 2ea76d9d137c..000000000000 --- a/Documentation/powerpc/dts-bindings/fsl/cpm_qe/serial.txt +++ /dev/null @@ -1,32 +0,0 @@ -* Serial - -Currently defined compatibles: -- fsl,cpm1-smc-uart -- fsl,cpm2-smc-uart -- fsl,cpm1-scc-uart -- fsl,cpm2-scc-uart -- fsl,qe-uart - -Modem control lines connected to GPIO controllers are listed in the gpios -property as described in booting-without-of.txt, section IX.1 in the following -order: - -CTS, RTS, DCD, DSR, DTR, and RI. - -The gpios property is optional and can be left out when control lines are -not used. - -Example: - - serial@11a00 { - device_type = "serial"; - compatible = "fsl,mpc8272-scc-uart", - "fsl,cpm2-scc-uart"; - reg = <11a00 20 8000 100>; - interrupts = <28 8>; - interrupt-parent = <&PIC>; - fsl,cpm-brg = <1>; - fsl,cpm-command = <00800000>; - gpios = <&gpio_c 15 0 - &gpio_d 29 0>; - }; diff --git a/Documentation/powerpc/dts-bindings/fsl/diu.txt b/Documentation/powerpc/dts-bindings/fsl/diu.txt deleted file mode 100644 index b66cb6d31d69..000000000000 --- a/Documentation/powerpc/dts-bindings/fsl/diu.txt +++ /dev/null @@ -1,34 +0,0 @@ -* Freescale Display Interface Unit - -The Freescale DIU is a LCD controller, with proper hardware, it can also -drive DVI monitors. - -Required properties: -- compatible : should be "fsl,diu" or "fsl,mpc5121-diu". -- reg : should contain at least address and length of the DIU register - set. -- interrupts : one DIU interrupt should be described here. -- interrupt-parent : the phandle for the interrupt controller that - services interrupts for this device. - -Optional properties: -- edid : verbatim EDID data block describing attached display. - Data from the detailed timing descriptor will be used to - program the display controller. - -Example (MPC8610HPCD): - display@2c000 { - compatible = "fsl,diu"; - reg = <0x2c000 100>; - interrupts = <72 2>; - interrupt-parent = <&mpic>; - }; - -Example for MPC5121: - display@2100 { - compatible = "fsl,mpc5121-diu"; - reg = <0x2100 0x100>; - interrupts = <64 0x8>; - interrupt-parent = <&ipic>; - edid = [edid-data]; - }; diff --git a/Documentation/powerpc/dts-bindings/fsl/dma.txt b/Documentation/powerpc/dts-bindings/fsl/dma.txt deleted file mode 100644 index 2a4b4bce6110..000000000000 --- a/Documentation/powerpc/dts-bindings/fsl/dma.txt +++ /dev/null @@ -1,144 +0,0 @@ -* Freescale 83xx DMA Controller - -Freescale PowerPC 83xx have on chip general purpose DMA controllers. - -Required properties: - -- compatible : compatible list, contains 2 entries, first is - "fsl,CHIP-dma", where CHIP is the processor - (mpc8349, mpc8360, etc.) and the second is - "fsl,elo-dma" -- reg : -- ranges : Should be defined as specified in 1) to describe the - DMA controller channels. -- cell-index : controller index. 0 for controller @ 0x8100 -- interrupts : -- interrupt-parent : optional, if needed for interrupt mapping - - -- DMA channel nodes: - - compatible : compatible list, contains 2 entries, first is - "fsl,CHIP-dma-channel", where CHIP is the processor - (mpc8349, mpc8350, etc.) and the second is - "fsl,elo-dma-channel". However, see note below. - - reg : - - cell-index : dma channel index starts at 0. - -Optional properties: - - interrupts : - (on 83xx this is expected to be identical to - the interrupts property of the parent node) - - interrupt-parent : optional, if needed for interrupt mapping - -Example: - dma@82a8 { - #address-cells = <1>; - #size-cells = <1>; - compatible = "fsl,mpc8349-dma", "fsl,elo-dma"; - reg = <0x82a8 4>; - ranges = <0 0x8100 0x1a4>; - interrupt-parent = <&ipic>; - interrupts = <71 8>; - cell-index = <0>; - dma-channel@0 { - compatible = "fsl,mpc8349-dma-channel", "fsl,elo-dma-channel"; - cell-index = <0>; - reg = <0 0x80>; - interrupt-parent = <&ipic>; - interrupts = <71 8>; - }; - dma-channel@80 { - compatible = "fsl,mpc8349-dma-channel", "fsl,elo-dma-channel"; - cell-index = <1>; - reg = <0x80 0x80>; - interrupt-parent = <&ipic>; - interrupts = <71 8>; - }; - dma-channel@100 { - compatible = "fsl,mpc8349-dma-channel", "fsl,elo-dma-channel"; - cell-index = <2>; - reg = <0x100 0x80>; - interrupt-parent = <&ipic>; - interrupts = <71 8>; - }; - dma-channel@180 { - compatible = "fsl,mpc8349-dma-channel", "fsl,elo-dma-channel"; - cell-index = <3>; - reg = <0x180 0x80>; - interrupt-parent = <&ipic>; - interrupts = <71 8>; - }; - }; - -* Freescale 85xx/86xx DMA Controller - -Freescale PowerPC 85xx/86xx have on chip general purpose DMA controllers. - -Required properties: - -- compatible : compatible list, contains 2 entries, first is - "fsl,CHIP-dma", where CHIP is the processor - (mpc8540, mpc8540, etc.) and the second is - "fsl,eloplus-dma" -- reg : -- cell-index : controller index. 0 for controller @ 0x21000, - 1 for controller @ 0xc000 -- ranges : Should be defined as specified in 1) to describe the - DMA controller channels. - -- DMA channel nodes: - - compatible : compatible list, contains 2 entries, first is - "fsl,CHIP-dma-channel", where CHIP is the processor - (mpc8540, mpc8560, etc.) and the second is - "fsl,eloplus-dma-channel". However, see note below. - - cell-index : dma channel index starts at 0. - - reg : - - interrupts : - - interrupt-parent : optional, if needed for interrupt mapping - -Example: - dma@21300 { - #address-cells = <1>; - #size-cells = <1>; - compatible = "fsl,mpc8540-dma", "fsl,eloplus-dma"; - reg = <0x21300 4>; - ranges = <0 0x21100 0x200>; - cell-index = <0>; - dma-channel@0 { - compatible = "fsl,mpc8540-dma-channel", "fsl,eloplus-dma-channel"; - reg = <0 0x80>; - cell-index = <0>; - interrupt-parent = <&mpic>; - interrupts = <20 2>; - }; - dma-channel@80 { - compatible = "fsl,mpc8540-dma-channel", "fsl,eloplus-dma-channel"; - reg = <0x80 0x80>; - cell-index = <1>; - interrupt-parent = <&mpic>; - interrupts = <21 2>; - }; - dma-channel@100 { - compatible = "fsl,mpc8540-dma-channel", "fsl,eloplus-dma-channel"; - reg = <0x100 0x80>; - cell-index = <2>; - interrupt-parent = <&mpic>; - interrupts = <22 2>; - }; - dma-channel@180 { - compatible = "fsl,mpc8540-dma-channel", "fsl,eloplus-dma-channel"; - reg = <0x180 0x80>; - cell-index = <3>; - interrupt-parent = <&mpic>; - interrupts = <23 2>; - }; - }; - -Note on DMA channel compatible properties: The compatible property must say -"fsl,elo-dma-channel" or "fsl,eloplus-dma-channel" to be used by the Elo DMA -driver (fsldma). Any DMA channel used by fsldma cannot be used by another -DMA driver, such as the SSI sound drivers for the MPC8610. Therefore, any DMA -channel that should be used for another driver should not use -"fsl,elo-dma-channel" or "fsl,eloplus-dma-channel". For the SSI drivers, for -example, the compatible property should be "fsl,ssi-dma-channel". See ssi.txt -for more information. diff --git a/Documentation/powerpc/dts-bindings/fsl/esdhc.txt b/Documentation/powerpc/dts-bindings/fsl/esdhc.txt deleted file mode 100644 index 64bcb8be973c..000000000000 --- a/Documentation/powerpc/dts-bindings/fsl/esdhc.txt +++ /dev/null @@ -1,29 +0,0 @@ -* Freescale Enhanced Secure Digital Host Controller (eSDHC) - -The Enhanced Secure Digital Host Controller provides an interface -for MMC, SD, and SDIO types of memory cards. - -Required properties: - - compatible : should be - "fsl,-esdhc", "fsl,esdhc" - - reg : should contain eSDHC registers location and length. - - interrupts : should contain eSDHC interrupt. - - interrupt-parent : interrupt source phandle. - - clock-frequency : specifies eSDHC base clock frequency. - - sdhci,wp-inverted : (optional) specifies that eSDHC controller - reports inverted write-protect state; - - sdhci,1-bit-only : (optional) specifies that a controller can - only handle 1-bit data transfers. - - sdhci,auto-cmd12: (optional) specifies that a controller can - only handle auto CMD12. - -Example: - -sdhci@2e000 { - compatible = "fsl,mpc8378-esdhc", "fsl,esdhc"; - reg = <0x2e000 0x1000>; - interrupts = <42 0x8>; - interrupt-parent = <&ipic>; - /* Filled in by U-Boot */ - clock-frequency = <0>; -}; diff --git a/Documentation/powerpc/dts-bindings/fsl/gtm.txt b/Documentation/powerpc/dts-bindings/fsl/gtm.txt deleted file mode 100644 index 9a33efded4bc..000000000000 --- a/Documentation/powerpc/dts-bindings/fsl/gtm.txt +++ /dev/null @@ -1,31 +0,0 @@ -* Freescale General-purpose Timers Module - -Required properties: - - compatible : should be - "fsl,-gtm", "fsl,gtm" for SOC GTMs - "fsl,-qe-gtm", "fsl,qe-gtm", "fsl,gtm" for QE GTMs - "fsl,-cpm2-gtm", "fsl,cpm2-gtm", "fsl,gtm" for CPM2 GTMs - - reg : should contain gtm registers location and length (0x40). - - interrupts : should contain four interrupts. - - interrupt-parent : interrupt source phandle. - - clock-frequency : specifies the frequency driving the timer. - -Example: - -timer@500 { - compatible = "fsl,mpc8360-gtm", "fsl,gtm"; - reg = <0x500 0x40>; - interrupts = <90 8 78 8 84 8 72 8>; - interrupt-parent = <&ipic>; - /* filled by u-boot */ - clock-frequency = <0>; -}; - -timer@440 { - compatible = "fsl,mpc8360-qe-gtm", "fsl,qe-gtm", "fsl,gtm"; - reg = <0x440 0x40>; - interrupts = <12 13 14 15>; - interrupt-parent = <&qeic>; - /* filled by u-boot */ - clock-frequency = <0>; -}; diff --git a/Documentation/powerpc/dts-bindings/fsl/guts.txt b/Documentation/powerpc/dts-bindings/fsl/guts.txt deleted file mode 100644 index 9e7a2417dac5..000000000000 --- a/Documentation/powerpc/dts-bindings/fsl/guts.txt +++ /dev/null @@ -1,25 +0,0 @@ -* Global Utilities Block - -The global utilities block controls power management, I/O device -enabling, power-on-reset configuration monitoring, general-purpose -I/O signal configuration, alternate function selection for multiplexed -signals, and clock control. - -Required properties: - - - compatible : Should define the compatible device type for - global-utilities. - - reg : Offset and length of the register set for the device. - -Recommended properties: - - - fsl,has-rstcr : Indicates that the global utilities register set - contains a functioning "reset control register" (i.e. the board - is wired to reset upon setting the HRESET_REQ bit in this register). - -Example: - global-utilities@e0000 { /* global utilities block */ - compatible = "fsl,mpc8548-guts"; - reg = ; - fsl,has-rstcr; - }; diff --git a/Documentation/powerpc/dts-bindings/fsl/i2c.txt b/Documentation/powerpc/dts-bindings/fsl/i2c.txt deleted file mode 100644 index 1eacd6b20ed5..000000000000 --- a/Documentation/powerpc/dts-bindings/fsl/i2c.txt +++ /dev/null @@ -1,64 +0,0 @@ -* I2C - -Required properties : - - - reg : Offset and length of the register set for the device - - compatible : should be "fsl,CHIP-i2c" where CHIP is the name of a - compatible processor, e.g. mpc8313, mpc8543, mpc8544, mpc5121, - mpc5200 or mpc5200b. For the mpc5121, an additional node - "fsl,mpc5121-i2c-ctrl" is required as shown in the example below. - -Recommended properties : - - - interrupts : where a is the interrupt number and b is a - field that represents an encoding of the sense and level - information for the interrupt. This should be encoded based on - the information in section 2) depending on the type of interrupt - controller you have. - - interrupt-parent : the phandle for the interrupt controller that - services interrupts for this device. - - fsl,preserve-clocking : boolean; if defined, the clock settings - from the bootloader are preserved (not touched). - - clock-frequency : desired I2C bus clock frequency in Hz. - - fsl,timeout : I2C bus timeout in microseconds. - -Examples : - - /* MPC5121 based board */ - i2c@1740 { - #address-cells = <1>; - #size-cells = <0>; - compatible = "fsl,mpc5121-i2c", "fsl-i2c"; - reg = <0x1740 0x20>; - interrupts = <11 0x8>; - interrupt-parent = <&ipic>; - clock-frequency = <100000>; - }; - - i2ccontrol@1760 { - compatible = "fsl,mpc5121-i2c-ctrl"; - reg = <0x1760 0x8>; - }; - - /* MPC5200B based board */ - i2c@3d00 { - #address-cells = <1>; - #size-cells = <0>; - compatible = "fsl,mpc5200b-i2c","fsl,mpc5200-i2c","fsl-i2c"; - reg = <0x3d00 0x40>; - interrupts = <2 15 0>; - interrupt-parent = <&mpc5200_pic>; - fsl,preserve-clocking; - }; - - /* MPC8544 base board */ - i2c@3100 { - #address-cells = <1>; - #size-cells = <0>; - compatible = "fsl,mpc8544-i2c", "fsl-i2c"; - reg = <0x3100 0x100>; - interrupts = <43 2>; - interrupt-parent = <&mpic>; - clock-frequency = <400000>; - fsl,timeout = <10000>; - }; diff --git a/Documentation/powerpc/dts-bindings/fsl/lbc.txt b/Documentation/powerpc/dts-bindings/fsl/lbc.txt deleted file mode 100644 index 3300fec501c5..000000000000 --- a/Documentation/powerpc/dts-bindings/fsl/lbc.txt +++ /dev/null @@ -1,35 +0,0 @@ -* Chipselect/Local Bus - -Properties: -- name : Should be localbus -- #address-cells : Should be either two or three. The first cell is the - chipselect number, and the remaining cells are the - offset into the chipselect. -- #size-cells : Either one or two, depending on how large each chipselect - can be. -- ranges : Each range corresponds to a single chipselect, and cover - the entire access window as configured. - -Example: - localbus@f0010100 { - compatible = "fsl,mpc8272-localbus", - "fsl,pq2-localbus"; - #address-cells = <2>; - #size-cells = <1>; - reg = ; - - ranges = <0 0 fe000000 02000000 - 1 0 f4500000 00008000>; - - flash@0,0 { - compatible = "jedec-flash"; - reg = <0 0 2000000>; - bank-width = <4>; - device-width = <1>; - }; - - board-control@1,0 { - reg = <1 0 20>; - compatible = "fsl,mpc8272ads-bcsr"; - }; - }; diff --git a/Documentation/powerpc/dts-bindings/fsl/mcm.txt b/Documentation/powerpc/dts-bindings/fsl/mcm.txt deleted file mode 100644 index 4ceda9b3b413..000000000000 --- a/Documentation/powerpc/dts-bindings/fsl/mcm.txt +++ /dev/null @@ -1,64 +0,0 @@ -===================================================================== -MPX LAW & Coherency Module Device Tree Binding -Copyright (C) 2009 Freescale Semiconductor Inc. -===================================================================== - -Local Access Window (LAW) Node - -The LAW node represents the region of CCSR space where local access -windows are configured. For MCM based devices this is the first 4k -of CCSR space that includes CCSRBAR, ALTCBAR, ALTCAR, BPTR, and some -number of local access windows as specified by fsl,num-laws. - -PROPERTIES - - - compatible - Usage: required - Value type: - Definition: Must include "fsl,mcm-law" - - - reg - Usage: required - Value type: - Definition: A standard property. The value specifies the - physical address offset and length of the CCSR space - registers. - - - fsl,num-laws - Usage: required - Value type: - Definition: The value specifies the number of local access - windows for this device. - -===================================================================== - -MPX Coherency Module Node - -The MPX LAW node represents the region of CCSR space where MCM config -and error reporting registers exist, this is the second 4k (0x1000) -of CCSR space. - -PROPERTIES - - - compatible - Usage: required - Value type: - Definition: Must include "fsl,CHIP-mcm", "fsl,mcm" where - CHIP is the processor (mpc8641, mpc8610, etc.) - - - reg - Usage: required - Value type: - Definition: A standard property. The value specifies the - physical address offset and length of the CCSR space - registers. - - - interrupts - Usage: required - Value type: - - - interrupt-parent - Usage: required - Value type: - -===================================================================== diff --git a/Documentation/powerpc/dts-bindings/fsl/mcu-mpc8349emitx.txt b/Documentation/powerpc/dts-bindings/fsl/mcu-mpc8349emitx.txt deleted file mode 100644 index 0f766333b6eb..000000000000 --- a/Documentation/powerpc/dts-bindings/fsl/mcu-mpc8349emitx.txt +++ /dev/null @@ -1,17 +0,0 @@ -Freescale MPC8349E-mITX-compatible Power Management Micro Controller Unit (MCU) - -Required properties: -- compatible : "fsl,-", "fsl,mcu-mpc8349emitx". -- reg : should specify I2C address (0x0a). -- #gpio-cells : should be 2. -- gpio-controller : should be present. - -Example: - -mcu@0a { - #gpio-cells = <2>; - compatible = "fsl,mc9s08qg8-mpc8349emitx", - "fsl,mcu-mpc8349emitx"; - reg = <0x0a>; - gpio-controller; -}; diff --git a/Documentation/powerpc/dts-bindings/fsl/mpc5121-psc.txt b/Documentation/powerpc/dts-bindings/fsl/mpc5121-psc.txt deleted file mode 100644 index 8832e8798912..000000000000 --- a/Documentation/powerpc/dts-bindings/fsl/mpc5121-psc.txt +++ /dev/null @@ -1,70 +0,0 @@ -MPC5121 PSC Device Tree Bindings - -PSC in UART mode ----------------- - -For PSC in UART mode the needed PSC serial devices -are specified by fsl,mpc5121-psc-uart nodes in the -fsl,mpc5121-immr SoC node. Additionally the PSC FIFO -Controller node fsl,mpc5121-psc-fifo is requered there: - -fsl,mpc5121-psc-uart nodes --------------------------- - -Required properties : - - compatible : Should contain "fsl,mpc5121-psc-uart" and "fsl,mpc5121-psc" - - cell-index : Index of the PSC in hardware - - reg : Offset and length of the register set for the PSC device - - interrupts : where a is the interrupt number of the - PSC FIFO Controller and b is a field that represents an - encoding of the sense and level information for the interrupt. - - interrupt-parent : the phandle for the interrupt controller that - services interrupts for this device. - -Recommended properties : - - fsl,rx-fifo-size : the size of the RX fifo slice (a multiple of 4) - - fsl,tx-fifo-size : the size of the TX fifo slice (a multiple of 4) - - -fsl,mpc5121-psc-fifo node -------------------------- - -Required properties : - - compatible : Should be "fsl,mpc5121-psc-fifo" - - reg : Offset and length of the register set for the PSC - FIFO Controller - - interrupts : where a is the interrupt number of the - PSC FIFO Controller and b is a field that represents an - encoding of the sense and level information for the interrupt. - - interrupt-parent : the phandle for the interrupt controller that - services interrupts for this device. - - -Example for a board using PSC0 and PSC1 devices in serial mode: - -serial@11000 { - compatible = "fsl,mpc5121-psc-uart", "fsl,mpc5121-psc"; - cell-index = <0>; - reg = <0x11000 0x100>; - interrupts = <40 0x8>; - interrupt-parent = < &ipic >; - fsl,rx-fifo-size = <16>; - fsl,tx-fifo-size = <16>; -}; - -serial@11100 { - compatible = "fsl,mpc5121-psc-uart", "fsl,mpc5121-psc"; - cell-index = <1>; - reg = <0x11100 0x100>; - interrupts = <40 0x8>; - interrupt-parent = < &ipic >; - fsl,rx-fifo-size = <16>; - fsl,tx-fifo-size = <16>; -}; - -pscfifo@11f00 { - compatible = "fsl,mpc5121-psc-fifo"; - reg = <0x11f00 0x100>; - interrupts = <40 0x8>; - interrupt-parent = < &ipic >; -}; diff --git a/Documentation/powerpc/dts-bindings/fsl/mpc5200.txt b/Documentation/powerpc/dts-bindings/fsl/mpc5200.txt deleted file mode 100644 index 4ccb2cd5df94..000000000000 --- a/Documentation/powerpc/dts-bindings/fsl/mpc5200.txt +++ /dev/null @@ -1,198 +0,0 @@ -MPC5200 Device Tree Bindings ----------------------------- - -(c) 2006-2009 Secret Lab Technologies Ltd -Grant Likely - -Naming conventions ------------------- -For mpc5200 on-chip devices, the format for each compatible value is --[-]. The OS should be able to match a device driver -to the device based solely on the compatible value. If two drivers -match on the compatible list; the 'most compatible' driver should be -selected. - -The split between the MPC5200 and the MPC5200B leaves a bit of a -conundrum. How should the compatible property be set up to provide -maximum compatibility information; but still accurately describe the -chip? For the MPC5200; the answer is easy. Most of the SoC devices -originally appeared on the MPC5200. Since they didn't exist anywhere -else; the 5200 compatible properties will contain only one item; -"fsl,mpc5200-". - -The 5200B is almost the same as the 5200, but not quite. It fixes -silicon bugs and it adds a small number of enhancements. Most of the -devices either provide exactly the same interface as on the 5200. A few -devices have extra functions but still have a backwards compatible mode. -To express this information as completely as possible, 5200B device trees -should have two items in the compatible list: - compatible = "fsl,mpc5200b-","fsl,mpc5200-"; - -It is *strongly* recommended that 5200B device trees follow this convention -(instead of only listing the base mpc5200 item). - -ie. ethernet on mpc5200: compatible = "fsl,mpc5200-fec"; - ethernet on mpc5200b: compatible = "fsl,mpc5200b-fec", "fsl,mpc5200-fec"; - -Modal devices, like PSCs, also append the configured function to the -end of the compatible field. ie. A PSC in i2s mode would specify -"fsl,mpc5200-psc-i2s", not "fsl,mpc5200-i2s". This convention is chosen to -avoid naming conflicts with non-psc devices providing the same -function. For example, "fsl,mpc5200-spi" and "fsl,mpc5200-psc-spi" describe -the mpc5200 simple spi device and a PSC spi mode respectively. - -At the time of writing, exact chip may be either 'fsl,mpc5200' or -'fsl,mpc5200b'. - -The soc node ------------- -This node describes the on chip SOC peripherals. Every mpc5200 based -board will have this node, and as such there is a common naming -convention for SOC devices. - -Required properties: -name description ----- ----------- -ranges Memory range of the internal memory mapped registers. - Should be <0 [baseaddr] 0xc000> -reg Should be <[baseaddr] 0x100> -compatible mpc5200: "fsl,mpc5200-immr" - mpc5200b: "fsl,mpc5200b-immr" -system-frequency 'fsystem' frequency in Hz; XLB, IPB, USB and PCI - clocks are derived from the fsystem clock. -bus-frequency IPB bus frequency in Hz. Clock rate - used by most of the soc devices. - -soc child nodes ---------------- -Any on chip SOC devices available to Linux must appear as soc5200 child nodes. - -Note: The tables below show the value for the mpc5200. A mpc5200b device -tree should use the "fsl,mpc5200b-","fsl,mpc5200-" form. - -Required soc5200 child nodes: -name compatible Description ----- ---------- ----------- -cdm@ fsl,mpc5200-cdm Clock Distribution -interrupt-controller@ fsl,mpc5200-pic need an interrupt - controller to boot -bestcomm@ fsl,mpc5200-bestcomm Bestcomm DMA controller - -Recommended soc5200 child nodes; populate as needed for your board -name compatible Description ----- ---------- ----------- -timer@ fsl,mpc5200-gpt General purpose timers -gpio@ fsl,mpc5200-gpio MPC5200 simple gpio controller -gpio@ fsl,mpc5200-gpio-wkup MPC5200 wakeup gpio controller -rtc@ fsl,mpc5200-rtc Real time clock -mscan@ fsl,mpc5200-mscan CAN bus controller -pci@ fsl,mpc5200-pci PCI bridge -serial@ fsl,mpc5200-psc-uart PSC in serial mode -i2s@ fsl,mpc5200-psc-i2s PSC in i2s mode -ac97@ fsl,mpc5200-psc-ac97 PSC in ac97 mode -spi@ fsl,mpc5200-psc-spi PSC in spi mode -irda@ fsl,mpc5200-psc-irda PSC in IrDA mode -spi@ fsl,mpc5200-spi MPC5200 spi device -ethernet@ fsl,mpc5200-fec MPC5200 ethernet device -ata@ fsl,mpc5200-ata IDE ATA interface -i2c@ fsl,mpc5200-i2c I2C controller -usb@ fsl,mpc5200-ohci,ohci-be USB controller -xlb@ fsl,mpc5200-xlb XLB arbitrator - -fsl,mpc5200-gpt nodes ---------------------- -On the mpc5200 and 5200b, GPT0 has a watchdog timer function. If the board -design supports the internal wdt, then the device node for GPT0 should -include the empty property 'fsl,has-wdt'. Note that this does not activate -the watchdog. The timer will function as a GPT if the timer api is used, and -it will function as watchdog if the watchdog device is used. The watchdog -mode has priority over the gpt mode, i.e. if the watchdog is activated, any -gpt api call to this timer will fail with -EBUSY. - -If you add the property - fsl,wdt-on-boot = ; -GPT0 will be marked as in-use watchdog, i.e. blocking every gpt access to it. -If n>0, the watchdog is started with a timeout of n seconds. If n=0, the -configuration of the watchdog is not touched. This is useful in two cases: -- just mark GPT0 as watchdog, blocking gpt accesses, and configure it later; -- do not touch a configuration assigned by the boot loader which supervises - the boot process itself. - -The watchdog will respect the CONFIG_WATCHDOG_NOWAYOUT option. - -An mpc5200-gpt can be used as a single line GPIO controller. To do so, -add the following properties to the gpt node: - gpio-controller; - #gpio-cells = <2>; -When referencing the GPIO line from another node, the first cell must always -be zero and the second cell represents the gpio flags and described in the -gpio device tree binding. - -An mpc5200-gpt can be used as a single line edge sensitive interrupt -controller. To do so, add the following properties to the gpt node: - interrupt-controller; - #interrupt-cells = <1>; -When referencing the IRQ line from another node, the cell represents the -sense mode; 1 for edge rising, 2 for edge falling. - -fsl,mpc5200-psc nodes ---------------------- -The PSCs should include a cell-index which is the index of the PSC in -hardware. cell-index is used to determine which shared SoC registers to -use when setting up PSC clocking. cell-index number starts at '0'. ie: - PSC1 has 'cell-index = <0>' - PSC4 has 'cell-index = <3>' - -PSC in i2s mode: The mpc5200 and mpc5200b PSCs are not compatible when in -i2s mode. An 'mpc5200b-psc-i2s' node cannot include 'mpc5200-psc-i2s' in the -compatible field. - - -fsl,mpc5200-gpio and fsl,mpc5200-gpio-wkup nodes ------------------------------------------------- -Each GPIO controller node should have the empty property gpio-controller and -#gpio-cells set to 2. First cell is the GPIO number which is interpreted -according to the bit numbers in the GPIO control registers. The second cell -is for flags which is currently unused. - -fsl,mpc5200-fec nodes ---------------------- -The FEC node can specify one of the following properties to configure -the MII link: -- fsl,7-wire-mode - An empty property that specifies the link uses 7-wire - mode instead of MII -- current-speed - Specifies that the MII should be configured for a fixed - speed. This property should contain two cells. The - first cell specifies the speed in Mbps and the second - should be '0' for half duplex and '1' for full duplex -- phy-handle - Contains a phandle to an Ethernet PHY. - -Interrupt controller (fsl,mpc5200-pic) node -------------------------------------------- -The mpc5200 pic binding splits hardware IRQ numbers into two levels. The -split reflects the layout of the PIC hardware itself, which groups -interrupts into one of three groups; CRIT, MAIN or PERP. Also, the -Bestcomm dma engine has it's own set of interrupt sources which are -cascaded off of peripheral interrupt 0, which the driver interprets as a -fourth group, SDMA. - -The interrupts property for device nodes using the mpc5200 pic consists -of three cells; - - L1 := [CRIT=0, MAIN=1, PERP=2, SDMA=3] - L2 := interrupt number; directly mapped from the value in the - "ICTL PerStat, MainStat, CritStat Encoded Register" - level := [LEVEL_HIGH=0, EDGE_RISING=1, EDGE_FALLING=2, LEVEL_LOW=3] - -For external IRQs, use the following interrupt property values (how to -specify external interrupts is a frequently asked question): -External interrupts: - external irq0: interrupts = <0 0 n>; - external irq1: interrupts = <1 1 n>; - external irq2: interrupts = <1 2 n>; - external irq3: interrupts = <1 3 n>; -'n' is sense (0: level high, 1: edge rising, 2: edge falling 3: level low) - -fsl,mpc5200-mscan nodes ------------------------ -See file can.txt in this directory. diff --git a/Documentation/powerpc/dts-bindings/fsl/mpic.txt b/Documentation/powerpc/dts-bindings/fsl/mpic.txt deleted file mode 100644 index 71e39cf3215b..000000000000 --- a/Documentation/powerpc/dts-bindings/fsl/mpic.txt +++ /dev/null @@ -1,42 +0,0 @@ -* OpenPIC and its interrupt numbers on Freescale's e500/e600 cores - -The OpenPIC specification does not specify which interrupt source has to -become which interrupt number. This is up to the software implementation -of the interrupt controller. The only requirement is that every -interrupt source has to have an unique interrupt number / vector number. -To accomplish this the current implementation assigns the number zero to -the first source, the number one to the second source and so on until -all interrupt sources have their unique number. -Usually the assigned vector number equals the interrupt number mentioned -in the documentation for a given core / CPU. This is however not true -for the e500 cores (MPC85XX CPUs) where the documentation distinguishes -between internal and external interrupt sources and starts counting at -zero for both of them. - -So what to write for external interrupt source X or internal interrupt -source Y into the device tree? Here is an example: - -The memory map for the interrupt controller in the MPC8544[0] shows, -that the first interrupt source starts at 0x5_0000 (PIC Register Address -Map-Interrupt Source Configuration Registers). This source becomes the -number zero therefore: - External interrupt 0 = interrupt number 0 - External interrupt 1 = interrupt number 1 - External interrupt 2 = interrupt number 2 - ... -Every interrupt number allocates 0x20 bytes register space. So to get -its number it is sufficient to shift the lower 16bits to right by five. -So for the external interrupt 10 we have: - 0x0140 >> 5 = 10 - -After the external sources, the internal sources follow. The in core I2C -controller on the MPC8544 for instance has the internal source number -27. Oo obtain its interrupt number we take the lower 16bits of its memory -address (0x5_0560) and shift it right: - 0x0560 >> 5 = 43 - -Therefore the I2C device node for the MPC8544 CPU has to have the -interrupt number 43 specified in the device tree. - -[0] MPC8544E PowerQUICCTM III, Integrated Host Processor Family Reference Manual - MPC8544ERM Rev. 1 10/2007 diff --git a/Documentation/powerpc/dts-bindings/fsl/msi-pic.txt b/Documentation/powerpc/dts-bindings/fsl/msi-pic.txt deleted file mode 100644 index bcc30bac6831..000000000000 --- a/Documentation/powerpc/dts-bindings/fsl/msi-pic.txt +++ /dev/null @@ -1,36 +0,0 @@ -* Freescale MSI interrupt controller - -Required properties: -- compatible : compatible list, contains 2 entries, - first is "fsl,CHIP-msi", where CHIP is the processor(mpc8610, mpc8572, - etc.) and the second is "fsl,mpic-msi" or "fsl,ipic-msi" depending on - the parent type. -- reg : should contain the address and the length of the shared message - interrupt register set. -- msi-available-ranges: use style section to define which - msi interrupt can be used in the 256 msi interrupts. This property is - optional, without this, all the 256 MSI interrupts can be used. -- interrupts : each one of the interrupts here is one entry per 32 MSIs, - and routed to the host interrupt controller. the interrupts should - be set as edge sensitive. -- interrupt-parent: the phandle for the interrupt controller - that services interrupts for this device. for 83xx cpu, the interrupts - are routed to IPIC, and for 85xx/86xx cpu the interrupts are routed - to MPIC. - -Example: - msi@41600 { - compatible = "fsl,mpc8610-msi", "fsl,mpic-msi"; - reg = <0x41600 0x80>; - msi-available-ranges = <0 0x100>; - interrupts = < - 0xe0 0 - 0xe1 0 - 0xe2 0 - 0xe3 0 - 0xe4 0 - 0xe5 0 - 0xe6 0 - 0xe7 0>; - interrupt-parent = <&mpic>; - }; diff --git a/Documentation/powerpc/dts-bindings/fsl/pmc.txt b/Documentation/powerpc/dts-bindings/fsl/pmc.txt deleted file mode 100644 index 07256b7ffcaa..000000000000 --- a/Documentation/powerpc/dts-bindings/fsl/pmc.txt +++ /dev/null @@ -1,63 +0,0 @@ -* Power Management Controller - -Properties: -- compatible: "fsl,-pmc". - - "fsl,mpc8349-pmc" should be listed for any chip whose PMC is - compatible. "fsl,mpc8313-pmc" should also be listed for any chip - whose PMC is compatible, and implies deep-sleep capability. - - "fsl,mpc8548-pmc" should be listed for any chip whose PMC is - compatible. "fsl,mpc8536-pmc" should also be listed for any chip - whose PMC is compatible, and implies deep-sleep capability. - - "fsl,mpc8641d-pmc" should be listed for any chip whose PMC is - compatible; all statements below that apply to "fsl,mpc8548-pmc" also - apply to "fsl,mpc8641d-pmc". - - Compatibility does not include bit assignments in SCCR/PMCDR/DEVDISR; these - bit assignments are indicated via the sleep specifier in each device's - sleep property. - -- reg: For devices compatible with "fsl,mpc8349-pmc", the first resource - is the PMC block, and the second resource is the Clock Configuration - block. - - For devices compatible with "fsl,mpc8548-pmc", the first resource - is a 32-byte block beginning with DEVDISR. - -- interrupts: For "fsl,mpc8349-pmc"-compatible devices, the first - resource is the PMC block interrupt. - -- fsl,mpc8313-wakeup-timer: For "fsl,mpc8313-pmc"-compatible devices, - this is a phandle to an "fsl,gtm" node on which timer 4 can be used as - a wakeup source from deep sleep. - -Sleep specifiers: - - fsl,mpc8349-pmc: Sleep specifiers consist of one cell. For each bit - that is set in the cell, the corresponding bit in SCCR will be saved - and cleared on suspend, and restored on resume. This sleep controller - supports disabling and resuming devices at any time. - - fsl,mpc8536-pmc: Sleep specifiers consist of three cells, the third of - which will be ORed into PMCDR upon suspend, and cleared from PMCDR - upon resume. The first two cells are as described for fsl,mpc8578-pmc. - This sleep controller only supports disabling devices during system - sleep, or permanently. - - fsl,mpc8548-pmc: Sleep specifiers consist of one or two cells, the - first of which will be ORed into DEVDISR (and the second into - DEVDISR2, if present -- this cell should be zero or absent if the - hardware does not have DEVDISR2) upon a request for permanent device - disabling. This sleep controller does not support configuring devices - to disable during system sleep (unless supported by another compatible - match), or dynamically. - -Example: - - power@b00 { - compatible = "fsl,mpc8313-pmc", "fsl,mpc8349-pmc"; - reg = <0xb00 0x100 0xa00 0x100>; - interrupts = <80 8>; - }; diff --git a/Documentation/powerpc/dts-bindings/fsl/sata.txt b/Documentation/powerpc/dts-bindings/fsl/sata.txt deleted file mode 100644 index b46bcf46c3d8..000000000000 --- a/Documentation/powerpc/dts-bindings/fsl/sata.txt +++ /dev/null @@ -1,29 +0,0 @@ -* Freescale 8xxx/3.0 Gb/s SATA nodes - -SATA nodes are defined to describe on-chip Serial ATA controllers. -Each SATA port should have its own node. - -Required properties: -- compatible : compatible list, contains 2 entries, first is - "fsl,CHIP-sata", where CHIP is the processor - (mpc8315, mpc8379, etc.) and the second is - "fsl,pq-sata" -- interrupts : -- cell-index : controller index. - 1 for controller @ 0x18000 - 2 for controller @ 0x19000 - 3 for controller @ 0x1a000 - 4 for controller @ 0x1b000 - -Optional properties: -- interrupt-parent : optional, if needed for interrupt mapping -- reg : - -Example: - sata@18000 { - compatible = "fsl,mpc8379-sata", "fsl,pq-sata"; - reg = <0x18000 0x1000>; - cell-index = <1>; - interrupts = <2c 8>; - interrupt-parent = < &ipic >; - }; diff --git a/Documentation/powerpc/dts-bindings/fsl/sec.txt b/Documentation/powerpc/dts-bindings/fsl/sec.txt deleted file mode 100644 index 2b6f2d45c45a..000000000000 --- a/Documentation/powerpc/dts-bindings/fsl/sec.txt +++ /dev/null @@ -1,68 +0,0 @@ -Freescale SoC SEC Security Engines - -Required properties: - -- compatible : Should contain entries for this and backward compatible - SEC versions, high to low, e.g., "fsl,sec2.1", "fsl,sec2.0" -- reg : Offset and length of the register set for the device -- interrupts : the SEC's interrupt number -- fsl,num-channels : An integer representing the number of channels - available. -- fsl,channel-fifo-len : An integer representing the number of - descriptor pointers each channel fetch fifo can hold. -- fsl,exec-units-mask : The bitmask representing what execution units - (EUs) are available. It's a single 32-bit cell. EU information - should be encoded following the SEC's Descriptor Header Dword - EU_SEL0 field documentation, i.e. as follows: - - bit 0 = reserved - should be 0 - bit 1 = set if SEC has the ARC4 EU (AFEU) - bit 2 = set if SEC has the DES/3DES EU (DEU) - bit 3 = set if SEC has the message digest EU (MDEU/MDEU-A) - bit 4 = set if SEC has the random number generator EU (RNG) - bit 5 = set if SEC has the public key EU (PKEU) - bit 6 = set if SEC has the AES EU (AESU) - bit 7 = set if SEC has the Kasumi EU (KEU) - bit 8 = set if SEC has the CRC EU (CRCU) - bit 11 = set if SEC has the message digest EU extended alg set (MDEU-B) - -remaining bits are reserved for future SEC EUs. - -- fsl,descriptor-types-mask : The bitmask representing what descriptors - are available. It's a single 32-bit cell. Descriptor type information - should be encoded following the SEC's Descriptor Header Dword DESC_TYPE - field documentation, i.e. as follows: - - bit 0 = set if SEC supports the aesu_ctr_nonsnoop desc. type - bit 1 = set if SEC supports the ipsec_esp descriptor type - bit 2 = set if SEC supports the common_nonsnoop desc. type - bit 3 = set if SEC supports the 802.11i AES ccmp desc. type - bit 4 = set if SEC supports the hmac_snoop_no_afeu desc. type - bit 5 = set if SEC supports the srtp descriptor type - bit 6 = set if SEC supports the non_hmac_snoop_no_afeu desc.type - bit 7 = set if SEC supports the pkeu_assemble descriptor type - bit 8 = set if SEC supports the aesu_key_expand_output desc.type - bit 9 = set if SEC supports the pkeu_ptmul descriptor type - bit 10 = set if SEC supports the common_nonsnoop_afeu desc. type - bit 11 = set if SEC supports the pkeu_ptadd_dbl descriptor type - - ..and so on and so forth. - -Optional properties: - -- interrupt-parent : the phandle for the interrupt controller that - services interrupts for this device. - -Example: - - /* MPC8548E */ - crypto@30000 { - compatible = "fsl,sec2.1", "fsl,sec2.0"; - reg = <0x30000 0x10000>; - interrupts = <29 2>; - interrupt-parent = <&mpic>; - fsl,num-channels = <4>; - fsl,channel-fifo-len = <24>; - fsl,exec-units-mask = <0xfe>; - fsl,descriptor-types-mask = <0x12b0ebf>; - }; diff --git a/Documentation/powerpc/dts-bindings/fsl/spi.txt b/Documentation/powerpc/dts-bindings/fsl/spi.txt deleted file mode 100644 index 777abd7399d5..000000000000 --- a/Documentation/powerpc/dts-bindings/fsl/spi.txt +++ /dev/null @@ -1,53 +0,0 @@ -* SPI (Serial Peripheral Interface) - -Required properties: -- cell-index : QE SPI subblock index. - 0: QE subblock SPI1 - 1: QE subblock SPI2 -- compatible : should be "fsl,spi". -- mode : the SPI operation mode, it can be "cpu" or "cpu-qe". -- reg : Offset and length of the register set for the device -- interrupts : where a is the interrupt number and b is a - field that represents an encoding of the sense and level - information for the interrupt. This should be encoded based on - the information in section 2) depending on the type of interrupt - controller you have. -- interrupt-parent : the phandle for the interrupt controller that - services interrupts for this device. - -Optional properties: -- gpios : specifies the gpio pins to be used for chipselects. - The gpios will be referred to as reg = in the SPI child nodes. - If unspecified, a single SPI device without a chip select can be used. - -Example: - spi@4c0 { - cell-index = <0>; - compatible = "fsl,spi"; - reg = <4c0 40>; - interrupts = <82 0>; - interrupt-parent = <700>; - mode = "cpu"; - gpios = <&gpio 18 1 // device reg=<0> - &gpio 19 1>; // device reg=<1> - }; - - -* eSPI (Enhanced Serial Peripheral Interface) - -Required properties: -- compatible : should be "fsl,mpc8536-espi". -- reg : Offset and length of the register set for the device. -- interrupts : should contain eSPI interrupt, the device has one interrupt. -- fsl,espi-num-chipselects : the number of the chipselect signals. - -Example: - spi@110000 { - #address-cells = <1>; - #size-cells = <0>; - compatible = "fsl,mpc8536-espi"; - reg = <0x110000 0x1000>; - interrupts = <53 0x2>; - interrupt-parent = <&mpic>; - fsl,espi-num-chipselects = <4>; - }; diff --git a/Documentation/powerpc/dts-bindings/fsl/ssi.txt b/Documentation/powerpc/dts-bindings/fsl/ssi.txt deleted file mode 100644 index 5ff76c9c57d2..000000000000 --- a/Documentation/powerpc/dts-bindings/fsl/ssi.txt +++ /dev/null @@ -1,73 +0,0 @@ -Freescale Synchronous Serial Interface - -The SSI is a serial device that communicates with audio codecs. It can -be programmed in AC97, I2S, left-justified, or right-justified modes. - -Required properties: -- compatible: Compatible list, contains "fsl,ssi". -- cell-index: The SSI, <0> = SSI1, <1> = SSI2, and so on. -- reg: Offset and length of the register set for the device. -- interrupts: where a is the interrupt number and b is a - field that represents an encoding of the sense and - level information for the interrupt. This should be - encoded based on the information in section 2) - depending on the type of interrupt controller you - have. -- interrupt-parent: The phandle for the interrupt controller that - services interrupts for this device. -- fsl,mode: The operating mode for the SSI interface. - "i2s-slave" - I2S mode, SSI is clock slave - "i2s-master" - I2S mode, SSI is clock master - "lj-slave" - left-justified mode, SSI is clock slave - "lj-master" - l.j. mode, SSI is clock master - "rj-slave" - right-justified mode, SSI is clock slave - "rj-master" - r.j., SSI is clock master - "ac97-slave" - AC97 mode, SSI is clock slave - "ac97-master" - AC97 mode, SSI is clock master -- fsl,playback-dma: Phandle to a node for the DMA channel to use for - playback of audio. This is typically dictated by SOC - design. See the notes below. -- fsl,capture-dma: Phandle to a node for the DMA channel to use for - capture (recording) of audio. This is typically dictated - by SOC design. See the notes below. -- fsl,fifo-depth: The number of elements in the transmit and receive FIFOs. - This number is the maximum allowed value for SFCSR[TFWM0]. -- fsl,ssi-asynchronous: - If specified, the SSI is to be programmed in asynchronous - mode. In this mode, pins SRCK, STCK, SRFS, and STFS must - all be connected to valid signals. In synchronous mode, - SRCK and SRFS are ignored. Asynchronous mode allows - playback and capture to use different sample sizes and - sample rates. Some drivers may require that SRCK and STCK - be connected together, and SRFS and STFS be connected - together. This would still allow different sample sizes, - but not different sample rates. - -Optional properties: -- codec-handle: Phandle to a 'codec' node that defines an audio - codec connected to this SSI. This node is typically - a child of an I2C or other control node. - -Child 'codec' node required properties: -- compatible: Compatible list, contains the name of the codec - -Child 'codec' node optional properties: -- clock-frequency: The frequency of the input clock, which typically comes - from an on-board dedicated oscillator. - -Notes on fsl,playback-dma and fsl,capture-dma: - -On SOCs that have an SSI, specific DMA channels are hard-wired for playback -and capture. On the MPC8610, for example, SSI1 must use DMA channel 0 for -playback and DMA channel 1 for capture. SSI2 must use DMA channel 2 for -playback and DMA channel 3 for capture. The developer can choose which -DMA controller to use, but the channels themselves are hard-wired. The -purpose of these two properties is to represent this hardware design. - -The device tree nodes for the DMA channels that are referenced by -"fsl,playback-dma" and "fsl,capture-dma" must be marked as compatible with -"fsl,ssi-dma-channel". The SOC-specific compatible string (e.g. -"fsl,mpc8610-dma-channel") can remain. If these nodes are left as -"fsl,elo-dma-channel" or "fsl,eloplus-dma-channel", then the generic Elo DMA -drivers (fsldma) will attempt to use them, and it will conflict with the -sound drivers. diff --git a/Documentation/powerpc/dts-bindings/fsl/tsec.txt b/Documentation/powerpc/dts-bindings/fsl/tsec.txt deleted file mode 100644 index edb7ae19e868..000000000000 --- a/Documentation/powerpc/dts-bindings/fsl/tsec.txt +++ /dev/null @@ -1,76 +0,0 @@ -* MDIO IO device - -The MDIO is a bus to which the PHY devices are connected. For each -device that exists on this bus, a child node should be created. See -the definition of the PHY node in booting-without-of.txt for an example -of how to define a PHY. - -Required properties: - - reg : Offset and length of the register set for the device - - compatible : Should define the compatible device type for the - mdio. Currently, this is most likely to be "fsl,gianfar-mdio" - -Example: - - mdio@24520 { - reg = <24520 20>; - compatible = "fsl,gianfar-mdio"; - - ethernet-phy@0 { - ...... - }; - }; - -* TBI Internal MDIO bus - -As of this writing, every tsec is associated with an internal TBI PHY. -This PHY is accessed through the local MDIO bus. These buses are defined -similarly to the mdio buses, except they are compatible with "fsl,gianfar-tbi". -The TBI PHYs underneath them are similar to normal PHYs, but the reg property -is considered instructive, rather than descriptive. The reg property should -be chosen so it doesn't interfere with other PHYs on the bus. - -* Gianfar-compatible ethernet nodes - -Properties: - - - device_type : Should be "network" - - model : Model of the device. Can be "TSEC", "eTSEC", or "FEC" - - compatible : Should be "gianfar" - - reg : Offset and length of the register set for the device - - local-mac-address : List of bytes representing the ethernet address of - this controller - - interrupts : For FEC devices, the first interrupt is the device's - interrupt. For TSEC and eTSEC devices, the first interrupt is - transmit, the second is receive, and the third is error. - - phy-handle : The phandle for the PHY connected to this ethernet - controller. - - fixed-link : where a is emulated phy id - choose any, - but unique to the all specified fixed-links, b is duplex - 0 half, - 1 full, c is link speed - d#10/d#100/d#1000, d is pause - 0 no - pause, 1 pause, e is asym_pause - 0 no asym_pause, 1 asym_pause. - - phy-connection-type : a string naming the controller/PHY interface type, - i.e., "mii" (default), "rmii", "gmii", "rgmii", "rgmii-id", "sgmii", - "tbi", or "rtbi". This property is only really needed if the connection - is of type "rgmii-id", as all other connection types are detected by - hardware. - - fsl,magic-packet : If present, indicates that the hardware supports - waking up via magic packet. - - bd-stash : If present, indicates that the hardware supports stashing - buffer descriptors in the L2. - - rx-stash-len : Denotes the number of bytes of a received buffer to stash - in the L2. - - rx-stash-idx : Denotes the index of the first byte from the received - buffer to stash in the L2. - -Example: - ethernet@24000 { - device_type = "network"; - model = "TSEC"; - compatible = "gianfar"; - reg = <0x24000 0x1000>; - local-mac-address = [ 00 E0 0C 00 73 00 ]; - interrupts = <29 2 30 2 34 2>; - interrupt-parent = <&mpic>; - phy-handle = <&phy0> - }; diff --git a/Documentation/powerpc/dts-bindings/fsl/upm-nand.txt b/Documentation/powerpc/dts-bindings/fsl/upm-nand.txt deleted file mode 100644 index a48b2cadc7f0..000000000000 --- a/Documentation/powerpc/dts-bindings/fsl/upm-nand.txt +++ /dev/null @@ -1,63 +0,0 @@ -Freescale Localbus UPM programmed to work with NAND flash - -Required properties: -- compatible : "fsl,upm-nand". -- reg : should specify localbus chip select and size used for the chip. -- fsl,upm-addr-offset : UPM pattern offset for the address latch. -- fsl,upm-cmd-offset : UPM pattern offset for the command latch. - -Optional properties: -- fsl,upm-wait-flags : add chip-dependent short delays after running the - UPM pattern (0x1), after writing a data byte (0x2) or after - writing out a buffer (0x4). -- fsl,upm-addr-line-cs-offsets : address offsets for multi-chip support. - The corresponding address lines are used to select the chip. -- gpios : may specify optional GPIOs connected to the Ready-Not-Busy pins - (R/B#). For multi-chip devices, "n" GPIO definitions are required - according to the number of chips. -- chip-delay : chip dependent delay for transfering data from array to - read registers (tR). Required if property "gpios" is not used - (R/B# pins not connected). - -Examples: - -upm@1,0 { - compatible = "fsl,upm-nand"; - reg = <1 0 1>; - fsl,upm-addr-offset = <16>; - fsl,upm-cmd-offset = <8>; - gpios = <&qe_pio_e 18 0>; - - flash { - #address-cells = <1>; - #size-cells = <1>; - compatible = "..."; - - partition@0 { - ... - }; - }; -}; - -upm@3,0 { - #address-cells = <0>; - #size-cells = <0>; - compatible = "tqc,tqm8548-upm-nand", "fsl,upm-nand"; - reg = <3 0x0 0x800>; - fsl,upm-addr-offset = <0x10>; - fsl,upm-cmd-offset = <0x08>; - /* Multi-chip NAND device */ - fsl,upm-addr-line-cs-offsets = <0x0 0x200>; - fsl,upm-wait-flags = <0x5>; - chip-delay = <25>; // in micro-seconds - - nand@0 { - #address-cells = <1>; - #size-cells = <1>; - - partition@0 { - label = "fs"; - reg = <0x00000000 0x10000000>; - }; - }; -}; diff --git a/Documentation/powerpc/dts-bindings/fsl/usb.txt b/Documentation/powerpc/dts-bindings/fsl/usb.txt deleted file mode 100644 index bd5723f0b67e..000000000000 --- a/Documentation/powerpc/dts-bindings/fsl/usb.txt +++ /dev/null @@ -1,81 +0,0 @@ -Freescale SOC USB controllers - -The device node for a USB controller that is part of a Freescale -SOC is as described in the document "Open Firmware Recommended -Practice : Universal Serial Bus" with the following modifications -and additions : - -Required properties : - - compatible : Should be "fsl-usb2-mph" for multi port host USB - controllers, or "fsl-usb2-dr" for dual role USB controllers - or "fsl,mpc5121-usb2-dr" for dual role USB controllers of MPC5121 - - phy_type : For multi port host USB controllers, should be one of - "ulpi", or "serial". For dual role USB controllers, should be - one of "ulpi", "utmi", "utmi_wide", or "serial". - - reg : Offset and length of the register set for the device - - port0 : boolean; if defined, indicates port0 is connected for - fsl-usb2-mph compatible controllers. Either this property or - "port1" (or both) must be defined for "fsl-usb2-mph" compatible - controllers. - - port1 : boolean; if defined, indicates port1 is connected for - fsl-usb2-mph compatible controllers. Either this property or - "port0" (or both) must be defined for "fsl-usb2-mph" compatible - controllers. - - dr_mode : indicates the working mode for "fsl-usb2-dr" compatible - controllers. Can be "host", "peripheral", or "otg". Default to - "host" if not defined for backward compatibility. - -Recommended properties : - - interrupts : where a is the interrupt number and b is a - field that represents an encoding of the sense and level - information for the interrupt. This should be encoded based on - the information in section 2) depending on the type of interrupt - controller you have. - - interrupt-parent : the phandle for the interrupt controller that - services interrupts for this device. - -Optional properties : - - fsl,invert-drvvbus : boolean; for MPC5121 USB0 only. Indicates the - port power polarity of internal PHY signal DRVVBUS is inverted. - - fsl,invert-pwr-fault : boolean; for MPC5121 USB0 only. Indicates - the PWR_FAULT signal polarity is inverted. - -Example multi port host USB controller device node : - usb@22000 { - compatible = "fsl-usb2-mph"; - reg = <22000 1000>; - #address-cells = <1>; - #size-cells = <0>; - interrupt-parent = <700>; - interrupts = <27 1>; - phy_type = "ulpi"; - port0; - port1; - }; - -Example dual role USB controller device node : - usb@23000 { - compatible = "fsl-usb2-dr"; - reg = <23000 1000>; - #address-cells = <1>; - #size-cells = <0>; - interrupt-parent = <700>; - interrupts = <26 1>; - dr_mode = "otg"; - phy = "ulpi"; - }; - -Example dual role USB controller device node for MPC5121ADS: - - usb@4000 { - compatible = "fsl,mpc5121-usb2-dr"; - reg = <0x4000 0x1000>; - #address-cells = <1>; - #size-cells = <0>; - interrupt-parent = < &ipic >; - interrupts = <44 0x8>; - dr_mode = "otg"; - phy_type = "utmi_wide"; - fsl,invert-drvvbus; - fsl,invert-pwr-fault; - }; diff --git a/Documentation/powerpc/dts-bindings/gpio/gpio.txt b/Documentation/powerpc/dts-bindings/gpio/gpio.txt deleted file mode 100644 index edaa84d288a1..000000000000 --- a/Documentation/powerpc/dts-bindings/gpio/gpio.txt +++ /dev/null @@ -1,50 +0,0 @@ -Specifying GPIO information for devices -============================================ - -1) gpios property ------------------ - -Nodes that makes use of GPIOs should define them using `gpios' property, -format of which is: <&gpio-controller1-phandle gpio1-specifier - &gpio-controller2-phandle gpio2-specifier - 0 /* holes are permitted, means no GPIO 3 */ - &gpio-controller4-phandle gpio4-specifier - ...>; - -Note that gpio-specifier length is controller dependent. - -gpio-specifier may encode: bank, pin position inside the bank, -whether pin is open-drain and whether pin is logically inverted. - -Example of the node using GPIOs: - - node { - gpios = <&qe_pio_e 18 0>; - }; - -In this example gpio-specifier is "18 0" and encodes GPIO pin number, -and empty GPIO flags as accepted by the "qe_pio_e" gpio-controller. - -2) gpio-controller nodes ------------------------- - -Every GPIO controller node must have #gpio-cells property defined, -this information will be used to translate gpio-specifiers. - -Example of two SOC GPIO banks defined as gpio-controller nodes: - - qe_pio_a: gpio-controller@1400 { - #gpio-cells = <2>; - compatible = "fsl,qe-pario-bank-a", "fsl,qe-pario-bank"; - reg = <0x1400 0x18>; - gpio-controller; - }; - - qe_pio_e: gpio-controller@1460 { - #gpio-cells = <2>; - compatible = "fsl,qe-pario-bank-e", "fsl,qe-pario-bank"; - reg = <0x1460 0x18>; - gpio-controller; - }; - - diff --git a/Documentation/powerpc/dts-bindings/gpio/led.txt b/Documentation/powerpc/dts-bindings/gpio/led.txt deleted file mode 100644 index 064db928c3c1..000000000000 --- a/Documentation/powerpc/dts-bindings/gpio/led.txt +++ /dev/null @@ -1,58 +0,0 @@ -LEDs connected to GPIO lines - -Required properties: -- compatible : should be "gpio-leds". - -Each LED is represented as a sub-node of the gpio-leds device. Each -node's name represents the name of the corresponding LED. - -LED sub-node properties: -- gpios : Should specify the LED's GPIO, see "Specifying GPIO information - for devices" in Documentation/powerpc/booting-without-of.txt. Active - low LEDs should be indicated using flags in the GPIO specifier. -- label : (optional) The label for this LED. If omitted, the label is - taken from the node name (excluding the unit address). -- linux,default-trigger : (optional) This parameter, if present, is a - string defining the trigger assigned to the LED. Current triggers are: - "backlight" - LED will act as a back-light, controlled by the framebuffer - system - "default-on" - LED will turn on, but see "default-state" below - "heartbeat" - LED "double" flashes at a load average based rate - "ide-disk" - LED indicates disk activity - "timer" - LED flashes at a fixed, configurable rate -- default-state: (optional) The initial state of the LED. Valid - values are "on", "off", and "keep". If the LED is already on or off - and the default-state property is set the to same value, then no - glitch should be produced where the LED momentarily turns off (or - on). The "keep" setting will keep the LED at whatever its current - state is, without producing a glitch. The default is off if this - property is not present. - -Examples: - -leds { - compatible = "gpio-leds"; - hdd { - label = "IDE Activity"; - gpios = <&mcu_pio 0 1>; /* Active low */ - linux,default-trigger = "ide-disk"; - }; - - fault { - gpios = <&mcu_pio 1 0>; - /* Keep LED on if BIOS detected hardware fault */ - default-state = "keep"; - }; -}; - -run-control { - compatible = "gpio-leds"; - red { - gpios = <&mpc8572 6 0>; - default-state = "off"; - }; - green { - gpios = <&mpc8572 7 0>; - default-state = "on"; - }; -} diff --git a/Documentation/powerpc/dts-bindings/gpio/mdio.txt b/Documentation/powerpc/dts-bindings/gpio/mdio.txt deleted file mode 100644 index bc9549529014..000000000000 --- a/Documentation/powerpc/dts-bindings/gpio/mdio.txt +++ /dev/null @@ -1,19 +0,0 @@ -MDIO on GPIOs - -Currently defined compatibles: -- virtual,gpio-mdio - -MDC and MDIO lines connected to GPIO controllers are listed in the -gpios property as described in section VIII.1 in the following order: - -MDC, MDIO. - -Example: - -mdio { - compatible = "virtual,mdio-gpio"; - #address-cells = <1>; - #size-cells = <0>; - gpios = <&qe_pio_a 11 - &qe_pio_c 6>; -}; diff --git a/Documentation/powerpc/dts-bindings/marvell.txt b/Documentation/powerpc/dts-bindings/marvell.txt deleted file mode 100644 index f1533d91953a..000000000000 --- a/Documentation/powerpc/dts-bindings/marvell.txt +++ /dev/null @@ -1,521 +0,0 @@ -Marvell Discovery mv64[345]6x System Controller chips -=========================================================== - -The Marvell mv64[345]60 series of system controller chips contain -many of the peripherals needed to implement a complete computer -system. In this section, we define device tree nodes to describe -the system controller chip itself and each of the peripherals -which it contains. Compatible string values for each node are -prefixed with the string "marvell,", for Marvell Technology Group Ltd. - -1) The /system-controller node - - This node is used to represent the system-controller and must be - present when the system uses a system controller chip. The top-level - system-controller node contains information that is global to all - devices within the system controller chip. The node name begins - with "system-controller" followed by the unit address, which is - the base address of the memory-mapped register set for the system - controller chip. - - Required properties: - - - ranges : Describes the translation of system controller addresses - for memory mapped registers. - - clock-frequency: Contains the main clock frequency for the system - controller chip. - - reg : This property defines the address and size of the - memory-mapped registers contained within the system controller - chip. The address specified in the "reg" property should match - the unit address of the system-controller node. - - #address-cells : Address representation for system controller - devices. This field represents the number of cells needed to - represent the address of the memory-mapped registers of devices - within the system controller chip. - - #size-cells : Size representation for the memory-mapped - registers within the system controller chip. - - #interrupt-cells : Defines the width of cells used to represent - interrupts. - - Optional properties: - - - model : The specific model of the system controller chip. Such - as, "mv64360", "mv64460", or "mv64560". - - compatible : A string identifying the compatibility identifiers - of the system controller chip. - - The system-controller node contains child nodes for each system - controller device that the platform uses. Nodes should not be created - for devices which exist on the system controller chip but are not used - - Example Marvell Discovery mv64360 system-controller node: - - system-controller@f1000000 { /* Marvell Discovery mv64360 */ - #address-cells = <1>; - #size-cells = <1>; - model = "mv64360"; /* Default */ - compatible = "marvell,mv64360"; - clock-frequency = <133333333>; - reg = <0xf1000000 0x10000>; - virtual-reg = <0xf1000000>; - ranges = <0x88000000 0x88000000 0x1000000 /* PCI 0 I/O Space */ - 0x80000000 0x80000000 0x8000000 /* PCI 0 MEM Space */ - 0xa0000000 0xa0000000 0x4000000 /* User FLASH */ - 0x00000000 0xf1000000 0x0010000 /* Bridge's regs */ - 0xf2000000 0xf2000000 0x0040000>;/* Integrated SRAM */ - - [ child node definitions... ] - } - -2) Child nodes of /system-controller - - a) Marvell Discovery MDIO bus - - The MDIO is a bus to which the PHY devices are connected. For each - device that exists on this bus, a child node should be created. See - the definition of the PHY node below for an example of how to define - a PHY. - - Required properties: - - #address-cells : Should be <1> - - #size-cells : Should be <0> - - device_type : Should be "mdio" - - compatible : Should be "marvell,mv64360-mdio" - - Example: - - mdio { - #address-cells = <1>; - #size-cells = <0>; - device_type = "mdio"; - compatible = "marvell,mv64360-mdio"; - - ethernet-phy@0 { - ...... - }; - }; - - - b) Marvell Discovery ethernet controller - - The Discover ethernet controller is described with two levels - of nodes. The first level describes an ethernet silicon block - and the second level describes up to 3 ethernet nodes within - that block. The reason for the multiple levels is that the - registers for the node are interleaved within a single set - of registers. The "ethernet-block" level describes the - shared register set, and the "ethernet" nodes describe ethernet - port-specific properties. - - Ethernet block node - - Required properties: - - #address-cells : <1> - - #size-cells : <0> - - compatible : "marvell,mv64360-eth-block" - - reg : Offset and length of the register set for this block - - Example Discovery Ethernet block node: - ethernet-block@2000 { - #address-cells = <1>; - #size-cells = <0>; - compatible = "marvell,mv64360-eth-block"; - reg = <0x2000 0x2000>; - ethernet@0 { - ....... - }; - }; - - Ethernet port node - - Required properties: - - device_type : Should be "network". - - compatible : Should be "marvell,mv64360-eth". - - reg : Should be <0>, <1>, or <2>, according to which registers - within the silicon block the device uses. - - interrupts : where a is the interrupt number for the port. - - interrupt-parent : the phandle for the interrupt controller - that services interrupts for this device. - - phy : the phandle for the PHY connected to this ethernet - controller. - - local-mac-address : 6 bytes, MAC address - - Example Discovery Ethernet port node: - ethernet@0 { - device_type = "network"; - compatible = "marvell,mv64360-eth"; - reg = <0>; - interrupts = <32>; - interrupt-parent = <&PIC>; - phy = <&PHY0>; - local-mac-address = [ 00 00 00 00 00 00 ]; - }; - - - - c) Marvell Discovery PHY nodes - - Required properties: - - device_type : Should be "ethernet-phy" - - interrupts : where a is the interrupt number for this phy. - - interrupt-parent : the phandle for the interrupt controller that - services interrupts for this device. - - reg : The ID number for the phy, usually a small integer - - Example Discovery PHY node: - ethernet-phy@1 { - device_type = "ethernet-phy"; - compatible = "broadcom,bcm5421"; - interrupts = <76>; /* GPP 12 */ - interrupt-parent = <&PIC>; - reg = <1>; - }; - - - d) Marvell Discovery SDMA nodes - - Represent DMA hardware associated with the MPSC (multiprotocol - serial controllers). - - Required properties: - - compatible : "marvell,mv64360-sdma" - - reg : Offset and length of the register set for this device - - interrupts : where a is the interrupt number for the DMA - device. - - interrupt-parent : the phandle for the interrupt controller - that services interrupts for this device. - - Example Discovery SDMA node: - sdma@4000 { - compatible = "marvell,mv64360-sdma"; - reg = <0x4000 0xc18>; - virtual-reg = <0xf1004000>; - interrupts = <36>; - interrupt-parent = <&PIC>; - }; - - - e) Marvell Discovery BRG nodes - - Represent baud rate generator hardware associated with the MPSC - (multiprotocol serial controllers). - - Required properties: - - compatible : "marvell,mv64360-brg" - - reg : Offset and length of the register set for this device - - clock-src : A value from 0 to 15 which selects the clock - source for the baud rate generator. This value corresponds - to the CLKS value in the BRGx configuration register. See - the mv64x60 User's Manual. - - clock-frequence : The frequency (in Hz) of the baud rate - generator's input clock. - - current-speed : The current speed setting (presumably by - firmware) of the baud rate generator. - - Example Discovery BRG node: - brg@b200 { - compatible = "marvell,mv64360-brg"; - reg = <0xb200 0x8>; - clock-src = <8>; - clock-frequency = <133333333>; - current-speed = <9600>; - }; - - - f) Marvell Discovery CUNIT nodes - - Represent the Serial Communications Unit device hardware. - - Required properties: - - reg : Offset and length of the register set for this device - - Example Discovery CUNIT node: - cunit@f200 { - reg = <0xf200 0x200>; - }; - - - g) Marvell Discovery MPSCROUTING nodes - - Represent the Discovery's MPSC routing hardware - - Required properties: - - reg : Offset and length of the register set for this device - - Example Discovery CUNIT node: - mpscrouting@b500 { - reg = <0xb400 0xc>; - }; - - - h) Marvell Discovery MPSCINTR nodes - - Represent the Discovery's MPSC DMA interrupt hardware registers - (SDMA cause and mask registers). - - Required properties: - - reg : Offset and length of the register set for this device - - Example Discovery MPSCINTR node: - mpsintr@b800 { - reg = <0xb800 0x100>; - }; - - - i) Marvell Discovery MPSC nodes - - Represent the Discovery's MPSC (Multiprotocol Serial Controller) - serial port. - - Required properties: - - device_type : "serial" - - compatible : "marvell,mv64360-mpsc" - - reg : Offset and length of the register set for this device - - sdma : the phandle for the SDMA node used by this port - - brg : the phandle for the BRG node used by this port - - cunit : the phandle for the CUNIT node used by this port - - mpscrouting : the phandle for the MPSCROUTING node used by this port - - mpscintr : the phandle for the MPSCINTR node used by this port - - cell-index : the hardware index of this cell in the MPSC core - - max_idle : value needed for MPSC CHR3 (Maximum Frame Length) - register - - interrupts : where a is the interrupt number for the MPSC. - - interrupt-parent : the phandle for the interrupt controller - that services interrupts for this device. - - Example Discovery MPSCINTR node: - mpsc@8000 { - device_type = "serial"; - compatible = "marvell,mv64360-mpsc"; - reg = <0x8000 0x38>; - virtual-reg = <0xf1008000>; - sdma = <&SDMA0>; - brg = <&BRG0>; - cunit = <&CUNIT>; - mpscrouting = <&MPSCROUTING>; - mpscintr = <&MPSCINTR>; - cell-index = <0>; - max_idle = <40>; - interrupts = <40>; - interrupt-parent = <&PIC>; - }; - - - j) Marvell Discovery Watch Dog Timer nodes - - Represent the Discovery's watchdog timer hardware - - Required properties: - - compatible : "marvell,mv64360-wdt" - - reg : Offset and length of the register set for this device - - Example Discovery Watch Dog Timer node: - wdt@b410 { - compatible = "marvell,mv64360-wdt"; - reg = <0xb410 0x8>; - }; - - - k) Marvell Discovery I2C nodes - - Represent the Discovery's I2C hardware - - Required properties: - - device_type : "i2c" - - compatible : "marvell,mv64360-i2c" - - reg : Offset and length of the register set for this device - - interrupts : where a is the interrupt number for the I2C. - - interrupt-parent : the phandle for the interrupt controller - that services interrupts for this device. - - Example Discovery I2C node: - compatible = "marvell,mv64360-i2c"; - reg = <0xc000 0x20>; - virtual-reg = <0xf100c000>; - interrupts = <37>; - interrupt-parent = <&PIC>; - }; - - - l) Marvell Discovery PIC (Programmable Interrupt Controller) nodes - - Represent the Discovery's PIC hardware - - Required properties: - - #interrupt-cells : <1> - - #address-cells : <0> - - compatible : "marvell,mv64360-pic" - - reg : Offset and length of the register set for this device - - interrupt-controller - - Example Discovery PIC node: - pic { - #interrupt-cells = <1>; - #address-cells = <0>; - compatible = "marvell,mv64360-pic"; - reg = <0x0 0x88>; - interrupt-controller; - }; - - - m) Marvell Discovery MPP (Multipurpose Pins) multiplexing nodes - - Represent the Discovery's MPP hardware - - Required properties: - - compatible : "marvell,mv64360-mpp" - - reg : Offset and length of the register set for this device - - Example Discovery MPP node: - mpp@f000 { - compatible = "marvell,mv64360-mpp"; - reg = <0xf000 0x10>; - }; - - - n) Marvell Discovery GPP (General Purpose Pins) nodes - - Represent the Discovery's GPP hardware - - Required properties: - - compatible : "marvell,mv64360-gpp" - - reg : Offset and length of the register set for this device - - Example Discovery GPP node: - gpp@f000 { - compatible = "marvell,mv64360-gpp"; - reg = <0xf100 0x20>; - }; - - - o) Marvell Discovery PCI host bridge node - - Represents the Discovery's PCI host bridge device. The properties - for this node conform to Rev 2.1 of the PCI Bus Binding to IEEE - 1275-1994. A typical value for the compatible property is - "marvell,mv64360-pci". - - Example Discovery PCI host bridge node - pci@80000000 { - #address-cells = <3>; - #size-cells = <2>; - #interrupt-cells = <1>; - device_type = "pci"; - compatible = "marvell,mv64360-pci"; - reg = <0xcf8 0x8>; - ranges = <0x01000000 0x0 0x0 - 0x88000000 0x0 0x01000000 - 0x02000000 0x0 0x80000000 - 0x80000000 0x0 0x08000000>; - bus-range = <0 255>; - clock-frequency = <66000000>; - interrupt-parent = <&PIC>; - interrupt-map-mask = <0xf800 0x0 0x0 0x7>; - interrupt-map = < - /* IDSEL 0x0a */ - 0x5000 0 0 1 &PIC 80 - 0x5000 0 0 2 &PIC 81 - 0x5000 0 0 3 &PIC 91 - 0x5000 0 0 4 &PIC 93 - - /* IDSEL 0x0b */ - 0x5800 0 0 1 &PIC 91 - 0x5800 0 0 2 &PIC 93 - 0x5800 0 0 3 &PIC 80 - 0x5800 0 0 4 &PIC 81 - - /* IDSEL 0x0c */ - 0x6000 0 0 1 &PIC 91 - 0x6000 0 0 2 &PIC 93 - 0x6000 0 0 3 &PIC 80 - 0x6000 0 0 4 &PIC 81 - - /* IDSEL 0x0d */ - 0x6800 0 0 1 &PIC 93 - 0x6800 0 0 2 &PIC 80 - 0x6800 0 0 3 &PIC 81 - 0x6800 0 0 4 &PIC 91 - >; - }; - - - p) Marvell Discovery CPU Error nodes - - Represent the Discovery's CPU error handler device. - - Required properties: - - compatible : "marvell,mv64360-cpu-error" - - reg : Offset and length of the register set for this device - - interrupts : the interrupt number for this device - - interrupt-parent : the phandle for the interrupt controller - that services interrupts for this device. - - Example Discovery CPU Error node: - cpu-error@0070 { - compatible = "marvell,mv64360-cpu-error"; - reg = <0x70 0x10 0x128 0x28>; - interrupts = <3>; - interrupt-parent = <&PIC>; - }; - - - q) Marvell Discovery SRAM Controller nodes - - Represent the Discovery's SRAM controller device. - - Required properties: - - compatible : "marvell,mv64360-sram-ctrl" - - reg : Offset and length of the register set for this device - - interrupts : the interrupt number for this device - - interrupt-parent : the phandle for the interrupt controller - that services interrupts for this device. - - Example Discovery SRAM Controller node: - sram-ctrl@0380 { - compatible = "marvell,mv64360-sram-ctrl"; - reg = <0x380 0x80>; - interrupts = <13>; - interrupt-parent = <&PIC>; - }; - - - r) Marvell Discovery PCI Error Handler nodes - - Represent the Discovery's PCI error handler device. - - Required properties: - - compatible : "marvell,mv64360-pci-error" - - reg : Offset and length of the register set for this device - - interrupts : the interrupt number for this device - - interrupt-parent : the phandle for the interrupt controller - that services interrupts for this device. - - Example Discovery PCI Error Handler node: - pci-error@1d40 { - compatible = "marvell,mv64360-pci-error"; - reg = <0x1d40 0x40 0xc28 0x4>; - interrupts = <12>; - interrupt-parent = <&PIC>; - }; - - - s) Marvell Discovery Memory Controller nodes - - Represent the Discovery's memory controller device. - - Required properties: - - compatible : "marvell,mv64360-mem-ctrl" - - reg : Offset and length of the register set for this device - - interrupts : the interrupt number for this device - - interrupt-parent : the phandle for the interrupt controller - that services interrupts for this device. - - Example Discovery Memory Controller node: - mem-ctrl@1400 { - compatible = "marvell,mv64360-mem-ctrl"; - reg = <0x1400 0x60>; - interrupts = <17>; - interrupt-parent = <&PIC>; - }; - - diff --git a/Documentation/powerpc/dts-bindings/mmc-spi-slot.txt b/Documentation/powerpc/dts-bindings/mmc-spi-slot.txt deleted file mode 100644 index c39ac2891951..000000000000 --- a/Documentation/powerpc/dts-bindings/mmc-spi-slot.txt +++ /dev/null @@ -1,23 +0,0 @@ -MMC/SD/SDIO slot directly connected to a SPI bus - -Required properties: -- compatible : should be "mmc-spi-slot". -- reg : should specify SPI address (chip-select number). -- spi-max-frequency : maximum frequency for this device (Hz). -- voltage-ranges : two cells are required, first cell specifies minimum - slot voltage (mV), second cell specifies maximum slot voltage (mV). - Several ranges could be specified. -- gpios : (optional) may specify GPIOs in this order: Card-Detect GPIO, - Write-Protect GPIO. - -Example: - - mmc-slot@0 { - compatible = "fsl,mpc8323rdb-mmc-slot", - "mmc-spi-slot"; - reg = <0>; - gpios = <&qe_pio_d 14 1 - &qe_pio_d 15 0>; - voltage-ranges = <3300 3300>; - spi-max-frequency = <50000000>; - }; diff --git a/Documentation/powerpc/dts-bindings/mtd-physmap.txt b/Documentation/powerpc/dts-bindings/mtd-physmap.txt deleted file mode 100644 index 80152cb567d9..000000000000 --- a/Documentation/powerpc/dts-bindings/mtd-physmap.txt +++ /dev/null @@ -1,90 +0,0 @@ -CFI or JEDEC memory-mapped NOR flash, MTD-RAM (NVRAM...) - -Flash chips (Memory Technology Devices) are often used for solid state -file systems on embedded devices. - - - compatible : should contain the specific model of mtd chip(s) - used, if known, followed by either "cfi-flash", "jedec-flash" - or "mtd-ram". - - reg : Address range(s) of the mtd chip(s) - It's possible to (optionally) define multiple "reg" tuples so that - non-identical chips can be described in one node. - - bank-width : Width (in bytes) of the bank. Equal to the - device width times the number of interleaved chips. - - device-width : (optional) Width of a single mtd chip. If - omitted, assumed to be equal to 'bank-width'. - - #address-cells, #size-cells : Must be present if the device has - sub-nodes representing partitions (see below). In this case - both #address-cells and #size-cells must be equal to 1. - -For JEDEC compatible devices, the following additional properties -are defined: - - - vendor-id : Contains the flash chip's vendor id (1 byte). - - device-id : Contains the flash chip's device id (1 byte). - -In addition to the information on the mtd bank itself, the -device tree may optionally contain additional information -describing partitions of the address space. This can be -used on platforms which have strong conventions about which -portions of a flash are used for what purposes, but which don't -use an on-flash partition table such as RedBoot. - -Each partition is represented as a sub-node of the mtd device. -Each node's name represents the name of the corresponding -partition of the mtd device. - -Flash partitions - - reg : The partition's offset and size within the mtd bank. - - label : (optional) The label / name for this partition. - If omitted, the label is taken from the node name (excluding - the unit address). - - read-only : (optional) This parameter, if present, is a hint to - Linux that this partition should only be mounted - read-only. This is usually used for flash partitions - containing early-boot firmware images or data which should not - be clobbered. - -Example: - - flash@ff000000 { - compatible = "amd,am29lv128ml", "cfi-flash"; - reg = ; - bank-width = <4>; - device-width = <1>; - #address-cells = <1>; - #size-cells = <1>; - fs@0 { - label = "fs"; - reg = <0 f80000>; - }; - firmware@f80000 { - label ="firmware"; - reg = ; - read-only; - }; - }; - -Here an example with multiple "reg" tuples: - - flash@f0000000,0 { - #address-cells = <1>; - #size-cells = <1>; - compatible = "intel,PC48F4400P0VB", "cfi-flash"; - reg = <0 0x00000000 0x02000000 - 0 0x02000000 0x02000000>; - bank-width = <2>; - partition@0 { - label = "test-part1"; - reg = <0 0x04000000>; - }; - }; - -An example using SRAM: - - sram@2,0 { - compatible = "samsung,k6f1616u6a", "mtd-ram"; - reg = <2 0 0x00200000>; - bank-width = <2>; - }; - diff --git a/Documentation/powerpc/dts-bindings/nintendo/gamecube.txt b/Documentation/powerpc/dts-bindings/nintendo/gamecube.txt deleted file mode 100644 index b558585b1aaf..000000000000 --- a/Documentation/powerpc/dts-bindings/nintendo/gamecube.txt +++ /dev/null @@ -1,109 +0,0 @@ - -Nintendo GameCube device tree -============================= - -1) The "flipper" node - - This node represents the multi-function "Flipper" chip, which packages - many of the devices found in the Nintendo GameCube. - - Required properties: - - - compatible : Should be "nintendo,flipper" - -1.a) The Video Interface (VI) node - - Represents the interface between the graphics processor and a external - video encoder. - - Required properties: - - - compatible : should be "nintendo,flipper-vi" - - reg : should contain the VI registers location and length - - interrupts : should contain the VI interrupt - -1.b) The Processor Interface (PI) node - - Represents the data and control interface between the main processor - and graphics and audio processor. - - Required properties: - - - compatible : should be "nintendo,flipper-pi" - - reg : should contain the PI registers location and length - -1.b.i) The "Flipper" interrupt controller node - - Represents the interrupt controller within the "Flipper" chip. - The node for the "Flipper" interrupt controller must be placed under - the PI node. - - Required properties: - - - compatible : should be "nintendo,flipper-pic" - -1.c) The Digital Signal Procesor (DSP) node - - Represents the digital signal processor interface, designed to offload - audio related tasks. - - Required properties: - - - compatible : should be "nintendo,flipper-dsp" - - reg : should contain the DSP registers location and length - - interrupts : should contain the DSP interrupt - -1.c.i) The Auxiliary RAM (ARAM) node - - Represents the non cpu-addressable ram designed mainly to store audio - related information. - The ARAM node must be placed under the DSP node. - - Required properties: - - - compatible : should be "nintendo,flipper-aram" - - reg : should contain the ARAM start (zero-based) and length - -1.d) The Disk Interface (DI) node - - Represents the interface used to communicate with mass storage devices. - - Required properties: - - - compatible : should be "nintendo,flipper-di" - - reg : should contain the DI registers location and length - - interrupts : should contain the DI interrupt - -1.e) The Audio Interface (AI) node - - Represents the interface to the external 16-bit stereo digital-to-analog - converter. - - Required properties: - - - compatible : should be "nintendo,flipper-ai" - - reg : should contain the AI registers location and length - - interrupts : should contain the AI interrupt - -1.f) The Serial Interface (SI) node - - Represents the interface to the four single bit serial interfaces. - The SI is a proprietary serial interface used normally to control gamepads. - It's NOT a RS232-type interface. - - Required properties: - - - compatible : should be "nintendo,flipper-si" - - reg : should contain the SI registers location and length - - interrupts : should contain the SI interrupt - -1.g) The External Interface (EXI) node - - Represents the multi-channel SPI-like interface. - - Required properties: - - - compatible : should be "nintendo,flipper-exi" - - reg : should contain the EXI registers location and length - - interrupts : should contain the EXI interrupt - diff --git a/Documentation/powerpc/dts-bindings/nintendo/wii.txt b/Documentation/powerpc/dts-bindings/nintendo/wii.txt deleted file mode 100644 index a7e155a023b8..000000000000 --- a/Documentation/powerpc/dts-bindings/nintendo/wii.txt +++ /dev/null @@ -1,184 +0,0 @@ - -Nintendo Wii device tree -======================== - -0) The root node - - This node represents the Nintendo Wii video game console. - - Required properties: - - - model : Should be "nintendo,wii" - - compatible : Should be "nintendo,wii" - -1) The "hollywood" node - - This node represents the multi-function "Hollywood" chip, which packages - many of the devices found in the Nintendo Wii. - - Required properties: - - - compatible : Should be "nintendo,hollywood" - -1.a) The Video Interface (VI) node - - Represents the interface between the graphics processor and a external - video encoder. - - Required properties: - - - compatible : should be "nintendo,hollywood-vi","nintendo,flipper-vi" - - reg : should contain the VI registers location and length - - interrupts : should contain the VI interrupt - -1.b) The Processor Interface (PI) node - - Represents the data and control interface between the main processor - and graphics and audio processor. - - Required properties: - - - compatible : should be "nintendo,hollywood-pi","nintendo,flipper-pi" - - reg : should contain the PI registers location and length - -1.b.i) The "Flipper" interrupt controller node - - Represents the "Flipper" interrupt controller within the "Hollywood" chip. - The node for the "Flipper" interrupt controller must be placed under - the PI node. - - Required properties: - - - #interrupt-cells : <1> - - compatible : should be "nintendo,flipper-pic" - - interrupt-controller - -1.c) The Digital Signal Procesor (DSP) node - - Represents the digital signal processor interface, designed to offload - audio related tasks. - - Required properties: - - - compatible : should be "nintendo,hollywood-dsp","nintendo,flipper-dsp" - - reg : should contain the DSP registers location and length - - interrupts : should contain the DSP interrupt - -1.d) The Serial Interface (SI) node - - Represents the interface to the four single bit serial interfaces. - The SI is a proprietary serial interface used normally to control gamepads. - It's NOT a RS232-type interface. - - Required properties: - - - compatible : should be "nintendo,hollywood-si","nintendo,flipper-si" - - reg : should contain the SI registers location and length - - interrupts : should contain the SI interrupt - -1.e) The Audio Interface (AI) node - - Represents the interface to the external 16-bit stereo digital-to-analog - converter. - - Required properties: - - - compatible : should be "nintendo,hollywood-ai","nintendo,flipper-ai" - - reg : should contain the AI registers location and length - - interrupts : should contain the AI interrupt - -1.f) The External Interface (EXI) node - - Represents the multi-channel SPI-like interface. - - Required properties: - - - compatible : should be "nintendo,hollywood-exi","nintendo,flipper-exi" - - reg : should contain the EXI registers location and length - - interrupts : should contain the EXI interrupt - -1.g) The Open Host Controller Interface (OHCI) nodes - - Represent the USB 1.x Open Host Controller Interfaces. - - Required properties: - - - compatible : should be "nintendo,hollywood-usb-ohci","usb-ohci" - - reg : should contain the OHCI registers location and length - - interrupts : should contain the OHCI interrupt - -1.h) The Enhanced Host Controller Interface (EHCI) node - - Represents the USB 2.0 Enhanced Host Controller Interface. - - Required properties: - - - compatible : should be "nintendo,hollywood-usb-ehci","usb-ehci" - - reg : should contain the EHCI registers location and length - - interrupts : should contain the EHCI interrupt - -1.i) The Secure Digital Host Controller Interface (SDHCI) nodes - - Represent the Secure Digital Host Controller Interfaces. - - Required properties: - - - compatible : should be "nintendo,hollywood-sdhci","sdhci" - - reg : should contain the SDHCI registers location and length - - interrupts : should contain the SDHCI interrupt - -1.j) The Inter-Processsor Communication (IPC) node - - Represent the Inter-Processor Communication interface. This interface - enables communications between the Broadway and the Starlet processors. - - - compatible : should be "nintendo,hollywood-ipc" - - reg : should contain the IPC registers location and length - - interrupts : should contain the IPC interrupt - -1.k) The "Hollywood" interrupt controller node - - Represents the "Hollywood" interrupt controller within the - "Hollywood" chip. - - Required properties: - - - #interrupt-cells : <1> - - compatible : should be "nintendo,hollywood-pic" - - reg : should contain the controller registers location and length - - interrupt-controller - - interrupts : should contain the cascade interrupt of the "flipper" pic - - interrupt-parent: should contain the phandle of the "flipper" pic - -1.l) The General Purpose I/O (GPIO) controller node - - Represents the dual access 32 GPIO controller interface. - - Required properties: - - - #gpio-cells : <2> - - compatible : should be "nintendo,hollywood-gpio" - - reg : should contain the IPC registers location and length - - gpio-controller - -1.m) The control node - - Represents the control interface used to setup several miscellaneous - settings of the "Hollywood" chip like boot memory mappings, resets, - disk interface mode, etc. - - Required properties: - - - compatible : should be "nintendo,hollywood-control" - - reg : should contain the control registers location and length - -1.n) The Disk Interface (DI) node - - Represents the interface used to communicate with mass storage devices. - - Required properties: - - - compatible : should be "nintendo,hollywood-di" - - reg : should contain the DI registers location and length - - interrupts : should contain the DI interrupt - diff --git a/Documentation/powerpc/dts-bindings/phy.txt b/Documentation/powerpc/dts-bindings/phy.txt deleted file mode 100644 index bb8c742eb8c5..000000000000 --- a/Documentation/powerpc/dts-bindings/phy.txt +++ /dev/null @@ -1,25 +0,0 @@ -PHY nodes - -Required properties: - - - device_type : Should be "ethernet-phy" - - interrupts : where a is the interrupt number and b is a - field that represents an encoding of the sense and level - information for the interrupt. This should be encoded based on - the information in section 2) depending on the type of interrupt - controller you have. - - interrupt-parent : the phandle for the interrupt controller that - services interrupts for this device. - - reg : The ID number for the phy, usually a small integer - - linux,phandle : phandle for this node; likely referenced by an - ethernet controller node. - -Example: - -ethernet-phy@0 { - linux,phandle = <2452000> - interrupt-parent = <40000>; - interrupts = <35 1>; - reg = <0>; - device_type = "ethernet-phy"; -}; diff --git a/Documentation/powerpc/dts-bindings/spi-bus.txt b/Documentation/powerpc/dts-bindings/spi-bus.txt deleted file mode 100644 index e782add2e457..000000000000 --- a/Documentation/powerpc/dts-bindings/spi-bus.txt +++ /dev/null @@ -1,57 +0,0 @@ -SPI (Serial Peripheral Interface) busses - -SPI busses can be described with a node for the SPI master device -and a set of child nodes for each SPI slave on the bus. For this -discussion, it is assumed that the system's SPI controller is in -SPI master mode. This binding does not describe SPI controllers -in slave mode. - -The SPI master node requires the following properties: -- #address-cells - number of cells required to define a chip select - address on the SPI bus. -- #size-cells - should be zero. -- compatible - name of SPI bus controller following generic names - recommended practice. -No other properties are required in the SPI bus node. It is assumed -that a driver for an SPI bus device will understand that it is an SPI bus. -However, the binding does not attempt to define the specific method for -assigning chip select numbers. Since SPI chip select configuration is -flexible and non-standardized, it is left out of this binding with the -assumption that board specific platform code will be used to manage -chip selects. Individual drivers can define additional properties to -support describing the chip select layout. - -SPI slave nodes must be children of the SPI master node and can -contain the following properties. -- reg - (required) chip select address of device. -- compatible - (required) name of SPI device following generic names - recommended practice -- spi-max-frequency - (required) Maximum SPI clocking speed of device in Hz -- spi-cpol - (optional) Empty property indicating device requires - inverse clock polarity (CPOL) mode -- spi-cpha - (optional) Empty property indicating device requires - shifted clock phase (CPHA) mode -- spi-cs-high - (optional) Empty property indicating device requires - chip select active high - -SPI example for an MPC5200 SPI bus: - spi@f00 { - #address-cells = <1>; - #size-cells = <0>; - compatible = "fsl,mpc5200b-spi","fsl,mpc5200-spi"; - reg = <0xf00 0x20>; - interrupts = <2 13 0 2 14 0>; - interrupt-parent = <&mpc5200_pic>; - - ethernet-switch@0 { - compatible = "micrel,ks8995m"; - spi-max-frequency = <1000000>; - reg = <0>; - }; - - codec@1 { - compatible = "ti,tlv320aic26"; - spi-max-frequency = <100000>; - reg = <1>; - }; - }; diff --git a/Documentation/powerpc/dts-bindings/usb-ehci.txt b/Documentation/powerpc/dts-bindings/usb-ehci.txt deleted file mode 100644 index fa18612f757b..000000000000 --- a/Documentation/powerpc/dts-bindings/usb-ehci.txt +++ /dev/null @@ -1,25 +0,0 @@ -USB EHCI controllers - -Required properties: - - compatible : should be "usb-ehci". - - reg : should contain at least address and length of the standard EHCI - register set for the device. Optional platform-dependent registers - (debug-port or other) can be also specified here, but only after - definition of standard EHCI registers. - - interrupts : one EHCI interrupt should be described here. -If device registers are implemented in big endian mode, the device -node should have "big-endian-regs" property. -If controller implementation operates with big endian descriptors, -"big-endian-desc" property should be specified. -If both big endian registers and descriptors are used by the controller -implementation, "big-endian" property can be specified instead of having -both "big-endian-regs" and "big-endian-desc". - -Example (Sequoia 440EPx): - ehci@e0000300 { - compatible = "ibm,usb-ehci-440epx", "usb-ehci"; - interrupt-parent = <&UIC0>; - interrupts = <1a 4>; - reg = <0 e0000300 90 0 e0000390 70>; - big-endian; - }; diff --git a/Documentation/powerpc/dts-bindings/xilinx.txt b/Documentation/powerpc/dts-bindings/xilinx.txt deleted file mode 100644 index 299d0923537b..000000000000 --- a/Documentation/powerpc/dts-bindings/xilinx.txt +++ /dev/null @@ -1,306 +0,0 @@ - d) Xilinx IP cores - - The Xilinx EDK toolchain ships with a set of IP cores (devices) for use - in Xilinx Spartan and Virtex FPGAs. The devices cover the whole range - of standard device types (network, serial, etc.) and miscellaneous - devices (gpio, LCD, spi, etc). Also, since these devices are - implemented within the fpga fabric every instance of the device can be - synthesised with different options that change the behaviour. - - Each IP-core has a set of parameters which the FPGA designer can use to - control how the core is synthesized. Historically, the EDK tool would - extract the device parameters relevant to device drivers and copy them - into an 'xparameters.h' in the form of #define symbols. This tells the - device drivers how the IP cores are configured, but it requires the kernel - to be recompiled every time the FPGA bitstream is resynthesized. - - The new approach is to export the parameters into the device tree and - generate a new device tree each time the FPGA bitstream changes. The - parameters which used to be exported as #defines will now become - properties of the device node. In general, device nodes for IP-cores - will take the following form: - - (name): (generic-name)@(base-address) { - compatible = "xlnx,(ip-core-name)-(HW_VER)" - [, (list of compatible devices), ...]; - reg = <(baseaddr) (size)>; - interrupt-parent = <&interrupt-controller-phandle>; - interrupts = < ... >; - xlnx,(parameter1) = "(string-value)"; - xlnx,(parameter2) = <(int-value)>; - }; - - (generic-name): an open firmware-style name that describes the - generic class of device. Preferably, this is one word, such - as 'serial' or 'ethernet'. - (ip-core-name): the name of the ip block (given after the BEGIN - directive in system.mhs). Should be in lowercase - and all underscores '_' converted to dashes '-'. - (name): is derived from the "PARAMETER INSTANCE" value. - (parameter#): C_* parameters from system.mhs. The C_ prefix is - dropped from the parameter name, the name is converted - to lowercase and all underscore '_' characters are - converted to dashes '-'. - (baseaddr): the baseaddr parameter value (often named C_BASEADDR). - (HW_VER): from the HW_VER parameter. - (size): the address range size (often C_HIGHADDR - C_BASEADDR + 1). - - Typically, the compatible list will include the exact IP core version - followed by an older IP core version which implements the same - interface or any other device with the same interface. - - 'reg', 'interrupt-parent' and 'interrupts' are all optional properties. - - For example, the following block from system.mhs: - - BEGIN opb_uartlite - PARAMETER INSTANCE = opb_uartlite_0 - PARAMETER HW_VER = 1.00.b - PARAMETER C_BAUDRATE = 115200 - PARAMETER C_DATA_BITS = 8 - PARAMETER C_ODD_PARITY = 0 - PARAMETER C_USE_PARITY = 0 - PARAMETER C_CLK_FREQ = 50000000 - PARAMETER C_BASEADDR = 0xEC100000 - PARAMETER C_HIGHADDR = 0xEC10FFFF - BUS_INTERFACE SOPB = opb_7 - PORT OPB_Clk = CLK_50MHz - PORT Interrupt = opb_uartlite_0_Interrupt - PORT RX = opb_uartlite_0_RX - PORT TX = opb_uartlite_0_TX - PORT OPB_Rst = sys_bus_reset_0 - END - - becomes the following device tree node: - - opb_uartlite_0: serial@ec100000 { - device_type = "serial"; - compatible = "xlnx,opb-uartlite-1.00.b"; - reg = ; - interrupt-parent = <&opb_intc_0>; - interrupts = <1 0>; // got this from the opb_intc parameters - current-speed = ; // standard serial device prop - clock-frequency = ; // standard serial device prop - xlnx,data-bits = <8>; - xlnx,odd-parity = <0>; - xlnx,use-parity = <0>; - }; - - Some IP cores actually implement 2 or more logical devices. In - this case, the device should still describe the whole IP core with - a single node and add a child node for each logical device. The - ranges property can be used to translate from parent IP-core to the - registers of each device. In addition, the parent node should be - compatible with the bus type 'xlnx,compound', and should contain - #address-cells and #size-cells, as with any other bus. (Note: this - makes the assumption that both logical devices have the same bus - binding. If this is not true, then separate nodes should be used - for each logical device). The 'cell-index' property can be used to - enumerate logical devices within an IP core. For example, the - following is the system.mhs entry for the dual ps2 controller found - on the ml403 reference design. - - BEGIN opb_ps2_dual_ref - PARAMETER INSTANCE = opb_ps2_dual_ref_0 - PARAMETER HW_VER = 1.00.a - PARAMETER C_BASEADDR = 0xA9000000 - PARAMETER C_HIGHADDR = 0xA9001FFF - BUS_INTERFACE SOPB = opb_v20_0 - PORT Sys_Intr1 = ps2_1_intr - PORT Sys_Intr2 = ps2_2_intr - PORT Clkin1 = ps2_clk_rx_1 - PORT Clkin2 = ps2_clk_rx_2 - PORT Clkpd1 = ps2_clk_tx_1 - PORT Clkpd2 = ps2_clk_tx_2 - PORT Rx1 = ps2_d_rx_1 - PORT Rx2 = ps2_d_rx_2 - PORT Txpd1 = ps2_d_tx_1 - PORT Txpd2 = ps2_d_tx_2 - END - - It would result in the following device tree nodes: - - opb_ps2_dual_ref_0: opb-ps2-dual-ref@a9000000 { - #address-cells = <1>; - #size-cells = <1>; - compatible = "xlnx,compound"; - ranges = <0 a9000000 2000>; - // If this device had extra parameters, then they would - // go here. - ps2@0 { - compatible = "xlnx,opb-ps2-dual-ref-1.00.a"; - reg = <0 40>; - interrupt-parent = <&opb_intc_0>; - interrupts = <3 0>; - cell-index = <0>; - }; - ps2@1000 { - compatible = "xlnx,opb-ps2-dual-ref-1.00.a"; - reg = <1000 40>; - interrupt-parent = <&opb_intc_0>; - interrupts = <3 0>; - cell-index = <0>; - }; - }; - - Also, the system.mhs file defines bus attachments from the processor - to the devices. The device tree structure should reflect the bus - attachments. Again an example; this system.mhs fragment: - - BEGIN ppc405_virtex4 - PARAMETER INSTANCE = ppc405_0 - PARAMETER HW_VER = 1.01.a - BUS_INTERFACE DPLB = plb_v34_0 - BUS_INTERFACE IPLB = plb_v34_0 - END - - BEGIN opb_intc - PARAMETER INSTANCE = opb_intc_0 - PARAMETER HW_VER = 1.00.c - PARAMETER C_BASEADDR = 0xD1000FC0 - PARAMETER C_HIGHADDR = 0xD1000FDF - BUS_INTERFACE SOPB = opb_v20_0 - END - - BEGIN opb_uart16550 - PARAMETER INSTANCE = opb_uart16550_0 - PARAMETER HW_VER = 1.00.d - PARAMETER C_BASEADDR = 0xa0000000 - PARAMETER C_HIGHADDR = 0xa0001FFF - BUS_INTERFACE SOPB = opb_v20_0 - END - - BEGIN plb_v34 - PARAMETER INSTANCE = plb_v34_0 - PARAMETER HW_VER = 1.02.a - END - - BEGIN plb_bram_if_cntlr - PARAMETER INSTANCE = plb_bram_if_cntlr_0 - PARAMETER HW_VER = 1.00.b - PARAMETER C_BASEADDR = 0xFFFF0000 - PARAMETER C_HIGHADDR = 0xFFFFFFFF - BUS_INTERFACE SPLB = plb_v34_0 - END - - BEGIN plb2opb_bridge - PARAMETER INSTANCE = plb2opb_bridge_0 - PARAMETER HW_VER = 1.01.a - PARAMETER C_RNG0_BASEADDR = 0x20000000 - PARAMETER C_RNG0_HIGHADDR = 0x3FFFFFFF - PARAMETER C_RNG1_BASEADDR = 0x60000000 - PARAMETER C_RNG1_HIGHADDR = 0x7FFFFFFF - PARAMETER C_RNG2_BASEADDR = 0x80000000 - PARAMETER C_RNG2_HIGHADDR = 0xBFFFFFFF - PARAMETER C_RNG3_BASEADDR = 0xC0000000 - PARAMETER C_RNG3_HIGHADDR = 0xDFFFFFFF - BUS_INTERFACE SPLB = plb_v34_0 - BUS_INTERFACE MOPB = opb_v20_0 - END - - Gives this device tree (some properties removed for clarity): - - plb@0 { - #address-cells = <1>; - #size-cells = <1>; - compatible = "xlnx,plb-v34-1.02.a"; - device_type = "ibm,plb"; - ranges; // 1:1 translation - - plb_bram_if_cntrl_0: bram@ffff0000 { - reg = ; - } - - opb@20000000 { - #address-cells = <1>; - #size-cells = <1>; - ranges = <20000000 20000000 20000000 - 60000000 60000000 20000000 - 80000000 80000000 40000000 - c0000000 c0000000 20000000>; - - opb_uart16550_0: serial@a0000000 { - reg = ; - }; - - opb_intc_0: interrupt-controller@d1000fc0 { - reg = ; - }; - }; - }; - - That covers the general approach to binding xilinx IP cores into the - device tree. The following are bindings for specific devices: - - i) Xilinx ML300 Framebuffer - - Simple framebuffer device from the ML300 reference design (also on the - ML403 reference design as well as others). - - Optional properties: - - resolution = : pixel resolution of framebuffer. Some - implementations use a different resolution. - Default is - - virt-resolution = : Size of framebuffer in memory. - Default is . - - rotate-display (empty) : rotate display 180 degrees. - - ii) Xilinx SystemACE - - The Xilinx SystemACE device is used to program FPGAs from an FPGA - bitstream stored on a CF card. It can also be used as a generic CF - interface device. - - Optional properties: - - 8-bit (empty) : Set this property for SystemACE in 8 bit mode - - iii) Xilinx EMAC and Xilinx TEMAC - - Xilinx Ethernet devices. In addition to general xilinx properties - listed above, nodes for these devices should include a phy-handle - property, and may include other common network device properties - like local-mac-address. - - iv) Xilinx Uartlite - - Xilinx uartlite devices are simple fixed speed serial ports. - - Required properties: - - current-speed : Baud rate of uartlite - - v) Xilinx hwicap - - Xilinx hwicap devices provide access to the configuration logic - of the FPGA through the Internal Configuration Access Port - (ICAP). The ICAP enables partial reconfiguration of the FPGA, - readback of the configuration information, and some control over - 'warm boots' of the FPGA fabric. - - Required properties: - - xlnx,family : The family of the FPGA, necessary since the - capabilities of the underlying ICAP hardware - differ between different families. May be - 'virtex2p', 'virtex4', or 'virtex5'. - - vi) Xilinx Uart 16550 - - Xilinx UART 16550 devices are very similar to the NS16550 but with - different register spacing and an offset from the base address. - - Required properties: - - clock-frequency : Frequency of the clock input - - reg-offset : A value of 3 is required - - reg-shift : A value of 2 is required - - vii) Xilinx USB Host controller - - The Xilinx USB host controller is EHCI compatible but with a different - base address for the EHCI registers, and it is always a big-endian - USB Host controller. The hardware can be configured as high speed only, - or high speed/full speed hybrid. - - Required properties: - - xlnx,support-usb-fs: A value 0 means the core is built as high speed - only. A value 1 means the core also supports - full speed devices. - -- cgit v1.2.3 From cf4e5c6e8d2b87ae8e61168a7dc860d68c578745 Mon Sep 17 00:00:00 2001 From: Grant Likely Date: Mon, 31 Jan 2011 00:12:26 -0700 Subject: dt: Remove obsolete description of powerpc boot interface 32 and 64 bit powerpc support has been merged for a while now, but the booting-without-of.txt document still describes 32 bit as not supporting multiplatform, which is no longer true. This patch fixes the documentation. Also remove references to powerpc-specific details outside of section I in preparation to add details for other architectures. v3: cleaned up a lot more powerpc-isms and updated text to reflect current usage conventions. Signed-off-by: Grant Likely --- Documentation/devicetree/booting-without-of.txt | 165 ++++++++---------------- 1 file changed, 54 insertions(+), 111 deletions(-) (limited to 'Documentation') diff --git a/Documentation/devicetree/booting-without-of.txt b/Documentation/devicetree/booting-without-of.txt index 7400d7555dc3..28b1c9d3d351 100644 --- a/Documentation/devicetree/booting-without-of.txt +++ b/Documentation/devicetree/booting-without-of.txt @@ -13,7 +13,6 @@ Table of Contents I - Introduction 1) Entry point for arch/powerpc - 2) Board support II - The DT block format 1) Header @@ -41,13 +40,6 @@ Table of Contents VI - System-on-a-chip devices and nodes 1) Defining child nodes of an SOC 2) Representing devices without a current OF specification - a) PHY nodes - b) Interrupt controllers - c) 4xx/Axon EMAC ethernet nodes - d) Xilinx IP cores - e) USB EHCI controllers - f) MDIO on GPIOs - g) SPI busses VII - Specifying interrupt information for devices 1) interrupts property @@ -123,7 +115,7 @@ Revision Information I - Introduction ================ -During the recent development of the Linux/ppc64 kernel, and more +During the development of the Linux/ppc64 kernel, and more specifically, the addition of new platform types outside of the old IBM pSeries/iSeries pair, it was decided to enforce some strict rules regarding the kernel entry and bootloader <-> kernel interfaces, in @@ -146,7 +138,7 @@ section III, but, for example, the kernel does not require you to create a node for every PCI device in the system. It is a requirement to have a node for PCI host bridges in order to provide interrupt routing informations and memory/IO ranges, among others. It is also -recommended to define nodes for on chip devices and other busses that +recommended to define nodes for on chip devices and other buses that don't specifically fit in an existing OF specification. This creates a great flexibility in the way the kernel can then probe those and match drivers to device, without having to hard code all sorts of tables. It @@ -158,7 +150,7 @@ it with special cases. 1) Entry point for arch/powerpc ------------------------------- - There is one and one single entry point to the kernel, at the start + There is one single entry point to the kernel, at the start of the kernel image. That entry point supports two calling conventions: @@ -210,12 +202,6 @@ it with special cases. with all CPUs. The way to do that with method b) will be described in a later revision of this document. - -2) Board support ----------------- - -64-bit kernels: - Board supports (platforms) are not exclusive config options. An arbitrary set of board supports can be built in a single kernel image. The kernel will "know" what set of functions to use for a @@ -234,48 +220,11 @@ it with special cases. containing the various callbacks that the generic code will use to get to your platform specific code - c) Add a reference to your "ppc_md" structure in the - "machines" table in arch/powerpc/kernel/setup_64.c if you are - a 64-bit platform. - - d) request and get assigned a platform number (see PLATFORM_* - constants in arch/powerpc/include/asm/processor.h - -32-bit embedded kernels: - - Currently, board support is essentially an exclusive config option. - The kernel is configured for a single platform. Part of the reason - for this is to keep kernels on embedded systems small and efficient; - part of this is due to the fact the code is already that way. In the - future, a kernel may support multiple platforms, but only if the + A kernel image may support multiple platforms, but only if the platforms feature the same core architecture. A single kernel build cannot support both configurations with Book E and configurations with classic Powerpc architectures. - 32-bit embedded platforms that are moved into arch/powerpc using a - flattened device tree should adopt the merged tree practice of - setting ppc_md up dynamically, even though the kernel is currently - built with support for only a single platform at a time. This allows - unification of the setup code, and will make it easier to go to a - multiple-platform-support model in the future. - -NOTE: I believe the above will be true once Ben's done with the merge -of the boot sequences.... someone speak up if this is wrong! - - To add a 32-bit embedded platform support, follow the instructions - for 64-bit platforms above, with the exception that the Kconfig - option should be set up such that the kernel builds exclusively for - the platform selected. The processor type for the platform should - enable another config option to select the specific board - supported. - -NOTE: If Ben doesn't merge the setup files, may need to change this to -point to setup_32.c - - - I will describe later the boot process and various callbacks that - your platform should implement. - II - The DT block format ======================== @@ -300,8 +249,8 @@ the block to RAM before passing it to the kernel. 1) Header --------- - The kernel is entered with r3 pointing to an area of memory that is - roughly described in arch/powerpc/include/asm/prom.h by the structure + The kernel is passed the physical address pointing to an area of memory + that is roughly described in include/linux/of_fdt.h by the structure boot_param_header: struct boot_param_header { @@ -339,7 +288,7 @@ struct boot_param_header { All values in this header are in big endian format, the various fields in this header are defined more precisely below. All "offset" values are in bytes from the start of the header; that is - from the value of r3. + from the physical base address of the device tree block. - magic @@ -437,7 +386,7 @@ struct boot_param_header { ------------------------------ - r3 -> | struct boot_param_header | + base -> | struct boot_param_header | ------------------------------ | (alignment gap) (*) | ------------------------------ @@ -457,7 +406,7 @@ struct boot_param_header { -----> ------------------------------ | | - --- (r3 + totalsize) + --- (base + totalsize) (*) The alignment gaps are not necessarily present; their presence and size are dependent on the various alignment requirements of @@ -500,7 +449,7 @@ the device-tree structure. It is typically used to represent "path" in the device-tree. More details about the actual format of these will be below. -The kernel powerpc generic code does not make any formal use of the +The kernel generic code does not make any formal use of the unit address (though some board support code may do) so the only real requirement here for the unit address is to ensure uniqueness of the node unit name at a given level of the tree. Nodes with no notion @@ -518,20 +467,21 @@ path to the root node is "/". Every node which actually represents an actual device (that is, a node which isn't only a virtual "container" for more nodes, like "/cpus" -is) is also required to have a "device_type" property indicating the -type of node . +is) is also required to have a "compatible" property indicating the +specific hardware and an optional list of devices it is fully +backwards compatible with. Finally, every node that can be referenced from a property in another -node is required to have a "linux,phandle" property. Real open -firmware implementations provide a unique "phandle" value for every -node that the "prom_init()" trampoline code turns into -"linux,phandle" properties. However, this is made optional if the -flattened device tree is used directly. An example of a node +node is required to have either a "phandle" or a "linux,phandle" +property. Real Open Firmware implementations provide a unique +"phandle" value for every node that the "prom_init()" trampoline code +turns into "linux,phandle" properties. However, this is made optional +if the flattened device tree is used directly. An example of a node referencing another node via "phandle" is when laying out the interrupt tree which will be described in a further version of this document. -This "linux, phandle" property is a 32-bit value that uniquely +The "phandle" property is a 32-bit value that uniquely identifies a node. You are free to use whatever values or system of values, internal pointers, or whatever to generate these, the only requirement is that every node for which you provide that property has @@ -694,7 +644,7 @@ made of 3 cells, the bottom two containing the actual address itself while the top cell contains address space indication, flags, and pci bus & device numbers. -For busses that support dynamic allocation, it's the accepted practice +For buses that support dynamic allocation, it's the accepted practice to then not provide the address in "reg" (keep it 0) though while providing a flag indicating the address is dynamically allocated, and then, to provide a separate "assigned-addresses" property that @@ -711,7 +661,7 @@ prom_parse.c file of the recent kernels for your bus type. The "reg" property only defines addresses and sizes (if #size-cells is non-0) within a given bus. In order to translate addresses upward (that is into parent bus addresses, and possibly into CPU physical -addresses), all busses must contain a "ranges" property. If the +addresses), all buses must contain a "ranges" property. If the "ranges" property is missing at a given level, it's assumed that translation isn't possible, i.e., the registers are not visible on the parent bus. The format of the "ranges" property for a bus is a list @@ -727,9 +677,9 @@ example, for a PCI host controller, that would be a CPU address. For a PCI<->ISA bridge, that would be a PCI address. It defines the base address in the parent bus where the beginning of that range is mapped. -For a new 64-bit powerpc board, I recommend either the 2/2 format or +For new 64-bit board support, I recommend either the 2/2 format or Apple's 2/1 format which is slightly more compact since sizes usually -fit in a single 32-bit word. New 32-bit powerpc boards should use a +fit in a single 32-bit word. New 32-bit board support should use a 1/1 format, unless the processor supports physical addresses greater than 32-bits, in which case a 2/1 format is recommended. @@ -754,7 +704,7 @@ of their actual names. While earlier users of Open Firmware like OldWorld macintoshes tended to use the actual device name for the "name" property, it's nowadays considered a good practice to use a name that is closer to the device -class (often equal to device_type). For example, nowadays, ethernet +class (often equal to device_type). For example, nowadays, Ethernet controllers are named "ethernet", an additional "model" property defining precisely the chip type/model, and "compatible" property defining the family in case a single driver can driver more than one @@ -772,7 +722,7 @@ is present). 4) Note about node and property names and character set ------------------------------------------------------- -While open firmware provides more flexible usage of 8859-1, this +While Open Firmware provides more flexible usage of 8859-1, this specification enforces more strict rules. Nodes and properties should be comprised only of ASCII characters 'a' to 'z', '0' to '9', ',', '.', '_', '+', '#', '?', and '-'. Node names additionally @@ -792,7 +742,7 @@ address which can extend beyond that limit. -------------------------------- These are all that are currently required. However, it is strongly recommended that you expose PCI host bridges as documented in the - PCI binding to open firmware, and your interrupt tree as documented + PCI binding to Open Firmware, and your interrupt tree as documented in OF interrupt tree specification. a) The root node @@ -802,20 +752,12 @@ address which can extend beyond that limit. - model : this is your board name/model - #address-cells : address representation for "root" devices - #size-cells: the size representation for "root" devices - - device_type : This property shouldn't be necessary. However, if - you decide to create a device_type for your root node, make sure it - is _not_ "chrp" unless your platform is a pSeries or PAPR compliant - one for 64-bit, or a CHRP-type machine for 32-bit as this will - matched by the kernel this way. - - Additionally, some recommended properties are: - - compatible : the board "family" generally finds its way here, for example, if you have 2 board models with a similar layout, that typically get driven by the same platform code in the - kernel, you would use a different "model" property but put a - value in "compatible". The kernel doesn't directly use that - value but it is generally useful. + kernel, you would specify the exact board model in the + compatible property followed by an entry that represents the SoC + model. The root node is also generally where you add additional properties specific to your board like the serial number if any, that sort of @@ -841,8 +783,11 @@ address which can extend beyond that limit. So under /cpus, you are supposed to create a node for every CPU on the machine. There is no specific restriction on the name of the - CPU, though It's common practice to call it PowerPC,. For + CPU, though it's common to call it ,. For example, Apple uses PowerPC,G5 while IBM uses PowerPC,970FX. + However, the Generic Names convention suggests that it would be + better to simply use 'cpu' for each cpu node and use the compatible + property to identify the specific cpu core. Required properties: @@ -923,7 +868,7 @@ compatibility. e) The /chosen node - This node is a bit "special". Normally, that's where open firmware + This node is a bit "special". Normally, that's where Open Firmware puts some variable environment information, like the arguments, or the default input/output devices. @@ -940,11 +885,7 @@ compatibility. console device if any. Typically, if you have serial devices on your board, you may want to put the full path to the one set as the default console in the firmware here, for the kernel to pick - it up as its own default console. If you look at the function - set_preferred_console() in arch/ppc64/kernel/setup.c, you'll see - that the kernel tries to find out the default console and has - knowledge of various types like 8250 serial ports. You may want - to extend this function to add your own. + it up as its own default console. Note that u-boot creates and fills in the chosen node for platforms that use it. @@ -955,23 +896,23 @@ compatibility. f) the /soc node - This node is used to represent a system-on-a-chip (SOC) and must be - present if the processor is a SOC. The top-level soc node contains - information that is global to all devices on the SOC. The node name - should contain a unit address for the SOC, which is the base address - of the memory-mapped register set for the SOC. The name of an soc + This node is used to represent a system-on-a-chip (SoC) and must be + present if the processor is a SoC. The top-level soc node contains + information that is global to all devices on the SoC. The node name + should contain a unit address for the SoC, which is the base address + of the memory-mapped register set for the SoC. The name of an SoC node should start with "soc", and the remainder of the name should represent the part number for the soc. For example, the MPC8540's soc node would be called "soc8540". Required properties: - - device_type : Should be "soc" - ranges : Should be defined as specified in 1) to describe the - translation of SOC addresses for memory mapped SOC registers. - - bus-frequency: Contains the bus frequency for the SOC node. + translation of SoC addresses for memory mapped SoC registers. + - bus-frequency: Contains the bus frequency for the SoC node. Typically, the value of this field is filled in by the boot loader. + - compatible : Exact model of the SoC Recommended properties: @@ -1155,12 +1096,13 @@ while all this has been defined and implemented. - An example of code for iterating nodes & retrieving properties directly from the flattened tree format can be found in the kernel - file arch/ppc64/kernel/prom.c, look at scan_flat_dt() function, + file drivers/of/fdt.c. Look at the of_scan_flat_dt() function, its usage in early_init_devtree(), and the corresponding various early_init_dt_scan_*() callbacks. That code can be re-used in a GPL bootloader, and as the author of that code, I would be happy to discuss possible free licensing to any vendor who wishes to integrate all or part of this code into a non-GPL bootloader. + (reference needed; who is 'I' here? ---gcl Jan 31, 2011) @@ -1203,18 +1145,19 @@ MPC8540. 2) Representing devices without a current OF specification ---------------------------------------------------------- -Currently, there are many devices on SOCs that do not have a standard -representation pre-defined as part of the open firmware -specifications, mainly because the boards that contain these SOCs are -not currently booted using open firmware. This section contains -descriptions for the SOC devices for which new nodes have been -defined; this list will expand as more and more SOC-containing -platforms are moved over to use the flattened-device-tree model. +Currently, there are many devices on SoCs that do not have a standard +representation defined as part of the Open Firmware specifications, +mainly because the boards that contain these SoCs are not currently +booted using Open Firmware. Binding documentation for new devices +should be added to the Documentation/devicetree/bindings directory. +That directory will expand as device tree support is added to more and +more SoCs. + VII - Specifying interrupt information for devices =================================================== -The device tree represents the busses and devices of a hardware +The device tree represents the buses and devices of a hardware system in a form similar to the physical bus topology of the hardware. -- cgit v1.2.3 From 9830fcd6f6a4781d8b46d2b35c13b39f30915c63 Mon Sep 17 00:00:00 2001 From: Grant Likely Date: Mon, 31 Jan 2011 00:09:58 -0700 Subject: dt: add documentation of ARM dt boot interface v3: added details to Documentation/arm/Booting Signed-off-by: Grant Likely --- Documentation/arm/Booting | 33 +++++++++++++++++--- Documentation/devicetree/booting-without-of.txt | 40 +++++++++++++++++++++++++ 2 files changed, 69 insertions(+), 4 deletions(-) (limited to 'Documentation') diff --git a/Documentation/arm/Booting b/Documentation/arm/Booting index 76850295af8f..4e686a2ed91e 100644 --- a/Documentation/arm/Booting +++ b/Documentation/arm/Booting @@ -65,13 +65,19 @@ looks at the connected hardware is beyond the scope of this document. The boot loader must ultimately be able to provide a MACH_TYPE_xxx value to the kernel. (see linux/arch/arm/tools/mach-types). - -4. Setup the kernel tagged list -------------------------------- +4. Setup boot data +------------------ Existing boot loaders: OPTIONAL, HIGHLY RECOMMENDED New boot loaders: MANDATORY +The boot loader must provide either a tagged list or a dtb image for +passing configuration data to the kernel. The physical address of the +boot data is passed to the kernel in register r2. + +4a. Setup the kernel tagged list +-------------------------------- + The boot loader must create and initialise the kernel tagged list. A valid tagged list starts with ATAG_CORE and ends with ATAG_NONE. The ATAG_CORE tag may or may not be empty. An empty ATAG_CORE tag @@ -101,6 +107,24 @@ The tagged list must be placed in a region of memory where neither the kernel decompressor nor initrd 'bootp' program will overwrite it. The recommended placement is in the first 16KiB of RAM. +4b. Setup the device tree +------------------------- + +The boot loader must load a device tree image (dtb) into system ram +at a 64bit aligned address and initialize it with the boot data. The +dtb format is documented in Documentation/devicetree/booting-without-of.txt. +The kernel will look for the dtb magic value of 0xd00dfeed at the dtb +physical address to determine if a dtb has been passed instead of a +tagged list. + +The boot loader must pass at a minimum the size and location of the +system memory, and the root filesystem location. The dtb must be +placed in a region of memory where the kernel decompressor will not +overwrite it. The recommended placement is in the first 16KiB of RAM +with the caveat that it may not be located at physical address 0 since +the kernel interprets a value of 0 in r2 to mean neither a tagged list +nor a dtb were passed. + 5. Calling the kernel image --------------------------- @@ -125,7 +149,8 @@ In either case, the following conditions must be met: - CPU register settings r0 = 0, r1 = machine type number discovered in (3) above. - r2 = physical address of tagged list in system RAM. + r2 = physical address of tagged list in system RAM, or + physical address of device tree block (dtb) in system RAM - CPU mode All forms of interrupts must be disabled (IRQs and FIQs) diff --git a/Documentation/devicetree/booting-without-of.txt b/Documentation/devicetree/booting-without-of.txt index 28b1c9d3d351..9381a1481027 100644 --- a/Documentation/devicetree/booting-without-of.txt +++ b/Documentation/devicetree/booting-without-of.txt @@ -13,6 +13,7 @@ Table of Contents I - Introduction 1) Entry point for arch/powerpc + 2) Entry point for arch/arm II - The DT block format 1) Header @@ -225,6 +226,45 @@ it with special cases. cannot support both configurations with Book E and configurations with classic Powerpc architectures. +2) Entry point for arch/arm +--------------------------- + + There is one single entry point to the kernel, at the start + of the kernel image. That entry point supports two calling + conventions. A summary of the interface is described here. A full + description of the boot requirements is documented in + Documentation/arm/Booting + + a) ATAGS interface. Minimal information is passed from firmware + to the kernel with a tagged list of predefined parameters. + + r0 : 0 + + r1 : Machine type number + + r2 : Physical address of tagged list in system RAM + + b) Entry with a flattened device-tree block. Firmware loads the + physical address of the flattened device tree block (dtb) into r2, + r1 is not used, but it is considered good practise to use a valid + machine number as described in Documentation/arm/Booting. + + r0 : 0 + + r1 : Valid machine type number. When using a device tree, + a single machine type number will often be assigned to + represent a class or family of SoCs. + + r2 : physical pointer to the device-tree block + (defined in chapter II) in RAM. Device tree can be located + anywhere in system RAM, but it should be aligned on a 32 bit + boundary. + + The kernel will differentiate between ATAGS and device tree booting by + reading the memory pointed to by r1 and looking for either the flattened + device tree block magic value (0xd00dfeed) or the ATAG_CORE value at + offset 0x4 from r2 (0x54410001). + II - The DT block format ======================== -- cgit v1.2.3 From 1e1dbb259c79b38a542c1c4c00fd8dfe936b183b Mon Sep 17 00:00:00 2001 From: Javi Merino Date: Mon, 31 Jan 2011 23:11:36 +0000 Subject: sched, docs: Update schedstats documentation to version 15 Version 15 of schedstats was introduced in: 67aa0f767af4: sched: remove unused fields from struct rq and removed three unused counters in sched_yield(). Update the documentation. Signed-off-by: Javi Merino Cc: henrix@sapo.pt Cc: rdunlap@xenotime.net Cc: Peter Zijlstra Cc: Mike Galbraith LKML-Reference: <1296515496-8229-1-git-send-email-cibervicho@gmail.com> Signed-off-by: Ingo Molnar --- Documentation/scheduler/sched-stats.txt | 33 +++++++++++++++------------------ 1 file changed, 15 insertions(+), 18 deletions(-) (limited to 'Documentation') diff --git a/Documentation/scheduler/sched-stats.txt b/Documentation/scheduler/sched-stats.txt index 01e69404ee5e..1cd5d51bc761 100644 --- a/Documentation/scheduler/sched-stats.txt +++ b/Documentation/scheduler/sched-stats.txt @@ -1,3 +1,7 @@ +Version 15 of schedstats dropped counters for some sched_yield: +yld_exp_empty, yld_act_empty and yld_both_empty. Otherwise, it is +identical to version 14. + Version 14 of schedstats includes support for sched_domains, which hit the mainline kernel in 2.6.20 although it is identical to the stats from version 12 which was in the kernel from 2.6.13-2.6.19 (version 13 never saw a kernel @@ -28,32 +32,25 @@ to write their own scripts, the fields are described here. CPU statistics -------------- -cpu 1 2 3 4 5 6 7 8 9 10 11 12 - -NOTE: In the sched_yield() statistics, the active queue is considered empty - if it has only one process in it, since obviously the process calling - sched_yield() is that process. +cpu 1 2 3 4 5 6 7 8 9 -First four fields are sched_yield() statistics: - 1) # of times both the active and the expired queue were empty - 2) # of times just the active queue was empty - 3) # of times just the expired queue was empty - 4) # of times sched_yield() was called +First field is a sched_yield() statistic: + 1) # of times sched_yield() was called Next three are schedule() statistics: - 5) # of times we switched to the expired queue and reused it - 6) # of times schedule() was called - 7) # of times schedule() left the processor idle + 2) # of times we switched to the expired queue and reused it + 3) # of times schedule() was called + 4) # of times schedule() left the processor idle Next two are try_to_wake_up() statistics: - 8) # of times try_to_wake_up() was called - 9) # of times try_to_wake_up() was called to wake up the local cpu + 5) # of times try_to_wake_up() was called + 6) # of times try_to_wake_up() was called to wake up the local cpu Next three are statistics describing scheduling latency: - 10) sum of all time spent running by tasks on this processor (in jiffies) - 11) sum of all time spent waiting to run by tasks on this processor (in + 7) sum of all time spent running by tasks on this processor (in jiffies) + 8) sum of all time spent waiting to run by tasks on this processor (in jiffies) - 12) # of timeslices run on this cpu + 9) # of timeslices run on this cpu Domain statistics -- cgit v1.2.3 From 34a6ef381d402c6547aa9abb8a74b0262ae8255f Mon Sep 17 00:00:00 2001 From: Peter Chubb Date: Wed, 2 Feb 2011 15:39:58 -0800 Subject: tcp_ecn is an integer not a boolean There was some confusion at LCA as to why the sysctl tcp_ecn took one of three values when it was documented as a Boolean. This patch fixes the documentation. Signed-off-by: Peter Chubb Signed-off-by: David S. Miller --- Documentation/networking/ip-sysctl.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt index d99940dcfc44..ac3b4a726a1a 100644 --- a/Documentation/networking/ip-sysctl.txt +++ b/Documentation/networking/ip-sysctl.txt @@ -187,7 +187,7 @@ tcp_cookie_size - INTEGER tcp_dsack - BOOLEAN Allows TCP to send "duplicate" SACKs. -tcp_ecn - BOOLEAN +tcp_ecn - INTEGER Enable Explicit Congestion Notification (ECN) in TCP. ECN is only used when both ends of the TCP flow support it. It is useful to avoid losses due to congestion (when the bottleneck router supports -- cgit v1.2.3 From 552b372ba9db85751e7db2998f07cca2e51f5865 Mon Sep 17 00:00:00 2001 From: Michal Hocko Date: Tue, 1 Feb 2011 15:52:31 -0800 Subject: memsw: deprecate noswapaccount kernel parameter and schedule it for removal noswapaccount couldn't be used to control memsw for both on/off cases so we have added swapaccount[=0|1] parameter. This way we can turn the feature in two ways noswapaccount resp. swapaccount=0. We have kept the original noswapaccount but I think we should remove it after some time as it just makes more command line parameters without any advantages and also the code to handle parameters is uglier if we want both parameters. Signed-off-by: Michal Hocko Requested-by: KAMEZAWA Hiroyuki Acked-by: KAMEZAWA Hiroyuki Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- Documentation/feature-removal-schedule.txt | 16 ++++++++++++++++ mm/memcontrol.c | 1 + 2 files changed, 17 insertions(+) (limited to 'Documentation') diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt index b959659c5df4..b3f35e5f9c95 100644 --- a/Documentation/feature-removal-schedule.txt +++ b/Documentation/feature-removal-schedule.txt @@ -603,3 +603,19 @@ Why: The adm9240, w83792d and w83793 hardware monitoring drivers have Who: Jean Delvare ---------------------------- + +What: noswapaccount kernel command line parameter +When: 2.6.40 +Why: The original implementation of memsw feature enabled by + CONFIG_CGROUP_MEM_RES_CTLR_SWAP could be disabled by the noswapaccount + kernel parameter (introduced in 2.6.29-rc1). Later on, this decision + turned out to be not ideal because we cannot have the feature compiled + in and disabled by default and let only interested to enable it + (e.g. general distribution kernels might need it). Therefore we have + added swapaccount[=0|1] parameter (introduced in 2.6.37) which provides + the both possibilities. If we remove noswapaccount we will have + less command line parameters with the same functionality and we + can also cleanup the parameter handling a bit (). +Who: Michal Hocko + +---------------------------- diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 44f9f9c89f0c..79abb1fd39d2 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -5034,6 +5034,7 @@ __setup("swapaccount", enable_swap_account); static int __init disable_swap_account(char *s) { + printk_once("noswapaccount is deprecated and will be removed in 2.6.40. Use swapaccount=0 instead\n"); enable_swap_account("=0"); return 1; } -- cgit v1.2.3