>-----Original Message----- >From: Yazen Ghannam <yazen.ghannam@xxxxxxx> >Sent: 11 June 2025 20:02 >To: Shiju Jose <shiju.jose@xxxxxxxxxx> >Cc: rafael@xxxxxxxxxx; linux-edac@xxxxxxxxxxxxxxx; linux-acpi@xxxxxxxxxxxxxxx; >linux-doc@xxxxxxxxxxxxxxx; bp@xxxxxxxxx; tony.luck@xxxxxxxxx; >lenb@xxxxxxxxxx; leo.duran@xxxxxxx; mchehab@xxxxxxxxxx; Jonathan >Cameron <jonathan.cameron@xxxxxxxxxx>; linux-mm@xxxxxxxxx; Linuxarm ><linuxarm@xxxxxxxxxx>; rientjes@xxxxxxxxxx; jiaqiyan@xxxxxxxxxx; >Jon.Grimm@xxxxxxx; dave.hansen@xxxxxxxxxxxxxxx; >naoya.horiguchi@xxxxxxx; james.morse@xxxxxxx; jthoughton@xxxxxxxxxx; >somasundaram.a@xxxxxxx; erdemaktas@xxxxxxxxxx; pgonda@xxxxxxxxxx; >duenwen@xxxxxxxxxx; gthelen@xxxxxxxxxx; >wschwartz@xxxxxxxxxxxxxxxxxxx; dferguson@xxxxxxxxxxxxxxxxxxx; >wbs@xxxxxxxxxxxxxxxxxxxxxx; nifan.cxl@xxxxxxxxx; tanxiaofei ><tanxiaofei@xxxxxxxxxx>; Zengtao (B) <prime.zeng@xxxxxxxxxxxxx>; Roberto >Sassu <roberto.sassu@xxxxxxxxxx>; kangkang.shen@xxxxxxxxxxxxx; >wanghuiqiang <wanghuiqiang@xxxxxxxxxx> >Subject: Re: [PATCH v8 1/2] ACPI:RAS2: Add ACPI RAS2 driver > >On Tue, Jun 10, 2025 at 01:37:23PM +0100, shiju.jose@xxxxxxxxxx wrote: >> From: Shiju Jose <shiju.jose@xxxxxxxxxx> >> >> Add support for ACPI RAS2 feature table (RAS2) defined in the ACPI 6.5 >> Specification, section 5.2.21. >> Driver defines RAS2 Init, which extracts the RAS2 table and driver >> adds auxiliary device for each memory feature which binds to the >> RAS2 memory driver. >> >> Driver uses PCC mailbox to communicate with the ACPI HW and the driver >> adds OSPM interfaces to send RAS2 commands. >> >> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx> >> Co-developed-by: A Somasundaram <somasundaram.a@xxxxxxx> >> Signed-off-by: A Somasundaram <somasundaram.a@xxxxxxx> >> Co-developed-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx> >> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx> >> Tested-by: Daniel Ferguson <danielf@xxxxxxxxxxxxxxxxxxxxxx> >> Signed-off-by: Shiju Jose <shiju.jose@xxxxxxxxxx> >> --- >> drivers/acpi/Kconfig | 11 ++ >> drivers/acpi/Makefile | 1 + >> drivers/acpi/bus.c | 3 + >> drivers/acpi/ras2.c | 450 ++++++++++++++++++++++++++++++++++++++++++ >> include/acpi/ras2.h | 54 +++++ >> 5 files changed, 519 insertions(+) >> create mode 100644 drivers/acpi/ras2.c create mode 100644 >> include/acpi/ras2.h >> >> diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig index >> 7bc40c2735ac..846b27b49024 100644 >> --- a/drivers/acpi/Kconfig >> +++ b/drivers/acpi/Kconfig >> @@ -293,6 +293,17 @@ config ACPI_CPPC_LIB >> If your platform does not support CPPC in firmware, >> leave this option disabled. >> >> +config ACPI_RAS2 >> + bool "ACPI RAS2 driver" >> + select AUXILIARY_BUS >> + select MAILBOX >> + select PCC >> + help >> + The driver adds support for ACPI RAS2 feature table(extracts RAS2 >> + table from OS system table) and OSPM interfaces to send RAS2 >> + commands via PCC mailbox subspace. Driver adds platform device for >> + the RAS2 memory features which binds to the RAS2 memory driver. >> + >> config ACPI_PROCESSOR >> tristate "Processor" >> depends on X86 || ARM64 || LOONGARCH || RISCV diff --git >> a/drivers/acpi/Makefile b/drivers/acpi/Makefile index >> d1b0affb844f..abfec6745724 100644 >> --- a/drivers/acpi/Makefile >> +++ b/drivers/acpi/Makefile >> @@ -105,6 +105,7 @@ obj-$(CONFIG_ACPI_EC_DEBUGFS) += ec_sys.o >> obj-$(CONFIG_ACPI_BGRT) += bgrt.o >> obj-$(CONFIG_ACPI_CPPC_LIB) += cppc_acpi.o >> obj-$(CONFIG_ACPI_SPCR_TABLE) += spcr.o >> +obj-$(CONFIG_ACPI_RAS2) += ras2.o >> obj-$(CONFIG_ACPI_DEBUGGER_USER) += acpi_dbg.o >> obj-$(CONFIG_ACPI_PPTT) += pptt.o >> obj-$(CONFIG_ACPI_PFRUT) += pfr_update.o pfr_telemetry.o >> diff --git a/drivers/acpi/bus.c b/drivers/acpi/bus.c index >> c2ab2783303f..5b4f04a4611c 100644 >> --- a/drivers/acpi/bus.c >> +++ b/drivers/acpi/bus.c >> @@ -31,6 +31,7 @@ >> #include <acpi/apei.h> >> #include <linux/suspend.h> >> #include <linux/prmt.h> >> +#include <acpi/ras2.h> >> >> #include "internal.h" >> >> @@ -1474,6 +1475,8 @@ static int __init acpi_init(void) >> acpi_debugger_init(); >> acpi_setup_sb_notify_handler(); >> acpi_viot_init(); >> + acpi_ras2_init(); >> + >> return 0; >> } >> >> diff --git a/drivers/acpi/ras2.c b/drivers/acpi/ras2.c new file mode >> 100644 index 000000000000..97ca3feac2fe >> --- /dev/null >> +++ b/drivers/acpi/ras2.c >> @@ -0,0 +1,450 @@ >> +// SPDX-License-Identifier: GPL-2.0-only >> +/* >> + * Implementation of ACPI RAS2 driver. >> + * >> + * Copyright (c) 2024-2025 HiSilicon Limited. >> + * >> + * Support for RAS2 - ACPI 6.5 Specification, section 5.2.21 >> + * >> + * Driver contains ACPI RAS2 init, which extracts the ACPI RAS2 table >> +and >> + * get the PCC channel subspace for communicating with the ACPI >> +compliant >> + * HW platform which supports ACPI RAS2. Driver adds auxiliary >> +devices >> + * for each RAS2 memory feature which binds to the memory ACPI RAS2 >driver. >> + */ >> + >> +#define pr_fmt(fmt) "ACPI RAS2: " fmt >> + >> +#include <linux/delay.h> >> +#include <linux/export.h> >> +#include <linux/iopoll.h> >> +#include <linux/ktime.h> >> +#include <acpi/pcc.h> >> +#include <acpi/ras2.h> >> + >> +static struct acpi_table_ras2 *__read_mostly ras2_tab; >> + >> +/** >> + * struct ras2_pcc_subspace - Data structure for PCC communication >> + * @mbox_client: struct mbox_client object >> + * @pcc_chan: Pointer to struct pcc_mbox_chan >> + * @comm_addr: Pointer to RAS2 PCC shared memory region >> + * @elem: List for registered RAS2 PCC channel subspaces >> + * @pcc_lock: PCC lock to provide mutually exclusive access >> + * to PCC channel subspace >> + * @deadline_us: Poll PCC status register timeout in micro secs >> + * for PCC command complete >> + * @pcc_mpar: Maximum Periodic Access Rate(MPAR) for PCC >channel >> + * @pcc_mrtt: Minimum Request Turnaround Time(MRTT) in >micro secs >> + * OS must wait after completion of a PCC command >before >> + * issue next command >> + * @last_cmd_cmpl_time: completion time of last PCC command >> + * @last_mpar_reset: Time of last MPAR count reset. >> + * @mpar_count: MPAR count >> + * @pcc_id: Identifier of the RAS2 platform communication channel >> + * @pcc_chnl_acq: Status of PCC channel acquired. >> + * @kref: kref object >> + */ >> +struct ras2_pcc_subspace { >> + struct mbox_client mbox_client; >> + struct pcc_mbox_chan *pcc_chan; >> + struct acpi_ras2_shmem __iomem *comm_addr; >> + struct list_head elem; >> + struct mutex pcc_lock; >> + unsigned int deadline_us; >> + unsigned int pcc_mpar; >> + unsigned int pcc_mrtt; >> + ktime_t last_cmd_cmpl_time; >> + ktime_t last_mpar_reset; >> + int mpar_count; >> + int pcc_id; >> + bool pcc_chnl_acq; >> + struct kref kref; >> +}; >> + >> +/* >> + * Arbitrary retries for PCC commands because the remote processor >> + * could be much slower to reply. Keeping it high enough to cover >> + * emulators where the processors run painfully slow. >> + */ >> +#define RAS2_NUM_RETRIES 600ULL >> + >> +#define RAS2_FEAT_TYPE_MEMORY 0x00 >> + >> +/* Static variables for the RAS2 PCC subspaces */ static >> +DEFINE_MUTEX(ras2_pcc_list_lock); >> +static LIST_HEAD(ras2_pcc_subspaces); >> + >> +static int ras2_report_cap_error(u32 cap_status) { >> + switch (cap_status) { >> + case ACPI_RAS2_NOT_VALID: >> + case ACPI_RAS2_NOT_SUPPORTED: >> + return -EPERM; >> + case ACPI_RAS2_BUSY: >> + return -EBUSY; >> + case ACPI_RAS2_FAILED: >> + case ACPI_RAS2_ABORTED: >> + case ACPI_RAS2_INVALID_DATA: >> + return -EINVAL; >> + default: /* 0 or other, Success */ >> + return 0; >> + } >> +} >> + >> +static int ras2_check_pcc_chan(struct ras2_pcc_subspace >> +*pcc_subspace) { >> + struct acpi_ras2_shmem __iomem *gen_comm_base = pcc_subspace- >>comm_addr; >> + u32 cap_status; >> + u16 status; >> + u32 rc; > >The u32 variables can be on the same line. Hi Yazen, Thanks for the comments. Will do. > >> + >> + /* >> + * As per ACPI spec, the PCC space will be initialized by >> + * platform and should have set the command completion bit when >> + * PCC can be used by OSPM. >> + * >> + * Poll PCC status register every 3us(delay_us) for maximum of >> + * deadline_us(timeout_us) until PCC command complete bit is >set(cond). >> + */ >> + rc = readw_relaxed_poll_timeout(&gen_comm_base->status, status, >> + status & >PCC_STATUS_CMD_COMPLETE, 3, >> + pcc_subspace->deadline_us); >> + if (rc) { >> + pr_warn("PCC check channel failed for : %d rc=%d\n", > >The first "%d" will look like just a random number to the user. So the message >should include something like "pcc_id %d". > >Also, the only failure here (I think) would be a timeout. So the message could >state that explicitly. Will do. > >> + pcc_subspace->pcc_id, rc); >> + return rc; >> + } >> + >> + if (status & PCC_STATUS_ERROR) { > >The spec says >"If set, an error occurred executing the last command." > >So this sounds like a feedback for the last command rather than indicating a >communication channel error. > >Maybe the PCC driver can save the "last command", and then this error >condition can have a message to the user. Will check and change. > >> + status &= ~PCC_STATUS_ERROR; >> + writew_relaxed(status, &gen_comm_base->status); >> + return -EIO; >> + } >> + >> + if (!(status & PCC_STATUS_CMD_COMPLETE)) >> + return -EIO; > >This condition is impossible. Either this bit is set or we timed out earlier. Ok. Will remove. > >> + >> + cap_status = readw_relaxed(&gen_comm_base->set_caps_status); >> + writew_relaxed(0x0, &gen_comm_base->set_caps_status); >> + return ras2_report_cap_error(cap_status); >> +} >> + >> +/** >> + * ras2_send_pcc_cmd() - Send RAS2 command via PCC channel >> + * @ras2_ctx: pointer to the RAS2 context structure >> + * @cmd: command to send >> + * >> + * Returns: 0 on success, an error otherwise */ int >> +ras2_send_pcc_cmd(struct ras2_mem_ctx *ras2_ctx, u16 cmd) { >> + struct ras2_pcc_subspace *pcc_subspace = ras2_ctx->pcc_subspace; >> + struct acpi_ras2_shmem __iomem *gen_comm_base = pcc_subspace- >>comm_addr; >> + struct mbox_chan *pcc_channel; >> + unsigned int time_delta; >> + int rc; >> + >> + rc = ras2_check_pcc_chan(pcc_subspace); >> + if (rc < 0) >> + return rc; >> + >> + pcc_channel = pcc_subspace->pcc_chan->mchan; >> + >> + /* >> + * Handle the Minimum Request Turnaround Time(MRTT). >> + * "The minimum amount of time that OSPM must wait after the >completion >> + * of a command before issuing the next command, in microseconds." >> + */ >> + if (pcc_subspace->pcc_mrtt) { >> + time_delta = ktime_us_delta(ktime_get(), >> + pcc_subspace- >>last_cmd_cmpl_time); >> + if (pcc_subspace->pcc_mrtt > time_delta) >> + udelay(pcc_subspace->pcc_mrtt - time_delta); >> + } >> + >> + /* >> + * Handle the non-zero Maximum Periodic Access Rate(MPAR). >> + * "The maximum number of periodic requests that the subspace >channel can >> + * support, reported in commands per minute. 0 indicates no limitation." >> + * >> + * This parameter should be ideally zero or large enough so that it can >> + * handle maximum number of requests that all the cores in the system >can >> + * collectively generate. If it is not, we will follow the spec and just >> + * not send the request to the platform after hitting the MPAR limit in >> + * any 60s window. >> + */ >> + if (pcc_subspace->pcc_mpar) { >> + if (pcc_subspace->mpar_count == 0) { >> + time_delta = ktime_ms_delta(ktime_get(), >> + pcc_subspace- >>last_mpar_reset); >> + if (time_delta < 60 * MSEC_PER_SEC) { >> + dev_dbg(ras2_ctx->dev, >> + "PCC cmd not sent due to MPAR limit"); >> + return -EIO; >> + } >> + pcc_subspace->last_mpar_reset = ktime_get(); >> + pcc_subspace->mpar_count = pcc_subspace- >>pcc_mpar; >> + } >> + pcc_subspace->mpar_count--; >> + } >> + >> + /* Write to the shared comm region */ >> + writew_relaxed(cmd, &gen_comm_base->command); >> + >> + /* Flip CMD COMPLETE bit */ >> + writew_relaxed(0, &gen_comm_base->status); >> + >> + /* Ring doorbell */ >> + rc = mbox_send_message(pcc_channel, &cmd); >> + if (rc < 0) { >> + dev_warn(ras2_ctx->dev, >> + "Err sending PCC mbox message. cmd:%d, rc:%d\n", >cmd, rc); >> + return rc; >> + } >> + >> + /* >> + * If Minimum Request Turnaround Time is non-zero, we need >> + * to record the completion time of both READ and WRITE >> + * command for proper handling of MRTT, so we need to check >> + * for pcc_mrtt in addition to CMD_READ. >> + */ >> + if (cmd == PCC_CMD_EXEC_RAS2 || pcc_subspace->pcc_mrtt) { >> + rc = ras2_check_pcc_chan(pcc_subspace); >> + if (pcc_subspace->pcc_mrtt) >> + pcc_subspace->last_cmd_cmpl_time = ktime_get(); >> + } >> + >> + if (pcc_channel->mbox->txdone_irq) >> + mbox_chan_txdone(pcc_channel, rc); >> + else >> + mbox_client_txdone(pcc_channel, rc); >> + >> + return rc >= 0 ? 0 : rc; >> +} >> +EXPORT_SYMBOL_GPL(ras2_send_pcc_cmd); >> + >> +static void ras2_list_pcc_release(struct kref *kref) { >> + struct ras2_pcc_subspace *pcc_subspace = >> + container_of(kref, struct ras2_pcc_subspace, kref); >> + >> + guard(mutex)(&ras2_pcc_list_lock); >> + list_del(&pcc_subspace->elem); >> + pcc_mbox_free_channel(pcc_subspace->pcc_chan); >> + kfree(pcc_subspace); >> +} >> + >> +static void ras2_pcc_get(struct ras2_pcc_subspace *pcc_subspace) { >> + kref_get(&pcc_subspace->kref); >> +} >> + >> +static void ras2_pcc_put(struct ras2_pcc_subspace *pcc_subspace) { >> + kref_put(&pcc_subspace->kref, &ras2_list_pcc_release); } >> + >> +static struct ras2_pcc_subspace *ras2_get_pcc_subspace(int pcc_id) { >> + struct ras2_pcc_subspace *pcc_subspace; >> + >> + mutex_lock(&ras2_pcc_list_lock); > >Use "guard" here? Sure. > >> + list_for_each_entry(pcc_subspace, &ras2_pcc_subspaces, elem) { >> + if (pcc_subspace->pcc_id != pcc_id) >> + continue; >> + ras2_pcc_get(pcc_subspace); >> + mutex_unlock(&ras2_pcc_list_lock); >> + return pcc_subspace; >> + } >> + mutex_unlock(&ras2_pcc_list_lock); >> + >> + return NULL; >> +} >> + >> +static int ras2_register_pcc_channel(struct ras2_mem_ctx *ras2_ctx, >> +int pcc_id) { >> + struct ras2_pcc_subspace *pcc_subspace; >> + struct pcc_mbox_chan *pcc_chan; >> + struct mbox_client *mbox_cl; >> + >> + if (pcc_id < 0) >> + return -EINVAL; >> + >> + pcc_subspace = ras2_get_pcc_subspace(pcc_id); >> + if (pcc_subspace) { >> + ras2_ctx->pcc_subspace = pcc_subspace; >> + ras2_ctx->comm_addr = pcc_subspace->comm_addr; >> + ras2_ctx->dev = pcc_subspace->pcc_chan->mchan- >>mbox->dev; >> + ras2_ctx->pcc_lock = &pcc_subspace->pcc_lock; >> + return 0; >> + } >> + >> + pcc_subspace = kzalloc(sizeof(*pcc_subspace), GFP_KERNEL); >> + if (!pcc_subspace) >> + return -ENOMEM; >> + >> + mbox_cl = &pcc_subspace->mbox_client; >> + mbox_cl->knows_txdone = true; >> + >> + pcc_chan = pcc_mbox_request_channel(mbox_cl, pcc_id); >> + if (IS_ERR(pcc_chan)) { >> + kfree(pcc_subspace); >> + return PTR_ERR(pcc_chan); >> + } >> + >> + pcc_subspace->pcc_id = pcc_id; >> + pcc_subspace->pcc_chan = pcc_chan; >> + pcc_subspace->comm_addr = acpi_os_ioremap(pcc_chan- >>shmem_base_addr, >> + pcc_chan- >>shmem_size); >> + pcc_subspace->deadline_us = RAS2_NUM_RETRIES * pcc_chan- >>latency; >> + pcc_subspace->pcc_mrtt = pcc_chan- >>min_turnaround_time; >> + pcc_subspace->pcc_mpar = pcc_chan->max_access_rate; >> + pcc_subspace->mbox_client.knows_txdone = true; >> + pcc_subspace->pcc_chnl_acq = true; >> + >> + kref_init(&pcc_subspace->kref); >> + >> + mutex_lock(&ras2_pcc_list_lock); >> + list_add(&pcc_subspace->elem, &ras2_pcc_subspaces); >> + ras2_pcc_get(pcc_subspace); >> + mutex_unlock(&ras2_pcc_list_lock); >> + >> + ras2_ctx->pcc_subspace = pcc_subspace; >> + ras2_ctx->comm_addr = pcc_subspace->comm_addr; >> + ras2_ctx->dev = pcc_chan->mchan->mbox->dev; >> + >> + mutex_init(&pcc_subspace->pcc_lock); >> + ras2_ctx->pcc_lock = &pcc_subspace->pcc_lock; >> + >> + return 0; >> +} >> + >> +static DEFINE_IDA(ras2_ida); >> +static void ras2_release(struct device *device) { >> + struct auxiliary_device *auxdev = container_of(device, struct >auxiliary_device, dev); >> + struct ras2_mem_ctx *ras2_ctx = container_of(auxdev, struct >> +ras2_mem_ctx, adev); >> + >> + ida_free(&ras2_ida, auxdev->id); >> + ras2_pcc_put(ras2_ctx->pcc_subspace); >> + kfree(ras2_ctx); >> +} >> + >> +static int ras2_add_aux_device(char *name, int channel) { >> + struct ras2_mem_ctx *ras2_ctx; >> + int id, rc; >> + >> + ras2_ctx = kzalloc(sizeof(*ras2_ctx), GFP_KERNEL); >> + if (!ras2_ctx) >> + return -ENOMEM; >> + >> + rc = ras2_register_pcc_channel(ras2_ctx, channel); >> + if (rc < 0) { >> + pr_debug("failed to register pcc channel rc=%d\n", rc); >> + goto ctx_free; >> + } >> + >> + id = ida_alloc(&ras2_ida, GFP_KERNEL); >> + if (id < 0) { >> + rc = id; >> + goto pcc_free; >> + } >> + >> + ras2_ctx->id = id; >> + ras2_ctx->adev.id = id; >> + ras2_ctx->adev.name = RAS2_MEM_DEV_ID_NAME; >> + ras2_ctx->adev.dev.release = ras2_release; >> + ras2_ctx->adev.dev.parent = ras2_ctx->dev; >> + >> + rc = auxiliary_device_init(&ras2_ctx->adev); >> + if (rc) >> + goto ida_free; >> + >> + rc = auxiliary_device_add(&ras2_ctx->adev); >> + if (rc) { >> + auxiliary_device_uninit(&ras2_ctx->adev); >> + return rc; >> + } >> + >> + return 0; >> + >> +ida_free: >> + ida_free(&ras2_ida, id); >> +pcc_free: >> + ras2_pcc_put(ras2_ctx->pcc_subspace); >> +ctx_free: >> + kfree(ras2_ctx); >> + >> + return rc; >> +} > >Just curious, would the new "faux" bus work for this use case? I tried with "faux" interface, following are the observations. 1. RAS2 scrub control interface do present and work under /sys/bus/faux/devices/... as expected. However it does not appear under /sys/bus/edac/devices/... as required. It seems that the reason for this is that, with the faux device, the RAS2 faux probe function and thus edac_dev_register() is called during the early boot stage, compared to the probe of the present auxiliary device based solution. I think EDAC is not fully initialized at that boot stage? This issue is solved with late_initcall(acpi_ras2_init) instead of calling acpi_ras2_init() from acpi_init() (drivers/acpi/bus.c). 2. The faux_device_create() function takes a struct faux_device_ops *, which requires the faux_device_ops for RAS2 to be defined within this driver (drivers/acpi/ras2.c). Therefore, registration with EDAC for RAS features may necessitate moving to a *ras2_faux_probe() function here. This would also require moving the struct edac_scrub_ops ras2_scrub_ops and the corresponding callback functions from drivers/ras/acpi_ras2.c to this location. Alternatively, we could add and export a ras2_faux_init() function in drivers/ras/acpi_ras2.c and call it from *ras2_faux_probe(). Thus not sure "faux" is a suitable interface for RAS2? > Thanks, Shiju