[晓理紫]每日论文分享(有中文摘要,源码或项目地址)--强化学习、模仿学习、机器人

专属领域论文订阅

VX 关注{晓理紫|},每日更新论文,如感兴趣,请转发给有需要的同学,谢谢支持

如果你感觉对你有所帮助,请关注我,每日准时为你推送最新论文。

为了答谢各位网友的支持,从今日起免费为300名读者提供订阅主题论文服务,只需VX关注公号并回复{邮箱+论文主题}(如:[email protected] + chatgpt@large language model @LLM),主题必须是同一个领域,最多三个关键词。解释权归博主所有


分类:

  • 大语言模型LLM
  • 视觉模型VLM
  • 扩散模型
  • 视觉语言导航VLN
  • 强化学习 RL
  • 模仿学习 IL
  • 机器人
  • 开放词汇,检测分割

== RL ==

标题: M2CURL: Sample-Efficient Multimodal Reinforcement Learning via Self-Supervised Representation Learning for Robotic Manipulation

作者: Fotios Lygerakis, Vedant Dave, Elmar Rueckert

PubTime: 2024-01-30

Downlink: http://arxiv.org/abs/2401.17032v1

Project: https://sites.google.com/view/M2CURL/home|

中文摘要: 多模态强化学习(RL)最关键的环节之一是不同观测模态的有效融合。从这些模态中获得鲁棒而准确的表征,是提升RL算法鲁棒性和样本效率的关键。然而,在视觉-触觉数据的RL场景中学习表征面临重大挑战,主要源于数据的高维度,以及将视觉和触觉输入与动态环境和任务目标关联起来的复杂性。为应对这些挑战,我们提出了多模态对比无监督强化学习(M2CURL)。该方法采用一种新的多模态自监督学习技术,学习高效的表征并帮助RL算法更快收敛。我们的方法与具体RL算法无关,因此可以与任何现有RL算法集成。我们在Tactile Gym 2模拟器上评估了M2CURL,结果表明它显著提高了不同操作任务的学习效率:与不使用该表征学习方法的标准RL算法相比,收敛更快,每个回合的累积奖励也更高。

摘要: One of the most critical aspects of multimodal Reinforcement Learning (RL) is the effective integration of different observation modalities. Having robust and accurate representations derived from these modalities is key to enhancing the robustness and sample efficiency of RL algorithms. However, learning representations in RL settings for visuotactile data poses significant challenges, particularly due to the high dimensionality of the data and the complexity involved in correlating visual and tactile inputs with the dynamic environment and task objectives. To address these challenges, we propose Multimodal Contrastive Unsupervised Reinforcement Learning (M2CURL). Our approach employs a novel multimodal self-supervised learning technique that learns efficient representations and contributes to faster convergence of RL algorithms. Our method is agnostic to the RL algorithm, thus enabling its integration with any available RL algorithm. We evaluate M2CURL on the Tactile Gym 2 simulator and we show that it significantly enhances the learning efficiency in different manipulation tasks. This is evidenced by faster convergence rates and higher cumulative rewards per episode, compared to standard RL algorithms without our representation learning approach.
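下面给出一个极简的示意代码(基于PyTorch的假设性实现,并非论文官方代码),说明摘要中"跨模态对比自监督表征学习"的基本思路:对同一时刻的视觉与触觉观测分别编码,再用InfoNCE式的对比损失把配对样本拉近、非配对样本推远;得到的编码器可以接入任意RL算法。其中网络结构、维度等均为假设。

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """把单一模态的观测映射到低维表征(结构为假设,仅作示意)。"""
    def __init__(self, in_dim: int, feat_dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(),
            nn.Linear(256, feat_dim),
        )

    def forward(self, x):
        return F.normalize(self.net(x), dim=-1)  # 单位球面上的表征

def multimodal_infonce(z_vis, z_tac, temperature: float = 0.1):
    """InfoNCE式跨模态对比损失:同一时间步的(视觉, 触觉)为正样本对。"""
    logits = z_vis @ z_tac.t() / temperature          # [B, B] 相似度矩阵
    labels = torch.arange(z_vis.size(0), device=z_vis.device)
    # 对称地在"视觉->触觉"和"触觉->视觉"两个方向上计算交叉熵
    return 0.5 * (F.cross_entropy(logits, labels) +
                  F.cross_entropy(logits.t(), labels))

# 用法示意: 将该表征损失与任意RL算法的损失相加即可联合训练
vis_enc, tac_enc = Encoder(in_dim=512), Encoder(in_dim=64)
obs_vis, obs_tac = torch.randn(32, 512), torch.randn(32, 64)
loss_repr = multimodal_infonce(vis_enc(obs_vis), tac_enc(obs_tac))
```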


标题: CORE: Towards Scalable and Efficient Causal Discovery with Reinforcement Learning

作者: Andreas W. M. Sauter, Nicolò Botteghi, Erman Acar

PubTime: 2024-01-30

Downlink: http://arxiv.org/abs/2401.16974v1

GitHub: https://github.com/sa-and/CORE|

中文摘要: 因果发现是从数据中推断因果结构的挑战性任务。珀尔因果层次(PCH)告诉我们,仅靠被动观测不足以区分相关与因果,受此启发,近来出现了将干预纳入机器学习研究的趋势。强化学习为这种主动学习方式提供了便利的框架。本文提出CORE,一种基于深度强化学习的因果发现与干预规划方法。CORE学习从数据中顺序重建因果图,同时学习执行信息量大的干预。实验结果表明,CORE能够泛化到未见过的图,并高效地揭示因果结构。此外,CORE可以扩展到多达10个变量的更大图,并在结构估计精度和样本效率上优于现有方法。所有相关代码和补充材料见 https://github.com/sa-and/CORE 。

摘要: Causal discovery is the challenging task of inferring causal structure from data. Motivated by Pearl’s Causal Hierarchy (PCH), which tells us that passive observations alone are not enough to distinguish correlation from causation, there has been a recent push to incorporate interventions into machine learning research. Reinforcement learning provides a convenient framework for such an active approach to learning. This paper presents CORE, a deep reinforcement learning-based approach for causal discovery and intervention planning. CORE learns to sequentially reconstruct causal graphs from data while learning to perform informative interventions. Our results demonstrate that CORE generalizes to unseen graphs and efficiently uncovers causal structures. Furthermore, CORE scales to larger graphs with up to 10 variables and outperforms existing approaches in structure estimation accuracy and sample efficiency. All relevant code and supplementary material can be found at https://github.com/sa-and/CORE


标题: Reinforcement Unlearning

作者: Dayong Ye, Tianqing Zhu, Congcong Zhu

PubTime: 2024-01-30

Downlink: http://arxiv.org/abs/2312.15910v2

Project: https://anonymous.4open.science/r/Reinforcement-Unlearning-D347|

摘要: Machine unlearning refers to the process of mitigating the influence of specific training data on machine learning models based on removal requests from data owners. However, one important area that has been largely overlooked in the research of unlearning is reinforcement learning. Reinforcement learning focuses on training an agent to make optimal decisions within an environment to maximize its cumulative rewards. During the training, the agent tends to memorize the features of the environment, which raises a significant concern about privacy. As per data protection regulations, the owner of the environment holds the right to revoke access to the agent's training data, thus necessitating the development of a novel and pressing research field, known as reinforcement unlearning. Reinforcement unlearning focuses on revoking entire environments rather than individual data samples. This unique characteristic presents three distinct challenges: 1) how to propose unlearning schemes for environments; 2) how to avoid degrading the agent's performance in remaining environments; and 3) how to evaluate the effectiveness of unlearning. To tackle these challenges, we propose two reinforcement unlearning methods. The first method is based on decremental reinforcement learning, which aims to erase the agent's previously acquired knowledge gradually. The second method leverages environment poisoning attacks, which encourage the agent to learn new, albeit incorrect, knowledge to remove the unlearning environment. Particularly, to tackle the third challenge, we introduce the concept of "environment inference attack" to evaluate the unlearning outcomes. The source code is available at https://anonymous.4open.science/r/Reinforcement-Unlearning-D347.


标题: Optimal service resource management strategy for IoT-based health information system considering value co-creation of users

作者: Ji Fang, Vincent CS Lee, Haiyan Wang

PubTime: 2024-01-30

Downlink: http://arxiv.org/abs/2204.02521v2

Project: https://doi.org/10.1108/IMDS-03-2023-0173|

中文摘要: 本文研究最优服务资源管理策略,这是健康信息服务在提升服务绩效、优化服务资源利用并提供交互式服务方面面临的持续挑战。我们考虑健康信息服务中的价值共创模型,开发了一种自适应的最优服务资源管理策略,重点在于与用户的协作和互动。深度强化学习算法被嵌入基于物联网(IoT)的健康信息服务系统(I-HISS)中,根据用户参与行为控制服务提供与服务适配,从而分配服务资源。仿真实验评估了所提算法在用户对健康信息服务做出不同反应情形下的效果。

摘要: This paper explores optimal service resource management strategy, a continuous challenge for health information service to enhance service performance, optimise service resource utilisation and deliver interactive health information service. An adaptive optimal service resource management strategy was developed considering a value co-creation model in health information service with a focus on collaborative and interactive with users. The deep reinforcement learning algorithm was embedded in the Internet of Things (IoT)-based health information service system (I-HISS) to allocate service resources by controlling service provision and service adaptation based on user engagement behaviour. The simulation experiments were conducted to evaluate the significance of the proposed algorithm under different user reactions to the health information service.


标题: A comparison of RL-based and PID controllers for 6-DOF swimming robots: hybrid underwater object tracking

作者: Faraz Lotfi, Khalil Virji, Nicholas Dudek

PubTime: 2024-01-29

Downlink: http://arxiv.org/abs/2401.16618v1

GitHub: https://github.com/FARAZLOTFI/underwater-object-tracking|

中文摘要: 本文探索并评估了用集中式深度Q网络(DQN)控制器替代6自由度(6DOF)游泳机器人中常用的PID控制器,并以水下目标跟踪为具体案例说明这一转变。DQN具有数据效率高、可离策略学习等优点,同时比其他强化学习方法更易实现。鉴于我们的机器人缺乏动力学模型,我们提出用RL智能体控制这一多输入多输出(MIMO)系统,集中式控制器可以提供比多个独立PID更鲁棒的控制。我们的方案是先用经典控制器进行安全探索,再逐步过渡到由DQN完全控制机器人。我们将水下跟踪任务分为视觉和控制两个模块:视觉部分沿用成熟的基于视觉的跟踪方法,控制部分引入集中式DQN控制器。视觉模块只向控制模块传递边界框数据,因此既能适应各种目标物体,也便于更换视觉系统;同时,低维数据也使控制器能够以较低代价进行在线学习。在基于Unity的模拟器中进行的实验验证了集中式RL智能体相对于独立PID控制器的有效性,展示了该框架用于训练水下RL智能体的适用性,以及相对传统控制方法的性能提升。真实与仿真实现的代码见 https://github.com/FARAZLOTFI/underwater-object-tracking 。

摘要: In this paper, we present an exploration and assessment of employing a centralized deep Q-network (DQN) controller as a substitute for the prevalent use of PID controllers in the context of 6DOF swimming robots. Our primary focus centers on illustrating this transition with the specific case of underwater object tracking. DQN offers advantages such as data efficiency and off-policy learning, while remaining simpler to implement than other reinforcement learning methods. Given the absence of a dynamic model for our robot, we propose an RL agent to control this multi-input-multi-output (MIMO) system, where a centralized controller may offer more robust control than distinct PIDs. Our approach involves initially using classical controllers for safe exploration, then gradually shifting to DQN to take full control of the robot. We divide the underwater tracking task into vision and control modules. We use established methods for vision-based tracking and introduce a centralized DQN controller. By transmitting bounding box data from the vision module to the control module, we enable adaptation to various objects and effortless vision system replacement. Furthermore, dealing with low-dimensional data facilitates cost-effective online learning for the controller. Our experiments, conducted within a Unity-based simulator, validate the effectiveness of a centralized RL agent over separated PID controllers, showcasing the applicability of our framework for training the underwater RL agent and improved performance compared to traditional control methods. The code for both real and simulation implementations is at https://github.com/FARAZLOTFI/underwater-object-tracking.
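下面是一个示意性的控制回路骨架(假设性实现,并非论文代码),体现摘要中的两点:控制器输入只是视觉模块给出的低维边界框信息;训练初期由经典PID控制器负责安全探索,随后逐步把控制权交给DQN。其中切换概率的具体日程、图像尺寸等都是为说明而假设的。

```python
import random
import numpy as np

def bbox_error(bbox, img_w=640, img_h=480):
    """把边界框转成低维跟踪误差: 目标中心相对图像中心的偏移与归一化面积。"""
    x, y, w, h = bbox
    cx, cy = x + w / 2, y + h / 2
    return np.array([(cx - img_w / 2) / img_w,
                     (cy - img_h / 2) / img_h,
                     w * h / (img_w * img_h)])

def control_step(step, total_steps, state, pid_action_fn, dqn_action_fn):
    """训练初期使用PID安全探索, 按假设的线性日程逐步切换到DQN。"""
    p_dqn = min(1.0, step / (0.5 * total_steps))   # DQN接管概率线性增长
    if random.random() < p_dqn:
        return dqn_action_fn(state)    # 由Q网络argmax得到的离散推进指令
    return pid_action_fn(state)        # 经典PID给出的指令
```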


标题: Zero-Shot Reinforcement Learning via Function Encoders

作者: Tyler Ingebrand, Amy Zhang, Ufuk Topcu

PubTime: 2024-01-30

Downlink: http://arxiv.org/abs/2401.17173v1

摘要: Although reinforcement learning (RL) can solve many challenging sequential decision making problems, achieving zero-shot transfer across related tasks remains a challenge. The difficulty lies in finding a good representation for the current task so that the agent understands how it relates to previously seen tasks. To achieve zero-shot transfer, we introduce the function encoder, a representation learning algorithm which represents a function as a weighted combination of learned, non-linear basis functions. By using a function encoder to represent the reward function or the transition function, the agent has information on how the current task relates to previously seen tasks via a coherent vector representation. Thus, the agent is able to achieve transfer between related tasks at run time with no additional training. We demonstrate state-of-the-art data efficiency, asymptotic performance, and training stability in three RL fields by augmenting basic RL algorithms with a function encoder task representation.
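摘要中"函数编码器"的思想可以用下面的极简示意来理解(NumPy实现,仅为说明系数如何得到,并非论文实现;这里用随机特征代替学习得到的基函数):给定若干(状态-动作, 奖励)样本,把当前任务的奖励函数投影到一组非线性基函数上,得到的系数向量就是该任务的表征,可直接作为策略的条件输入,实现无需再训练的迁移。

```python
import numpy as np

rng = np.random.default_rng(0)

def basis(x, W, b):
    """k个非线性基函数 phi_1..phi_k(此处用随机特征代替学习得到的基)。"""
    return np.tanh(x @ W + b)          # [N, k]

def encode_task(xs, ys, W, b, reg=1e-3):
    """用最小二乘求系数 c, 使 sum_i c_i * phi_i(x) ≈ f(x); c 即任务表征。"""
    Phi = basis(xs, W, b)                              # [N, k]
    A = Phi.T @ Phi + reg * np.eye(Phi.shape[1])
    return np.linalg.solve(A, Phi.T @ ys)              # [k]

# 用法示意: 以奖励函数为例, 不同任务得到不同的系数向量
k, d = 16, 4
W, b = rng.normal(size=(d, k)), rng.normal(size=k)
xs = rng.normal(size=(200, d))                 # 状态-动作样本
ys = np.sin(xs[:, 0]) + 0.5 * xs[:, 1]         # 某个任务的奖励标签(举例)
task_repr = encode_task(xs, ys, W, b)          # 零样本迁移时的任务向量
```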


== Imitation Learning ==

标题: Dual RL: Unification and New Methods for Reinforcement and Imitation Learning

作者: Harshit Sikchi, Qinqing Zheng, Amy Zhang

PubTime: 2024-01-26

Downlink: http://arxiv.org/abs/2302.08560v3

Project: https://hari-sikchi.github.io/dual-rl|

中文摘要: 强化学习(RL)的目标是找到最大化期望累积回报的策略。已有工作表明,该目标可以表示为线性约束下关于状态-动作访问分布的优化问题。该形式的对偶问题(我们称之为对偶RL)没有约束,更易于优化。在这项工作中,我们首先将多个最先进的离线RL和离线模仿学习(IL)算法统一为具有共享结构的对偶RL方法的实例。这一统一使我们能够找出已有方法缺陷的根本原因。对于离线IL,我们的分析表明,已有方法依赖一个限制性的覆盖假设,这在实践中极大地限制了其性能。为了解决这一局限,我们提出了一种新的无判别器方法ReCOIL,它可以从任意离策略数据中学习模仿,达到接近专家的性能。对于离线RL,我们的分析将近期的离线RL方法XQL纳入对偶框架,并进一步提出新方法f-DVL,为Gumbel回归损失提供了替代选择,从而修复了XQL已知的训练不稳定问题。我们在大量模拟机器人运动与操作任务上验证了ReCOIL和f-DVL这两种方法分别在IL和RL中的性能提升。项目代码和细节见 https://hari-sikchi.github.io/dual-rl 。

摘要: The goal of reinforcement learning (RL) is to find a policy that maximizes the expected cumulative return. It has been shown that this objective can be represented as an optimization problem of state-action visitation distribution under linear constraints. The dual problem of this formulation, which we refer to as dual RL, is unconstrained and easier to optimize. In this work, we first cast several state-of-the-art offline RL and offline imitation learning (IL) algorithms as instances of dual RL approaches with shared structures. Such unification allows us to identify the root cause of the shortcomings of prior methods. For offline IL, our analysis shows that prior methods are based on a restrictive coverage assumption that greatly limits their performance in practice. To fix this limitation, we propose a new discriminator-free method ReCOIL that learns to imitate from arbitrary off-policy data to obtain near-expert performance. For offline RL, our analysis frames a recent offline RL method XQL in the dual framework, and we further propose a new method f-DVL that provides alternative choices to the Gumbel regression loss that fixes the known training instability issue of XQL. The performance improvements by both of our proposed methods, ReCOIL and f-DVL, in IL and RL are validated on an extensive suite of simulated robot locomotion and manipulation tasks. Project code and details can be found at this https://hari-sikchi.github.io/dual-rl.


标题: Multi-task robot data for dual-arm fine manipulation

作者: Heecheol Kim, Yoshiyuki Ohmura, Yasuo Kuniyoshi

PubTime: 2024-01-26

Downlink: http://arxiv.org/abs/2401.07603v2

Project: https://sites.google.com/view/multi-task-fine|

中文摘要: 在机器人操作领域,深度模仿学习被认为是获取操作技能的一种有前景的方法。此外,从多样化的机器人数据集中学习被视为实现通用性与适应性的可行途径。在此类研究中,机器人通过学习多种任务获得了跨多种物体的泛化能力。然而,现有多任务机器人数据集主要集中在精度要求相对较低的单臂任务上,没有涵盖机器人在真实世界中需要执行的细粒度物体操作。本文提出了一个多样化物体操作数据集,包含双臂任务和/或需要精细操作的任务。为此,我们构建了含22.4万个(224k)回合(150小时、1,104条语言指令)的数据集,涵盖移动碗、打开铅笔盒、剥香蕉等双臂精细任务,且数据已公开。此外,数据集还包含视觉注意力信号、双动作标签(将动作分解为鲁棒的伸取轨迹和与物体的精确交互),以及用于实现鲁棒而精确物体操作的语言指令。我们将该数据集用于我们的双动作与注意力模型(Dual-Action and Attention, DAA),该模型专为细粒度双臂操作任务设计,并对协变量偏移具有鲁棒性。该模型在真实机器人操作任务中进行了超过7千次试验,证明了其精细操作能力。数据集见 https://sites.google.com/view/multi-task-fine 。

摘要: In the field of robotic manipulation, deep imitation learning is recognized as a promising approach for acquiring manipulation skills. Additionally, learning from diverse robot datasets is considered a viable method to achieve versatility and adaptability. In such research, by learning various tasks, robots achieved generality across multiple objects. However, such multi-task robot datasets have mainly focused on single-arm tasks that are relatively imprecise, not addressing the fine-grained object manipulation that robots are expected to perform in the real world. This paper introduces a dataset of diverse object manipulations that includes dual-arm tasks and/or tasks requiring fine manipulation. To this end, we have generated dataset with 224k episodes (150 hours, 1,104 language instructions) which includes dual-arm fine tasks such as bowl-moving, pencil-case opening or banana-peeling, and this data is publicly available. Additionally, this dataset includes visual attention signals as well as dual-action labels, a signal that separates actions into a robust reaching trajectory and precise interaction with objects, and language instructions to achieve robust and precise object manipulation. We applied the dataset to our Dual-Action and Attention (DAA), a model designed for fine-grained dual arm manipulation tasks and robust against covariate shifts. The model was tested with over 7k total trials in real robot manipulation tasks, demonstrating its capability in fine manipulation. The dataset is available at https://sites.google.com/view/multi-task-fine.


标题: Interpretable Imitation Learning with Dynamic Causal Relations

作者: Tianxiang Zhao, Wenchao Yu, Suhang Wang

PubTime: 2024-01-30

Downlink: http://arxiv.org/abs/2310.00489v4

中文摘要: 模仿学习通过模仿专家演示来学习智能体策略,在医疗方案制定和自动驾驶等许多应用中展现出良好效果。然而,解释智能体学到的控制策略仍然困难,主要来自两个方面:1)模仿学习中的智能体通常由深度神经网络实现,属于黑盒模型,缺乏可解释性;2)智能体决策背后的潜在因果机制可能沿轨迹变化,而非在所有时间步上保持不变。为了提高透明度并为神经智能体提供更好的可解释性,我们建议以有向无环因果图的形式呈现其捕获的知识,其中节点是动作和状态变量,边表示预测背后的因果关系。此外,我们将因果发现过程设计为依赖于状态,使其能够对潜在因果图的动态变化建模。具体而言,我们从格兰杰因果关系的角度进行因果发现,并提出一个可自我解释的模仿学习框架{\method}。该框架由动态因果发现模块、因果编码模块和预测模块三部分组成,并以端到端方式训练。模型训练完成后,我们可以得到其决策背后状态与动作变量之间的因果关系,从而揭示其学到的策略。在合成与真实数据集上的实验结果表明,所提出的{\method}能够学习动态因果图以理解模仿学习的决策过程,同时保持较高的预测精度。

摘要: Imitation learning, which learns agent policy by mimicking expert demonstration, has shown promising results in many applications such as medical treatment regimes and self-driving vehicles. However, it remains a difficult task to interpret control policies learned by the agent. Difficulties mainly come from two aspects: 1) agents in imitation learning are usually implemented as deep neural networks, which are black-box models and lack interpretability; 2) the latent causal mechanism behind agents’ decisions may vary along the trajectory, rather than staying static throughout time steps. To increase transparency and offer better interpretability of the neural agent, we propose to expose its captured knowledge in the form of a directed acyclic causal graph, with nodes being action and state variables and edges denoting the causal relations behind predictions. Furthermore, we design this causal discovery process to be state-dependent, enabling it to model the dynamics in latent causal graphs. Concretely, we conduct causal discovery from the perspective of Granger causality and propose a self-explainable imitation learning framework, {\method}. The proposed framework is composed of three parts: a dynamic causal discovery module, a causality encoding module, and a prediction module, and is trained in an end-to-end manner. After the model is learned, we can obtain causal relations among states and action variables behind its decisions, exposing policies learned by it. Experimental results on both synthetic and real-world datasets demonstrate the effectiveness of the proposed {\method} in learning the dynamic causal graphs for understanding the decision-making of imitation learning meanwhile maintaining high prediction accuracy.


标题: Extrinsicaly Rewarded Soft Q Imitation Learning with Discriminator

作者: Ryoma Furuyama, Daiki Kuyoshi, Satoshi Yamane

PubTime: 2024-01-30

Downlink: http://arxiv.org/abs/2401.16772v1

中文摘要: 在奖励难以设计或奖励稀疏的环境中,模仿学习常与强化学习结合使用,但仅凭少量专家数据和采样数据,很难在未知状态下做到很好的模仿。行为克隆等监督学习方法不需要采样数据,但通常存在分布偏移问题。基于强化学习的方法,如逆强化学习和生成对抗模仿学习(GAIL),仅凭少量专家数据即可学习,但往往需要与环境交互。软Q模仿学习(SQIL)解决了这些问题,并表明将行为克隆与采用恒定奖励的软Q学习相结合可以高效学习。为使该算法对分布偏移更加鲁棒,我们在此方法上增加了一个基于对抗逆强化学习的奖励函数,当智能体在与演示相似的状态下执行动作时给予奖励,从而得到更高效、更鲁棒的算法。我们将该算法称为判别器软Q模仿学习(DSQIL),并在MuJoCo环境中对其进行了评估。

摘要: Imitation learning is often used in addition to reinforcement learning in environments where reward design is difficult or where the reward is sparse, but it is difficult to be able to imitate well in unknown states from a small amount of expert data and sampling data. Supervised learning methods such as Behavioral Cloning do not require sampling data, but usually suffer from distribution shift. The methods based on reinforcement learning, such as inverse reinforcement learning and Generative Adversarial imitation learning (GAIL), can learn from only a few expert data. However, they often need to interact with the environment. Soft Q imitation learning (SQIL) addressed the problems, and it was shown that it could learn efficiently by combining Behavioral Cloning and soft Q-learning with constant rewards. In order to make this algorithm more robust to distribution shift, we propose more efficient and robust algorithm by adding to this method a reward function based on adversarial inverse reinforcement learning that rewards the agent for performing actions in status similar to the demo. We call this algorithm Discriminator Soft Q Imitation Learning (DSQIL). We evaluated it on MuJoCo environments.
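摘要所述思路可以粗略示意如下(假设性的PyTorch片段,非论文实现):SQIL给示范转移固定奖励1、给智能体自采样转移奖励0;在此基础上再叠加一个AIRL风格的判别器奖励(log D − log(1−D)),鼓励智能体在与示范相似的状态下采取相似动作。网络结构与权重 w 均为假设的超参数。

```python
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    """D(s, a) 接近1表示该(状态, 动作)像示范数据(结构仅作示意)。"""
    def __init__(self, obs_dim, act_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, 128), nn.ReLU(),
            nn.Linear(128, 1), nn.Sigmoid(),
        )

    def forward(self, s, a):
        return self.net(torch.cat([s, a], dim=-1)).squeeze(-1)

def dsqil_reward(disc, s, a, is_demo, w=0.5, eps=1e-6):
    """SQIL的常数奖励 + AIRL风格的判别器奖励(权重w为假设的超参数)。"""
    base = is_demo.float()                                # 示范=1, 采样=0
    d = disc(s, a).clamp(eps, 1 - eps)
    shaped = torch.log(d) - torch.log(1 - d)              # log D - log(1-D)
    return base + w * shaped
```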


标题: ILBiT: Imitation Learning for Robot Using Position and Torque Information based on Bilateral Control with Transformer

作者: Masato Kobayashi, Thanpimon Buamanee, Yuki Uranishi

PubTime: 2024-01-30

Downlink: http://arxiv.org/abs/2401.16653v1

摘要: Autonomous manipulation in robot arms is a complex and evolving field of study in robotics. This paper introduces an innovative approach to this challenge by focusing on imitation learning (IL). Unlike traditional imitation methods, our approach uses IL based on bilateral control, allowing for more precise and adaptable robot movements. The conventional IL based on bilateral control method have relied on Long Short-Term Memory (LSTM) networks. In this paper, we present the IL for robot using position and torque information based on Bilateral control with Transformer (ILBiT). This proposed method employs the Transformer model, known for its robust performance in handling diverse datasets and its capability to surpass LSTM’s limitations, especially in tasks requiring detailed force adjustments. A standout feature of ILBiT is its high-frequency operation at 100 Hz, which significantly improves the system’s adaptability and response to varying environments and objects of different hardness levels. The effectiveness of the Transformer-based ILBiT method can be seen through comprehensive real-world experiments.


标题: Inverse Reinforcement Learning without Reinforcement Learning

作者: Gokul Swamy, Sanjiban Choudhury, J. Andrew Bagnell

PubTime: 2024-01-29

Downlink: http://arxiv.org/abs/2303.14623v4

摘要: Inverse Reinforcement Learning (IRL) is a powerful set of techniques for imitation learning that aims to learn a reward function that rationalizes expert demonstrations. Unfortunately, traditional IRL methods suffer from a computational weakness: they require repeatedly solving a hard reinforcement learning (RL) problem as a subroutine. This is counter-intuitive from the viewpoint of reductions: we have reduced the easier problem of imitation learning to repeatedly solving the harder problem of RL. Another thread of work has proved that access to the side-information of the distribution of states where a strong policy spends time can dramatically reduce the sample and computational complexities of solving an RL problem. In this work, we demonstrate for the first time a more informed imitation learning reduction where we utilize the state distribution of the expert to alleviate the global exploration component of the RL subroutine, providing an exponential speedup in theory. In practice, we find that we are able to significantly speed up the prior art on continuous control tasks.


== Robotic Agent ==

标题: Generative Expressive Robot Behaviors using Large Language Models

作者: Karthik Mahadevan, Jonathan Chien, Noah Brown

PubTime: 2024-01-30

Downlink: http://arxiv.org/abs/2401.14673v2

Project: https://generative-expressive-motion.github.io/|

中文摘要: 人们通过富有表现力的行为来有效地交流并与他人协调行动,例如向注视自己的人点头致意,或在拥挤的走廊里说"借过"以从人群中通过。我们希望机器人在人机交互中也能表现出这类富有表现力的行为。已有工作提出的基于规则的方法难以扩展到新的交流模态或社交情境,而数据驱动的方法则需要为机器人所处的每种社交情境准备专门的数据集。我们提出利用大语言模型(LLMs)提供的丰富社会语境,及其根据指令或用户偏好生成动作的能力,来生成可适应、可组合并能彼此叠加的表现力机器人动作。我们的方法使用少样本思维链提示,将人类语言指令翻译成基于机器人现有及已学技能的参数化控制代码。通过用户研究和仿真实验,我们证明该方法生成的行为被用户认为是胜任且易于理解的。补充材料见 https://generative-expressive-motion.github.io/ 。

摘要: People employ expressive behaviors to effectively communicate and coordinate their actions with others, such as nodding to acknowledge a person glancing at them or saying “excuse me” to pass people in a busy corridor. We would like robots to also demonstrate expressive behaviors in human-robot interaction. Prior work proposes rule-based methods that struggle to scale to new communication modalities or social situations, while data-driven methods require specialized datasets for each social situation the robot is used in. We propose to leverage the rich social context available from large language models (LLMs) and their ability to generate motion based on instructions or user preferences, to generate expressive robot motion that is adaptable and composable, building upon each other. Our approach utilizes few-shot chain-of-thought prompting to translate human language instructions into parametrized control code using the robot’s available and learned skills. Through user studies and simulation experiments, we demonstrate that our approach produces behaviors that users found to be competent and easy to understand. Supplementary material can be found at https://generative-expressive-motion.github.io/.
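摘要中"用少样本思维链提示把人类指令翻译成参数化控制代码"的流程,大致可以如下示意。注意:`call_llm` 是假设的占位接口而非真实API,技能名(look_at、nod等)与示例也只是举例,实际技能集由机器人自身决定。

```python
FEW_SHOT = """你是一个机器人动作生成器, 只能调用以下技能:
look_at(x, y), nod(times, speed), say(text), move_base(dx, dy, speed)

指令: 有人从旁边走过, 向他点头示意
思考: 需要先看向对方, 再缓慢点头一次表示认可
代码: look_at(0.8, 1.6); nod(times=1, speed="slow")
"""

def expressive_behavior(instruction: str, call_llm) -> str:
    """把自然语言指令翻译成参数化技能调用序列(call_llm为假设的LLM接口)。"""
    prompt = FEW_SHOT + f"\n指令: {instruction}\n思考:"
    completion = call_llm(prompt)          # 期望返回"思考 + 代码"两段
    # 只保留"代码:"之后的部分, 交给机器人技能执行器
    return completion.split("代码:")[-1].strip()
```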


标题: Towards Unified Interactive Visual Grounding in The Wild

作者: Jie Xu, Hanbo Zhang, Qingyi Si

PubTime: 2024-01-30

Downlink: http://arxiv.org/abs/2401.16699v1

GitHub: https://github.com/jxu124/TiO|

摘要: Interactive visual grounding in Human-Robot Interaction (HRI) is challenging yet practical due to the inevitable ambiguity in natural languages. It requires robots to disambiguate the user input by active information gathering. Previous approaches often rely on predefined templates to ask disambiguation questions, resulting in performance reduction in realistic interactive scenarios. In this paper, we propose TiO, an end-to-end system for interactive visual grounding in human-robot interaction. Benefiting from a unified formulation of visual dialogue and grounding, our method can be trained on a joint of extensive public data, and show superior generality to diversified and challenging open-world scenarios. In the experiments, we validate TiO on GuessWhat?! and InViG benchmarks, setting new state-of-the-art performance by a clear margin. Moreover, we conduct HRI experiments on the carefully selected 150 challenging scenes as well as real-robot platforms. Results show that our method demonstrates superior generality to diversified visual and language inputs with a high success rate. Codes and demos are available at https://github.com/jxu124/TiO.


标题: MResT: Multi-Resolution Sensing for Real-Time Control with Vision-Language Models

作者: Saumya Saxena, Mohit Sharma, Oliver Kroemer

PubTime: 2024-01-25

Downlink: http://arxiv.org/abs/2401.14502v1

Project: http://tinyurl.com/multi-res-realtime-control|

中文摘要: 利用不同空间和时间分辨率的传感模态可以提高机器人操作任务的性能。多空间分辨率传感提供了在不同空间尺度下捕获的层级信息,同时支持粗略和精细的运动;多时间分辨率传感则使智能体具备高反应性和实时控制能力。在这项工作中,我们提出了MResT(Multi-Resolution Transformer)框架,用于学习可泛化的、语言条件化的多任务策略,它利用不同空间和时间分辨率的传感,并使用容量不同的网络,高效地对精确任务和反应性任务进行实时控制。我们利用现成的预训练视觉语言模型处理低频全局特征,并用小型非预训练模型适应高频局部反馈。通过在粗略、精确和动态操作任务三个领域的大量实验,我们表明该方法相比近期多任务基线有显著提升(平均2倍)。此外,该方法对目标物体的视觉和几何变化以及不同交互力具有良好的泛化能力。

摘要: Leveraging sensing modalities across diverse spatial and temporal resolutions can improve performance of robotic manipulation tasks. Multi-spatial resolution sensing provides hierarchical information captured at different spatial scales and enables both coarse and precise motions. Simultaneously multi-temporal resolution sensing enables the agent to exhibit high reactivity and real-time control. In this work, we propose a framework, MResT (Multi-Resolution Transformer), for learning generalizable language-conditioned multi-task policies that utilize sensing at different spatial and temporal resolutions using networks of varying capacities to effectively perform real time control of precise and reactive tasks. We leverage off-the-shelf pretrained vision-language models to operate on low-frequency global features along with small non-pretrained models to adapt to high frequency local feedback. Through extensive experiments in 3 domains (coarse, precise and dynamic manipulation tasks), we show that our approach significantly improves (2X on average) over recent multi-task baselines. Further, our approach generalizes well to visual and geometric variations in target objects and to varying interaction forces.
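摘要中的"多时间分辨率"控制可以用下面的骨架来示意(假设性实现,非论文代码):大容量的预训练视觉-语言模型以低频更新全局特征并缓存,小网络在每个控制周期用高频局部反馈与缓存特征一起输出动作。各维度、频率比例均为假设。

```python
import torch
import torch.nn as nn

class FastPolicy(nn.Module):
    """小容量策略头: 融合缓存的全局特征与高频局部反馈(结构为示意)。"""
    def __init__(self, global_dim=512, local_dim=32, act_dim=7):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(global_dim + local_dim, 128), nn.ReLU(),
            nn.Linear(128, act_dim),
        )

    def forward(self, g, l):
        return self.net(torch.cat([g, l], dim=-1))

def control_loop(vlm_encode, policy, get_image, get_local_feedback,
                 steps=1000, slow_every=20):
    """每 slow_every 步调用一次大模型刷新全局特征, 其余步只跑小网络。"""
    global_feat = None
    for t in range(steps):
        if t % slow_every == 0:
            with torch.no_grad():
                global_feat = vlm_encode(get_image())        # 低频、高延迟
        action = policy(global_feat, get_local_feedback())   # 高频、低延迟
        yield action
```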


标题: Memory-centered and Affordance-based Framework for Mobile Manipulation

作者: Christoph Pohl, Fabian Reister, Fabian Peller-Konrad

PubTime: 2024-01-30

Downlink: http://arxiv.org/abs/2401.16899v1

中文摘要: 在以人为中心的环境中执行多样的移动操作动作,需要高度复杂的软件框架:既要足够灵活以处理特殊用例,又要足够通用以适用于不同的机器人系统、任务和环境。本文提出了一个全面的、以记忆为中心、基于可供性(affordance)的模块化框架,支持单手和多手抓取及移动操作,适用于仿人机器人等自由度很高的复杂机器人系统。通过用可供性(即机器人与环境的交互可能性)来表示移动操作动作,我们统一了任意环境中针对已知和未知物体的自主操作流程。该框架被集成并嵌入ARMAR仿人机器人家族以记忆为中心的认知架构中。这样,机器人不仅可以与物理世界交互,还能使用关于物体的常识,并学习和调整操作策略。我们在真实世界实验中展示了该框架的适用性,包括在两个不同的仿人机器人平台上抓取已知和未知物体、放置物体,以及半自主的双手抓取。

摘要: Performing versatile mobile manipulation actions in human-centered environments requires highly sophisticated software frameworks that are flexible enough to handle special use cases, yet general enough to be applicable across different robotic systems, tasks, and environments. This paper presents a comprehensive memory-centered, affordance-based, and modular uni- and multi-manual grasping and mobile manipulation framework, applicable to complex robot systems with a high number of degrees of freedom such as humanoid robots. By representing mobile manipulation actions through affordances, i.e., interaction possibilities of the robot with its environment, we unify the autonomous manipulation process for known and unknown objects in arbitrary environments. Our framework is integrated and embedded into the memory-centric cognitive architecture of the ARMAR humanoid robot family. This way, robots can not only interact with the physical world but also use common knowledge about objects, and learn and adapt manipulation strategies. We demonstrate the applicability of the framework in real-world experiments, including grasping known and unknown objects, object placing, and semi-autonomous bimanual grasping of objects on two different humanoid robot platforms.


标题: Excitation Trajectory Optimization for Dynamic Parameter Identification Using Virtual Constraints in Hands-on Robotic System

作者: Huanyu Tian, Martin Huber, Christopher E. Mower

PubTime: 2024-01-29

Downlink: http://arxiv.org/abs/2401.16566v1

中文摘要: 本文提出了一种新的、计算效率更高的机器人激励轨迹优化方法,用于动力学参数辨识,并强调自碰撞规避。这解决了为可装配多种工具、与人协同操作的机械臂获取高质量训练数据的系统辨识难题,这类场景在工业以及临床和研究环境中都很常见。该方法利用统一机器人描述格式(URDF)实现了递归牛顿-欧拉算法(RNEA)的符号化Python实现,从而能通过对真实机器人数据的回归分析动态估计惯性等参数。与未考虑自碰撞和工具标定的最新已报道结果相比,所评估并实现的激励轨迹达到了同等水平的指标。此外,我们在外科手术场景中进行了物理人机交互(pHRI)导纳控制实验以评估所得到的逆动力学模型,NASA TLX问卷结果显示工作负荷降低了30.1%。

摘要: This paper proposes a novel, more computationally efficient method for optimizing robot excitation trajectories for dynamic parameter identification, emphasizing self-collision avoidance. This addresses the system identification challenges for getting high-quality training data associated with co-manipulated robotic arms that can be equipped with a variety of tools, a common scenario in industrial but also clinical and research contexts. Utilizing the Unified Robotics Description Format (URDF) to implement a symbolic Python implementation of the Recursive Newton-Euler Algorithm (RNEA), the approach aids in dynamically estimating parameters such as inertia using regression analyses on data from real robots. The excitation trajectory was evaluated and achieved on par criteria when compared to state-of-the-art reported results which didn’t consider self-collision and tool calibrations. Furthermore, physical Human-Robot Interaction (pHRI) admittance control experiments were conducted in a surgical context to evaluate the derived inverse dynamics model showing a 30.1% workload reduction by the NASA TLX questionnaire.
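摘要提到"用回归分析从真实机器人数据中估计惯性等动力学参数",其依据是机械臂动力学对参数向量线性:τ = Y(q, q̇, q̈)·θ。下面是一个与论文实现无关的最小二乘示意,其中回归矩阵 Y 假设已由(例如RNEA符号推导得到的)`regressor` 计算好:

```python
import numpy as np

def identify_dynamic_params(Y_list, tau_list, reg=1e-8):
    """堆叠各采样时刻的回归矩阵与关节力矩, 最小二乘求参数向量theta。

    Y_list:   每个时刻的回归矩阵 Y(q, dq, ddq), 形状 [n_joints, n_params]
    tau_list: 对应时刻测得的关节力矩, 形状 [n_joints]
    """
    Y = np.vstack(Y_list)                      # [T*n_joints, n_params]
    tau = np.concatenate(tau_list)             # [T*n_joints]
    A = Y.T @ Y + reg * np.eye(Y.shape[1])     # 轻微正则, 防止病态
    theta = np.linalg.solve(A, Y.T @ tau)
    residual = np.linalg.norm(Y @ theta - tau) / np.sqrt(len(tau))
    return theta, residual
```

激励轨迹优化通常以堆叠回归矩阵的条件数等指标为目标,使上述最小二乘的数值性态更好;按摘要所述,本文在此基础上额外考虑了自碰撞规避约束。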


标题: Security Considerations in AI-Robotics: A Survey of Current Methods, Challenges, and Opportunities

作者: Subash Neupane, Shaswata Mitra, Ivan A. Fernandez

PubTime: 2024-01-26

Downlink: http://arxiv.org/abs/2310.08565v3

中文摘要: 机器人和人工智能(AI)从一开始就密不可分地交织在一起。今天,人工智能机器人系统已经成为我们日常生活中不可或缺的一部分,从机器人吸尘器到半自动汽车。这些系统建立在三个基本架构元素之上:感知、导航和规划以及控制。然而,尽管人工智能机器人系统的集成提高了我们的生活质量,但它也带来了一个严重的问题——这些系统容易受到安全攻击。构成人工智能机器人系统的物理组件、算法和数据可能会被恶意行为者利用,潜在地导致可怕的后果。出于解决人工智能机器人系统中安全问题的需要,本文提出了一个跨三个维度的全面调查和分类:攻击面,伦理和法律问题,以及人机交互(HRI)安全。我们的目标是为用户、开发者和其他利益相关者提供对这些领域的整体理解,以增强整体人工智能机器人系统的安全性。我们从调查潜在的攻击面开始,并提供缓解防御策略。然后,我们深入研究伦理问题,如依赖性和心理影响,以及关于这些系统问责制的法律问题。此外,还讨论了HRI等新兴趋势,考虑了隐私、完整性、安全性、可信度和可解释性问题。最后,我们提出了我们对这一充满活力和前景的领域的未来研究方向的展望。

摘要: Robotics and Artificial Intelligence (AI) have been inextricably intertwined since their inception. Today, AI-Robotics systems have become an integral part of our daily lives, from robotic vacuum cleaners to semi-autonomous cars. These systems are built upon three fundamental architectural elements: perception, navigation and planning, and control. However, while the integration of AI-Robotics systems has enhanced the quality our lives, it has also presented a serious problem - these systems are vulnerable to security attacks. The physical components, algorithms, and data that make up AI-Robotics systems can be exploited by malicious actors, potentially leading to dire consequences. Motivated by the need to address the security concerns in AI-Robotics systems, this paper presents a comprehensive survey and taxonomy across three dimensions: attack surfaces, ethical and legal concerns, and Human-Robot Interaction (HRI) security. Our goal is to provide users, developers and other stakeholders with a holistic understanding of these areas to enhance the overall AI-Robotics system security. We begin by surveying potential attack surfaces and provide mitigating defensive strategies. We then delve into ethical issues, such as dependency and psychological impact, as well as the legal concerns regarding accountability for these systems. Besides, emerging trends such as HRI are discussed, considering privacy, integrity, safety, trustworthiness, and explainability concerns. Finally, we present our vision for future research directions in this dynamic and promising field.


== Object Detection ==

标题: YOLO-World: Real-Time Open-Vocabulary Object Detection

作者: Tianheng Cheng, Lin Song, Yixiao Ge

PubTime: 2024-01-30

Downlink: http://arxiv.org/abs/2401.17270v1

GitHub: https://github.com/AILab-CVC/YOLO-World|

中文摘要: You Only Look Once(YOLO)系列检测器已经成为高效实用的工具。然而,它们依赖预定义并经过训练的目标类别,这限制了其在开放场景中的适用性。针对这一限制,我们提出了YOLO-World,一种创新方法,通过视觉-语言建模和大规模数据集预训练,为YOLO赋予开放词汇检测能力。具体来说,我们提出了新的可重参数化视觉-语言路径聚合网络(RepVL-PAN)和区域-文本对比损失,以促进视觉信息与语言信息之间的交互。我们的方法能够以零样本方式高效地检测大范围物体。在具有挑战性的LVIS数据集上,YOLO-World在V100上以52.0 FPS达到35.4 AP,在精度和速度上都超过了许多最先进的方法。此外,经过微调的YOLO-World在目标检测和开放词汇实例分割等多个下游任务上也取得了出色的性能。

摘要: The You Only Look Once (YOLO) series of detectors have established themselves as efficient and practical tools. However, their reliance on predefined and trained object categories limits their applicability in open scenarios. Addressing this limitation, we introduce YOLO-World, an innovative approach that enhances YOLO with open-vocabulary detection capabilities through vision-language modeling and pre-training on large-scale datasets. Specifically, we propose a new Re-parameterizable Vision-Language Path Aggregation Network (RepVL-PAN) and region-text contrastive loss to facilitate the interaction between visual and linguistic information. Our method excels in detecting a wide range of objects in a zero-shot manner with high efficiency. On the challenging LVIS dataset, YOLO-World achieves 35.4 AP with 52.0 FPS on V100, which outperforms many state-of-the-art methods in terms of both accuracy and speed. Furthermore, the fine-tuned YOLO-World achieves remarkable performance on several downstream tasks, including object detection and open-vocabulary instance segmentation.
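摘要中的"区域-文本对比损失"可以粗略示意如下(假设性的PyTorch片段,非官方实现):把每个候选区域的视觉嵌入与各类别名的文本嵌入做内积得到相似度,再以区域的真实类别作为监督做交叉熵式的对比学习。维度、温度系数等均为假设。

```python
import torch
import torch.nn.functional as F

def region_text_contrastive_loss(region_emb, text_emb, target_cls, tau=0.05):
    """region_emb: [N, D] 区域嵌入; text_emb: [C, D] 类别文本嵌入;
    target_cls: [N] 每个区域对应的类别下标(示意: 仅含已匹配的正样本区域)。"""
    region_emb = F.normalize(region_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = region_emb @ text_emb.t() / tau      # [N, C] 区域-文本相似度
    return F.cross_entropy(logits, target_cls)

# 用法示意
r = torch.randn(8, 256)          # 8个候选区域的嵌入
t = torch.randn(20, 256)         # 20个词表类别的文本嵌入(开放词汇时可在线替换)
y = torch.randint(0, 20, (8,))
loss = region_text_contrastive_loss(r, t, y)
```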


标题: H-SynEx: Using synthetic images and ultra-high resolution ex vivo MRI for hypothalamus subregion segmentation

作者: Livia Rodrigues, Martina Bocchetta, Oula Puonti

PubTime: 2024-01-30

Downlink: http://arxiv.org/abs/2401.17104v1

GitHub: https://github.com/liviamarodrigues/hsynex|

中文摘要: 目的:开发一种由超高分辨率离体磁共振图像(MRI)提供信息的下丘脑亚区自动分割方法,该方法无需重新训练即可泛化到不同MRI序列和分辨率。材料与方法:我们用合成图像训练了深度学习方法H-SynEx,这些合成图像来自由超高分辨率离体MRI扫描构建的标签图,与1毫米各向同性的活体图像相比,这使得更细粒度的手工分割成为可能。我们使用来自六个数据集、六种MRI序列的1535幅活体图像对这项回顾性研究进行了验证。定量评估采用Dice系数(DC)和平均Hausdorff距离(AVD)。统计分析使用曲线下面积(AUC)和Wilcoxon秩和检验,比较了对照组、阿尔茨海默病(AD)和行为变异型额颞叶痴呆(bvFTD)受试者的下丘脑亚区体积。结果:H-SynEx可以在多种MRI序列中分割下丘脑,包括层间距较大(5mm)的FLAIR序列。使用T1w图像上的下丘脑体积区分对照组与AD、bvFTD患者,AUC分别为0.74和0.79;此外,在FLAIR扫描上比较对照组与非患者的体积变化时,AUC=0.66。结论:结果表明,H-SynEx成功利用了超高分辨率扫描中的信息,可对T1w、T2w、PD、qT1、FA和FLAIR等不同MRI序列的活体图像进行分割。我们还发现,该自动分割方法能够在层间距5毫米的FLAIR图像上区分对照组与患者。H-SynEx已在 https://github.com/liviamarodrigues/hsynex 开源。

摘要: Purpose: To develop a method for automated segmentation of hypothalamus subregions informed by ultra-high resolution ex vivo magnetic resonance images (MRI), which generalizes across MRI sequences and resolutions without retraining. Materials and Methods: We trained our deep learning method, H-synEx, with synthetic images derived from label maps built from ultra-high resolution ex vivo MRI scans, which enables finer-grained manual segmentation when compared with 1mm isometric in vivo images. We validated this retrospective study using 1535 in vivo images from six datasets and six MRI sequences. The quantitative evaluation used the Dice Coefficient (DC) and Average Hausdorff distance (AVD). Statistical analysis compared hypothalamic subregion volumes in controls, Alzheimer’s disease (AD), and behavioral variant frontotemporal dementia (bvFTD) subjects using the area under the curve (AUC) and Wilcoxon rank sum test. Results: H-SynEx can segment the hypothalamus across various MRI sequences, encompassing FLAIR sequences with significant slice spacing (5mm). Using hypothalamic volumes on T1w images to distinguish control from AD and bvFTD patients, we observed AUC values of 0.74 and 0.79 respectively. Additionally, AUC=0.66 was found for volume variation on FLAIR scans when comparing control and non-patients. Conclusion: Our results show that H-SynEx successfully leverages information from ultra-high resolution scans to segment in vivo from different MRI sequences such as T1w, T2w, PD, qT1, FA, and FLAIR. We also found that our automated segmentation was able to discriminate controls versus patients on FLAIR images with 5mm spacing. H-SynEx is openly available at https://github.com/liviamarodrigues/hsynex.


标题: MF-MOS: A Motion-Focused Model for Moving Object Segmentation

作者: Jintao Cheng, Kang Zeng, Zhuoxu Huang

PubTime: 2024-01-30

Downlink: http://arxiv.org/abs/2401.17023v1

GitHub: https://github.com/SCNU-RISLAB/MF-MOS|

中文摘要: 移动目标分割(MOS)为检测交通参与者提供了可靠的解决方案,因此在自动驾驶领域备受关注。动态信息的捕获在MOS问题中始终至关重要。以往的方法直接从距离图像中提取运动特征;与之不同,我们认为残差图蕴含着更大的运动信息潜力,而距离图像则包含丰富的语义引导。基于这一直觉,我们提出了MF-MOS,一种面向激光雷达移动目标分割、以运动为核心的双分支模型。我们通过从残差图中捕获运动、从距离图像中生成语义特征来解耦时空信息,并将语义特征作为可移动物体的引导提供给运动分支。这一简洁而独特的方案能够充分利用距离图像和残差图,从而大幅提升基于激光雷达的MOS任务性能。值得注意的是,MF-MOS在提交时于SemanticKITTI数据集的MOS榜单上取得了76.7%的领先IoU,达到当前最先进水平。MF-MOS的实现已发布于 https://github.com/SCNU-RISLAB/MF-MOS 。

摘要: Moving object segmentation (MOS) provides a reliable solution for detecting traffic participants and thus is of great interest in the autonomous driving field. Dynamic capture is always critical in the MOS problem. Previous methods capture motion features from the range images directly. Differently, we argue that the residual maps provide greater potential for motion information, while range images contain rich semantic guidance. Based on this intuition, we propose MF-MOS, a novel motion-focused model with a dual-branch structure for LiDAR moving object segmentation. Novelly, we decouple the spatial-temporal information by capturing the motion from residual maps and generating semantic features from range images, which are used as movable object guidance for the motion branch. Our straightforward yet distinctive solution can make the most use of both range images and residual maps, thus greatly improving the performance of the LiDAR-based MOS task. Remarkably, our MF-MOS achieved a leading IoU of 76.7% on the MOS leaderboard of the SemanticKITTI dataset upon submission, demonstrating the current state-of-the-art performance. The implementation of our MF-MOS has been released at https://github.com/SCNU-RISLAB/MF-MOS.
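摘要中反复出现的"残差图",在基于激光雷达的MOS工作中通常指:把过去帧点云变换到当前帧坐标系后重新投影成距离图像,再与当前帧距离图像逐像素做归一化差值,动态物体处差值较大。下面是一个简化示意(假设历史帧距离图像已完成配准与重投影,非论文实现):

```python
import numpy as np

def residual_map(range_cur, range_past_aligned, eps=1e-6):
    """逐像素归一化距离残差: |r_t - r_aligned| / r_t, 无效像素置0。

    range_cur:          当前帧距离图像 [H, W], 无效处为0
    range_past_aligned: 已变换到当前帧坐标并重投影的历史帧距离图像 [H, W]
    """
    valid = (range_cur > eps) & (range_past_aligned > eps)
    res = np.zeros_like(range_cur)
    res[valid] = np.abs(range_cur[valid] - range_past_aligned[valid]) / range_cur[valid]
    return res
```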


标题: CoSSegGaussians: Compact and Swift Scene Segmenting 3D Gaussians with Dual Feature Fusion

作者: Bin Dou, Tianyu Zhang, Yongjia Ma

PubTime: 2024-01-30

Downlink: http://arxiv.org/abs/2401.05925v3

Project: https://David-Dou.github.io/CoSSegGaussians|

中文摘要: 我们提出紧凑快速的分割3D高斯(CoSSegGaussians),这是一种仅以RGB图像为输入、以较快渲染速度实现紧凑且3D一致的场景分割的方法。以往基于NeRF的分割方法依赖耗时的神经场景优化;最近的3D高斯泼溅(3D Gaussian Splatting)虽然显著提升了速度,但现有基于高斯的分割方法很难生成紧凑的掩码,尤其是在零样本分割中。这一问题可能源于它们直接为每个高斯分配可学习参数,导致对跨视角不一致的2D机器生成标签缺乏鲁棒性。我们的方法通过采用双特征融合网络作为高斯的分割场来解决这一问题。具体来说,我们先在RGB监督下优化3D高斯;完成高斯定位后,通过显式反投影将从图像中提取的DINO特征赋予高斯,并进一步与来自高效点云处理网络的空间特征相结合。随后利用特征聚合,以全局到局部的策略将二者融合为紧凑的分割特征。实验结果表明,我们的模型在语义和全景零样本分割任务上均优于基线,同时推理时间不到基于NeRF方法的10%。代码和更多结果将发布于 https://David-Dou.github.io/CoSSegGaussians 。

摘要: We propose Compact and Swift Segmenting 3D Gaussians(CoSSegGaussians), a method for compact 3D-consistent scene segmentation at fast rendering speed with only RGB images input. Previous NeRF-based segmentation methods have relied on time-consuming neural scene optimization. While recent 3D Gaussian Splatting has notably improved speed, existing Gaussian-based segmentation methods struggle to produce compact masks, especially in zero-shot segmentation. This issue probably stems from their straightforward assignment of learnable parameters to each Gaussian, resulting in a lack of robustness against cross-view inconsistent 2D machine-generated labels. Our method aims to address this problem by employing Dual Feature Fusion Network as Gaussians’ segmentation field. Specifically, we first optimize 3D Gaussians under RGB supervision. After Gaussian Locating, DINO features extracted from images are applied through explicit unprojection, which are further incorporated with spatial features from the efficient point cloud processing network. Feature aggregation is utilized to fuse them in a global-to-local strategy for compact segmentation features. Experimental results show that our model outperforms baselines on both semantic and panoptic zero-shot segmentation task, meanwhile consumes less than 10% inference time compared to NeRF-based methods. Code and more results will be available at https://David-Dou.github.io/CoSSegGaussians


标题: Fourier Prompt Tuning for Modality-Incomplete Scene Segmentation

作者: Ruiping Liu, Jiaming Zhang, Kunyu Peng

PubTime: 2024-01-30

Downlink: http://arxiv.org/abs/2401.16923v1

GitHub: https://github.com/RuipingL/MISS|

中文摘要: 整合来自多个模态的信息可以增强自动驾驶汽车中场景感知系统的鲁棒性,提供更全面、更可靠的感知框架。然而,多模态分割中的模态不完整问题仍未得到充分研究。在这项工作中,我们建立了一个称为模态不完整场景分割(MISS)的任务,涵盖系统级的模态缺失和传感器级的模态错误。为避免多模态融合中对主导模态的依赖,我们引入了缺失感知模态切换(MMS)策略,在训练期间主动管理缺失模态;利用比特级的批内采样可同时提升模型在完整和不完整测试场景下的性能。此外,我们提出了傅立叶提示调优(FPT)方法,将有代表性的频谱信息注入少量可学习提示中,以在所有MISS场景下保持鲁棒性,其效果接近微调,但可调参数更少(仅1.1%)。大量实验证明了所提方法的有效性:在模态缺失情形下,相比此前最先进的参数高效方法,mIoU提升了5.84%。源代码将公开于 https://github.com/RuipingL/MISS 。

摘要: Integrating information from multiple modalities enhances the robustness of scene perception systems in autonomous vehicles, providing a more comprehensive and reliable sensory framework. However, the modality incompleteness in multi-modal segmentation remains under-explored. In this work, we establish a task called Modality-Incomplete Scene Segmentation (MISS), which encompasses both system-level modality absence and sensor-level modality errors. To avoid the predominant modality reliance in multi-modal fusion, we introduce a Missing-aware Modal Switch (MMS) strategy to proactively manage missing modalities during training. Utilizing bit-level batch-wise sampling enhances the model’s performance in both complete and incomplete testing scenarios. Furthermore, we introduce the Fourier Prompt Tuning (FPT) method to incorporate representative spectral information into a limited number of learnable prompts that maintain robustness against all MISS scenarios. Akin to fine-tuning effects but with fewer tunable parameters (1.1%). Extensive experiments prove the efficacy of our proposed approach, showcasing an improvement of 5.84% mIoU over the prior state-of-the-art parameter-efficient methods in modality missing. The source code will be publicly available at https://github.com/RuipingL/MISS.


标题: Pixel-Wise Recognition for Holistic Surgical Scene Understanding

作者: Nicolás Ayobi, Santiago Rodríguez, Alejandra Pérez

PubTime: 2024-01-26

Downlink: http://arxiv.org/abs/2401.11174v2

Project: https://link.springer.com/chapter/10.1007/978-3-031-16449-1_42|https://ieeexplore.ieee.org/document/10230819|

GitHub: https://github.com/BCV-Uniandes/GraSP|

摘要: This paper presents the Holistic and Multi-Granular Surgical Scene Understanding of Prostatectomies (GraSP) dataset, a curated benchmark that models surgical scene understanding as a hierarchy of complementary tasks with varying levels of granularity. Our approach enables a multi-level comprehension of surgical activities, encompassing long-term tasks such as surgical phases and steps recognition and short-term tasks including surgical instrument segmentation and atomic visual actions detection. To exploit our proposed benchmark, we introduce the Transformers for Actions, Phases, Steps, and Instrument Segmentation (TAPIS) model, a general architecture that combines a global video feature extractor with localized region proposals from an instrument segmentation model to tackle the multi-granularity of our benchmark. Through extensive experimentation, we demonstrate the impact of including segmentation annotations in short-term recognition tasks, highlight the varying granularity requirements of each task, and establish TAPIS’s superiority over previously proposed baselines and conventional CNN-based models. Additionally, we validate the robustness of our method across multiple public benchmarks, confirming the reliability and applicability of our dataset. This work represents a significant step forward in Endoscopic Vision, offering a novel and comprehensive framework for future research towards a holistic understanding of surgical procedures.



