Topic keywords: cs.AI, error analysis, correction, Weighted A* suboptimality


cs.AI: 47 papers in total today

[cs.AI]

【1】 Error Analysis and Correction for Weighted A*'s Suboptimality (Extended Version)

Authors: Robert C. Holte, Daniel Borrajo

Note: Published as a short paper at the 12th Symposium on Combinatorial Search

Abstract: Weighted A* (wA*) is a widely used algorithm for rapidly, but suboptimally, solving planning and search problems. The cost of the solution it produces is guaranteed to be at most W times the optimal solution cost, where W is the weight wA* uses in prioritizing open nodes. W is therefore a suboptimality bound for the solution produced by wA*. There is broad consensus that this bound is not very accurate, that the actual suboptimality of wA*'s solution is often much less than W times optimal. However, there is very little published evidence supporting that view, and no existing explanation of why W is a poor bound. This paper fills in these gaps in the literature. We begin with a large-scale experiment demonstrating that, across a wide variety of domains and heuristics for those domains, W is indeed very often far from the true suboptimality of wA*'s solution. We then analytically identify the potential sources of error. Finally, we present a practical method for correcting for two of these sources of error and experimentally show that the correction frequently eliminates much of the error.
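
To make the bound in this abstract concrete, here is a minimal sketch of wA* on a toy graph. It is not the authors' code: the graph, the heuristic table `h`, and the function name `weighted_a_star` are all made up for illustration. The sketch assumes open nodes are ordered by f = g + W·h, that `h` is admissible, and that closed nodes are re-opened when a cheaper path to them is found.

```python
import heapq


def weighted_a_star(graph, h, start, goal, w=1.0):
    """Return (cost, path) found by wA*, or (None, []) if the goal is unreachable.

    graph: dict mapping node -> list of (neighbor, edge_cost) pairs
    h:     dict mapping node -> heuristic estimate of the cost to the goal
    w:     the weight W >= 1 applied to the heuristic
    """
    g = {start: 0.0}
    parent = {start: None}
    open_heap = [(w * h[start], start)]  # entries are (f, node) with f = g + w * h
    closed = set()
    while open_heap:
        _, node = heapq.heappop(open_heap)
        if node in closed:
            continue  # stale heap entry for an already expanded node
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return g[goal], path[::-1]
        closed.add(node)
        for nbr, cost in graph.get(node, []):
            new_g = g[node] + cost
            if new_g < g.get(nbr, float("inf")):
                g[nbr] = new_g
                parent[nbr] = node
                closed.discard(nbr)  # re-open if a cheaper path was found
                heapq.heappush(open_heap, (new_g + w * h[nbr], nbr))
    return None, []


# Hypothetical toy problem: the optimal path S-B-C-G costs 3, while S-A-G costs 4.
graph = {
    "S": [("A", 2), ("B", 1)],
    "A": [("G", 2)],
    "B": [("C", 1)],
    "C": [("G", 1)],
}
h = {"S": 3, "A": 1, "B": 2, "C": 1, "G": 0}  # admissible estimates

optimal_cost, optimal_path = weighted_a_star(graph, h, "S", "G", w=1.0)  # cost 3.0
wa_cost, wa_path = weighted_a_star(graph, h, "S", "G", w=3.0)            # cost 4.0
```

On this toy instance W = 3 guarantees only a cost of at most 9, yet the returned solution costs 4, about 1.33 times optimal. This is the kind of gap between W and the true suboptimality that the abstract describes.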

【2】 Soft Options Critic

Authors: Elita Lobo, Scott Jordan

Link: https://arxiv.org/abs/1905.11222

Abstract: The option-critic paper and several variants have successfully demonstrated the use of the options framework proposed by Barto et al. to scale learning and planning in hierarchical tasks. Although most of these frameworks use entropy as a regularizer to improve exploration, they do not maximize entropy along with returns at every time step. In this paper we investigate the effect of maximizing the entropy of each option and of the inter-option policy in the options framework. We adopt the architecture of the recently introduced soft actor-critic algorithm to enable learning of robust options in continuous and discrete action spaces in an off-policy manner, thus also making it sample efficient. In this paper we derive the soft options improvement theorem and propose a novel soft-options framework to incorporate maximization of entropy of actions and options in a constrained manner. Our experiments show that maximizing entropy of actions and options in a constrained manner with high learning rate does not harm the main objective of maximizing returns and hence outperforms the vanilla options-critic framework in most hierarchical tasks. We also observe faster recovery when the environment is subject to perturbations.

【3】 AI-GAs: AI-generating algorithms, an alternate paradigm for producing general artificial intelligence

Author: Jeff Clune

Link: https://arxiv.org/abs/1905.10985

Abstract: Perhaps the most ambitious scientific quest in human history is the creation of general artificial intelligence, which roughly means AI that is as smart or smarter than humans. The dominant approach in the machine learning community is to attempt to discover each of the pieces required for intelligence, with the implicit assumption that some future group will complete the Herculean task of figuring out how to combine all of those pieces into a complex thinking machine. I call this the "manual AI approach". This paper describes another exciting path that ultimately may be more successful at producing general AI. It is based on the clear trend in machine learning that hand-designed solutions eventually are replaced by more effective, learned solutions. The idea is to create an AI-generating algorithm (AI-GA), which automatically learns how to produce general AI. Three Pillars are essential for the approach: (1) meta-learning architectures, (2) meta-learning the learning algorithms themselves, and (3) generating effective learning environments. I argue that either approach could produce general AI first, and both are scientifically worthwhile irrespective of which is the fastest path. Because both are promising, yet the ML community is currently committed to the manual approach, I argue that our community should increase its research investment in the AI-GA approach. To encourage such research, I describe promising work in each of the Three Pillars. I also discuss AI-GA-specific safety and ethical considerations. Because it may be the fastest path to general AI and because it is inherently scientifically interesting to understand the conditions in which a simple algorithm can produce general AI (as happened on Earth where Darwinian evolution produced human intelligence), I argue that the pursuit of AI-GAs should be considered a new grand challenge of computer science research.

【4】 Enhancing Item Response Theory for Cognitive Diagnosis

Authors: Song Cheng, Qi Liu

Link: https://arxiv.org/abs/1905.10957

Abstract: Cognitive diagnosis is a fundamental and crucial task in many educational applications, e.g., computer adaptive test and cognitive assignments. Item Response Theory (IRT) is a classical cognitive diagnosis method which can provide interpretable parameters (i.e., student latent trait, question discrimination, and difficulty) for analyzing student performance. However, traditional IRT ignores the rich information in question texts, cannot diagnose knowledge concept proficiency, and is inaccurate when diagnosing the parameters of questions that appear only a few times. To this end, in this paper, we propose a general Deep Item Response Theory (DIRT) framework to enhance traditional IRT for cognitive diagnosis by exploiting semantic representation from question texts with deep learning. In DIRT, we first use a proficiency vector to represent students' proficiency in knowledge concepts and embed question texts and knowledge concepts to dense vectors by Word2Vec. Then, we design a deep diagnosis module to diagnose parameters in traditional IRT by deep learning techniques. Finally, with the diagnosed parameters, we input them into the logistic-like formula of IRT to predict student performance. Extensive experimental results on real-world data clearly demonstrate the effectiveness and interpretation power of the DIRT framework.
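
For readers unfamiliar with the "logistic-like formula of IRT" mentioned above, here is a minimal sketch of the standard two-parameter logistic (2PL) form. The abstract does not say which exact variant DIRT uses, so the plain 2PL form without a scaling constant is an assumption, and the name `irt_2pl` is illustrative only.

```python
import math


def irt_2pl(theta, a, b):
    """Probability of a correct response under the 2PL IRT model.

    theta: student latent trait (ability)
    a:     question discrimination
    b:     question difficulty
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


# A student whose ability equals the question difficulty has a 50% chance.
p_even = irt_2pl(theta=0.0, a=1.0, b=0.0)    # 0.5
# Higher ability on the same question gives a higher predicted probability.
p_strong = irt_2pl(theta=1.0, a=1.5, b=0.0)
```

In this reading, a framework such as DIRT would supply `theta`, `a`, and `b` from its learned modules and keep this final step interpretable.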

【5】 Policy Based Inference in Trick-Taking Card Games

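Authors: Douglas Rebstock, Nathan R. Sturtevant

Note: Accepted to IEEE Conference on Games 2019 (CoG-2019)

Link: https://arxiv.org/abs/1905.10911

Abstract: Trick-taking card games feature a large amount of private information that is gradually revealed through a sequence of actions. This makes the number of histories exponentially large in the length of the action sequence and also creates extremely large information sets. As a result, these games become too large to solve. To deal with these issues, many algorithms employ inference, the estimation of the probability of states within an information set. In this paper, we demonstrate a Policy Based Inference (PI) algorithm that uses player modelling to infer the probability that we are in a given state. We perform experiments in the German trick-taking card game Skat, which show that this method significantly improves inference compared to previous work and increases the performance of the state-of-the-art Skat AI system Kermit when it is incorporated into its determinized search algorithm.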

【6】 Learning Policies from Human Data for Skat

Authors: Douglas Rebstock, Michael Buro

Note: Accepted to IEEE Conference on Games 2019 (CoG-2019)

Link: https://arxiv.org/abs/1905.10907

Abstract: Decision-making in large imperfect information games is difficult. Thanks to recent success in Poker, Counterfactual Regret Minimization (CFR) methods have been at the forefront of research in these games. However, most of the success in large games comes with the use of a forward model and powerful state abstractions. In trick-taking card games like Bridge or Skat, large information sets and an inability to advance the simulation without fully determinizing the state make forward search problematic. Furthermore, state abstractions can be especially difficult to construct because the precise holdings of each player directly impact move values. In this paper, we explore learning model-free policies for Skat from human game data using deep neural networks (DNN). By introducing methods to directly vary the aggressiveness of the bidder and to declare games based on expected value, while mitigating issues with rarely observed state-action pairs, we produce a new state-of-the-art system. Although policies learned through imitation are slightly weaker than the current best search-based method, they run much faster, which is their main advantage. We also explore how these policies could be learned directly from experience in a reinforcement learning setting and discuss the value of incorporating human data for this task.

【7】 SAI: a Sensible Artificial Intelligence that plays with handicap and targets high scores in 9×9 Go (extended version)

Authors: Francesco Morandin, Maurizio Parton

Link: https://arxiv.org/abs/1905.10863

Abstract: We develop a new model that can be applied to any perfect information two-player zero-sum game to target a high score, and thus a perfect play. We integrate this model into the Monte Carlo tree search-policy iteration learning pipeline introduced by Google DeepMind with AlphaGo. Training this model on 9x9 Go produces a superhuman Go player, thus proving that it is stable and robust. We show that this model can be used to effectively play with both positional and score handicap. We develop a family of agents that can target high scores against any opponent, and recover from very severe disadvantage against weak opponents. To the best of our knowledge, these are the first effective achievements in this direction.

【8】 Path Ranking with Attention to Type Hierarchies

Authors: Weiyu Liu, Sonia Chernova

Link: https://arxiv.org/abs/1905.10799

Abstract: The knowledge base completion problem is the problem of inferring missing information from existing facts in knowledge bases. Path-ranking based methods use sequences of relations as general patterns of paths for prediction. However, these patterns usually lack accuracy because they are generic and can often apply to widely varying scenarios. We leverage type hierarchies of entities to create a new class of path patterns that are both discriminative and generalizable. Then we propose an attention-based RNN model, which can be trained end-to-end, to discover the new path patterns most suitable for the data. Experiments conducted on two benchmark knowledge base completion datasets demonstrate that the proposed model outperforms existing methods by a statistically significant margin. Our quantitative analysis of the path patterns shows that they balance between generalization and discrimination.

【9】 Ensemble Decision Systems for General Video Game Playing

Authors: Damien Anderson, John Levine

Note: 8 pages, accepted at CoG 2019

Link: https://arxiv.org/abs/1905.10792

Abstract: Ensemble Decision Systems offer a unique form of decision making that allows a collection of algorithms to reason together about a problem. Each individual algorithm has its own inherent strengths and weaknesses, and often it is difficult to overcome the weaknesses while retaining the strengths. Instead of altering the properties of the algorithm, the Ensemble Decision System augments the performance with other algorithms that have complementing strengths. This work outlines different options for building an Ensemble Decision System as well as providing analysis on its performance compared to the individual components of the system with interesting results, showing an increase in the generality of the algorithms without significantly impeding performance.

【10】 MDE: Multi Distance Embeddings for Link Prediction in Knowledge Graphs

Authors: Afshin Sadeghi, Jens Lehmann

Link: https://arxiv.org/abs/1905.10702

Abstract: Over the past decade, knowledge graphs became popular for capturing structured domain knowledge. Relational learning models enable the prediction of missing links inside knowledge graphs. More specifically, latent distance approaches model the relationships among entities via a distance between latent representations. Translating embedding models (e.g., TransE) are among the most popular latent distance approaches which use one distance function to learn multiple relation patterns. However, they are not capable of capturing symmetric relations. They also force relations with reflexive patterns to become symmetric and transitive. In order to improve distance based embedding, we propose multi-distance embeddings (MDE). Our solution is based on the idea that by learning independent embedding vectors for each entity and relation one can aggregate contrasting distance functions. Benefiting from MDE, we also develop supplementary distances resolving the above-mentioned limitations of TransE. We further propose an extended loss function for distance based embeddings and show that MDE and TransE are fully expressive using this loss function. Furthermore, we obtain a bound on the size of their embeddings for full expressivity. Our empirical results show that MDE significantly improves the translating embeddings and outperforms several state-of-the-art embedding models on benchmark datasets.
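
The claim that translating embeddings cannot capture symmetric relations can be seen in a few lines. The sketch below uses the usual TransE score, the norm of h + r - t, with made-up two-dimensional vectors. It illustrates only the limitation, not the MDE method, and the name `transe_distance` is hypothetical.

```python
import math


def transe_distance(h, r, t):
    """Euclidean norm of (h + r - t); a small value means the triple is plausible."""
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


# Made-up embeddings chosen so that head + rel == tail exactly.
head = [1.0, 2.0]
rel = [0.5, -1.0]
tail = [1.5, 1.0]

forward = transe_distance(head, rel, tail)  # (head, rel, tail) fits perfectly
reverse = transe_distance(tail, rel, head)  # (tail, rel, head) is off by 2 * |rel|
```

If a relation is symmetric, both directions should score well. With a single translation vector that requires h + r = t and t + r = h, which forces r = 0 and h = t, so the model cannot tell the two entities apart.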

【11】 Balancing Goal Obfuscation and Goal Legibility in Settings with Cooperative and Adversarial Observers

Authors: Anagha Kulkarni, Subbarao Kambhampati

Link: https://arxiv.org/abs/1905.10672

Abstract: In order to be useful in the real world, AI agents need to plan and act in the presence of others, who may include adversarial and cooperative entities. In this paper, we consider the problem where an autonomous agent needs to act in a manner that clarifies its objectives to cooperative entities while preventing adversarial entities from inferring those objectives. We show that this problem is solvable when cooperative entities and adversarial entities use different types of sensors and/or prior knowledge. We develop two new solution approaches for computing such plans. One approach provides an optimal solution to the problem by using an IP solver to provide maximum obfuscation for adversarial entities while providing maximum legibility for cooperative entities in the environment, whereas the other approach provides a satisficing solution using heuristic-guided forward search to achieve preset levels of obfuscation and legibility for adversarial and cooperative entities respectively. We show the feasibility and utility of our algorithms through extensive empirical evaluation on problems derived from planning benchmarks.

【12】 Dynamic Epistemic Logic with ASP Updates: Application to Conditional Planning

Authors: Pedro Cabalar, Luis Fariñas del Cerro

Link: https://arxiv.org/abs/1905.10621

Abstract: Dynamic Epistemic Logic (DEL) is a family of multimodal logics that has proved to be very successful for epistemic reasoning in planning tasks. In this logic, the agent's knowledge is captured by modal epistemic operators whereas the system evolution is described in terms of (some subset of) dynamic logic modalities in which actions are usually represented as semantic objects called event models. In this paper, we study a variant of DEL, that we call DEL[ASP], where actions are syntactically described by using an Answer Set Programming (ASP) representation instead of event models. This representation directly inherits high level expressive features like indirect effects, qualifications, state constraints, defaults, or recursive fluents that are common in ASP descriptions of action domains. Besides, we illustrate how this approach can be applied for obtaining conditional plans in single-agent, partially observable domains where knowledge acquisition may be represented as indirect effects of actions.

【13】 Should I Include this Edge in my Prediction? Analyzing the Stability-Performance Tradeoff

Authors: Adarsh Subbaswamy, Suchi Saria

Link: https://arxiv.org/abs/1905.11374

Abstract: Recent work addressing model reliability and generalization has resulted in a variety of methods that seek to proactively address differences between the training and unknown target environments. While most methods achieve this by finding distributions that will be invariant across environments, we will show they do not necessarily find the same distributions, which has implications for performance. In this paper we unify existing work on prediction using stable distributions by relating environmental shifts to edges in the graph underlying a prediction problem, and characterize stable distributions as those which effectively remove these edges. We then quantify the effect of edge deletion on performance in the linear case and corroborate the findings in a simulated and real data experiment.

【14】 Object Discovery with a Copy-Pasting GAN

Authors: Relja Arandjelović, Andrew Zisserman

Link: https://arxiv.org/abs/1905.11369

Abstract: We tackle the problem of object discovery, where objects are segmented for a given input image, and the system is trained without using any direct supervision whatsoever. A novel copy-pasting GAN framework is proposed, where the generator learns to discover an object in one image by compositing it into another image such that the discriminator cannot tell that the resulting image is fake. After carefully addressing subtle issues, such as preventing the generator from 'cheating', this game results in the generator learning to select objects, as copy-pasting objects is most likely to fool the discriminator. The system is shown to work well on four very different datasets, including large object appearance variations in challenging cluttered backgrounds.

【15】 Straight to Shapes++: Real-time Instance Segmentation Made More Accurate

Authors: Laurynas Miksys, Philip H.S. Torr

Link: https://arxiv.org/abs/1905.11358

Abstract: Instance segmentation is an important problem in computer vision, with applications in autonomous driving, drone navigation and robotic manipulation. However, most existing methods are not real-time, complicating their deployment in time-sensitive contexts. In this work, we extend an existing approach to real-time instance segmentation, called 'Straight to Shapes' (STS), which makes use of low-dimensional shape embedding spaces to directly regress to object shape masks. The STS model can run at 35 FPS on a high-end desktop, but its accuracy is significantly worse than that of offline state-of-the-art methods. We leverage recent advances in the design and training of deep instance segmentation models to improve the performance accuracy of the STS model whilst keeping its real-time capabilities intact. In particular, we find that parameter sharing, more aggressive data augmentation and the use of structured loss for shape mask prediction all provide a useful boost to the network performance. Our proposed approach, 'Straight to Shapes++', achieves a remarkable 19.7 point improvement in mAP (at IOU of 0.5) over the original method as evaluated on the PASCAL VOC dataset, thus redefining the accuracy frontier at real-time speeds. Since the accuracy of instance segmentation is closely tied to that of object bounding box prediction, we also study the error profile of the latter and examine the failure modes of our method for future improvements.

【16】 AgentGraph: Towards Universal Dialogue Management with Structured Deep Reinforcement Learning

Authors: Lu Chen, Kai Yu

Link: https://arxiv.org/abs/1905.11259

Abstract: Dialogue policy plays an important role in task-oriented spoken dialogue systems. It determines how to respond to users. The recently proposed deep reinforcement learning (DRL) approaches have been used for policy optimization. However, these deep models are still challenging for two reasons: 1) Many DRL-based policies are not sample-efficient. 2) Most models don't have the capability of policy transfer between different domains. In this paper, we propose a universal framework, AgentGraph, to tackle these two problems. The proposed AgentGraph is the combination of GNN-based architecture and DRL-based algorithm. It can be regarded as one of the multi-agent reinforcement learning approaches. Each agent corresponds to a node in a graph, which is defined according to the dialogue domain ontology. When making a decision, each agent can communicate with its neighbors on the graph. Under the AgentGraph framework, we further propose a Dual GNN-based dialogue policy, which implicitly decomposes the decision in each turn into a high-level global decision and a low-level local decision. Experiments show that AgentGraph models significantly outperform traditional reinforcement learning approaches on most of the 18 tasks of the PyDial benchmark. Moreover, when transferred from the source task to a target task, these models not only have acceptable initial performance but also converge much faster on the target task.

【17】 A Novel Demodulation and Estimation Algorithm for Blackout Communication: Extract Principal Components with Deep Learning

Authors: Haoyan Liu, Ming Yang

Link: https://arxiv.org/abs/1905.11229

Abstract: For reentry or near space communication, owing to the influence of the time-varying plasma sheath channel environment, the received IQ baseband signals are severely rotated on the constellation. Research has shown that the frequency of electron density varies from 20 kHz to 100 kHz, which is on the same order as the symbol rate of most TT&C communication systems, and a mass of bandwidth will be consumed to track the time-varying channel with traditional estimation. In this paper, motivated by principal curve analysis, we propose a deep learning (DL) algorithm called symmetric manifold network (SMN) to extract the curves on the constellation and classify the signals based on the curves. The key advantage is that SMN can achieve joint optimization of demodulation and channel estimation. From our simulation results, the new algorithm significantly reduces the symbol error rate (SER) compared to existing algorithms and enables accurate estimation of fading with extremely high bandwidth utilization rate.

【18】 Model-Agnostic Counterfactual Explanations for Consequential Decisions

Authors: Amir-Hossein Karimi, Isabel Valera

Link: https://arxiv.org/abs/1905.11190

Abstract: Predictive models are being increasingly used to support consequential decision making at the individual level in contexts such as pretrial bail and loan approval. As a result, there is increasing social and legal pressure to provide explanations that help the affected individuals not only to understand why a prediction was output, but also how to act to obtain a desired outcome. To this end, several works have proposed methods to generate counterfactual explanations. However, they are often restricted to a particular subset of models (e.g., decision trees or linear models), and cannot directly handle the mixed (numerical and nominal) nature of the features describing each individual. In this paper, we propose a model-agnostic algorithm to generate counterfactual explanations that builds on the standard theory and tools from formal verification. Specifically, our algorithm solves a sequence of satisfiability problems, where a wide variety of predictive models and distances in mixed feature spaces, as well as natural notions of plausibility and diversity, are represented as logic formulas. Our experiments on real-world data demonstrate that our approach can flexibly handle widely deployed predictive models, while providing meaningfully closer counterfactuals than existing approaches.

【19】 Physics-as-Inverse-Graphics: Joint Unsupervised Learning of Objects and Physics from Video

Authors: Miguel Jaques, Timothy Hospedales

Link: https://arxiv.org/abs/1905.11169

Abstract: We aim to perform unsupervised discovery of objects and their states such as location and velocity, as well as physical system parameters such as mass and gravity from video, given only the differential equations governing the scene dynamics. Existing physical scene understanding methods require either object state supervision, or do not integrate with differentiable physics to learn interpretable system parameters and states. We address this problem through a physics-as-inverse-graphics approach that brings together vision-as-inverse-graphics and differentiable physics engines. This framework allows us to perform long term extrapolative video prediction, as well as vision-based model-predictive control. Our approach significantly outperforms related unsupervised methods in long-term future frame prediction of systems with interacting objects (such as ball-spring or 3-body gravitational systems). We further show the value of this tight vision-physics integration by demonstrating data-efficient learning of vision-actuated model-based control for a pendulum system. The controller's interpretability also provides unique capabilities in goal-driven control and physical reasoning for zero-data adaptation.

【20】 Finding Task-Relevant Features for Few-Shot Learning by Category Traversal

Authors: Hongyang Li, Xiaogang Wang

Note: CVPR 2019

Link: https://arxiv.org/abs/1905.11116

Abstract: Few-shot learning is an important area of research. Conceptually, humans are readily able to understand new concepts given just a few examples, while in more pragmatic terms, limited-example training situations are common in practice. Recent effective approaches to few-shot learning employ a metric-learning framework to learn a feature similarity comparison between a query (test) example, and the few support (training) examples. However, these approaches treat each support class independently from one another, never looking at the entire task as a whole. Because of this, they are constrained to use a single set of features for all possible test-time tasks, which hinders the ability to distinguish the most relevant dimensions for the task at hand. In this work, we introduce a Category Traversal Module that can be inserted as a plug-and-play module into most metric-learning based few-shot learners. This component traverses across the entire support set at once, identifying task-relevant features based on both intra-class commonality and inter-class uniqueness in the feature space. Incorporating our module improves performance considerably (5%-10% relative) over baseline systems on both mini-ImageNet and tieredImageNet benchmarks, with overall performance competitive with recent state-of-the-art systems.

【21】 Robustness of accelerated first-order algorithms for strongly convex optimization problems

Authors: Hesameddin Mohammadi, Mihailo R. Jovanović

Link: https://arxiv.org/abs/1905.11011

Abstract: We study the robustness of accelerated first-order algorithms to stochastic uncertainties in gradient evaluation. Specifically, for unconstrained, smooth, strongly convex optimization problems, we examine the mean-square error in the optimization variable when the iterates are perturbed by additive white noise. This type of uncertainty may arise in situations where an approximation of the gradient is sought through measurements of a real system or in a distributed computation over network. Even though the underlying dynamics of first-order algorithms for this class of problems are nonlinear, we establish upper bounds on the mean-square deviation from the optimal value that are tight up to constant factors. Our analysis quantifies fundamental trade-offs between noise amplification and convergence rates obtained via any acceleration scheme similar to Nesterov's or heavy-ball methods. To gain additional analytical insight, for strongly convex quadratic problems we explicitly evaluate the steady-state variance of the optimization variable in terms of the eigenvalues of the Hessian of the objective function. We demonstrate that the entire spectrum of the Hessian, rather than just the extreme eigenvalues, influences the robustness of noisy algorithms. We specialize this result to the problem of distributed averaging over undirected networks and examine the role of network size and topology on the robustness of noisy accelerated algorithms.

【22】 Commonsense Properties from Query Logs and Question Answering Forums

Authors: Julien Romero, Gerhard Weikum

Link: https://arxiv.org/abs/1905.10989

Abstract: Commonsense knowledge about object properties, human behavior and general concepts is crucial for robust AI applications. However, automatic acquisition of this knowledge is challenging because of sparseness and bias in online sources. This paper presents Quasimodo, a methodology and tool suite for distilling commonsense properties from non-standard web sources. We devise novel ways of tapping into search-engine query logs and QA forums, and combining the resulting candidate assertions with statistical cues from encyclopedias, books and image tags in a corroboration step. Unlike prior work on commonsense knowledge bases, Quasimodo focuses on salient properties that are typically associated with certain objects or concepts. Extensive evaluations, including extrinsic use-case studies, show that Quasimodo provides better coverage than state-of-the-art baselines with comparable quality.

【23】 An Intelligent Monitoring System of Vehicles on Highway Traffic

Authors: Sulaiman Khan, Mohammad Farhad Bulbul

Link: https://arxiv.org/abs/1905.10982

Abstract: Vehicle speed monitoring and management of highways is the critical problem of the road in this modern age of growing technology and population. Poor management results in frequent traffic jams, traffic rule violations and fatal road accidents. Using traditional techniques of RADAR, LIDAR and LASAR to address this problem is time-consuming, expensive and tedious. This paper presents an efficient framework to produce a simple, cost efficient and intelligent system for vehicle speed monitoring. The proposed method uses an HD (High Definition) camera mounted on the road side either on a pole or on a traffic signal for recording video frames. On the basis of these frames, a vehicle can be tracked by using radius growing method, and its speed can be calculated by calculating vehicle mask and its displacement in consecutive frames. The method uses pattern recognition, digital image processing and mathematical techniques for vehicle detection, tracking and speed calculation. The validity of the proposed model is proved by testing it on different highways.

【24】 Explainable Reinforcement Learning Through a Causal Lens

Authors: Prashan Madumal, Frank Vetere

Link: https://arxiv.org/abs/1905.10958

Abstract: Prevalent theories in cognitive science propose that humans understand and represent the knowledge of the world through causal relationships. In making sense of the world, we build causal models in our mind to encode cause-effect relations of events and use these to explain why new events happen. In this paper, we use causal models to derive causal explanations of behaviour of reinforcement learning agents. We present an approach that learns a structural causal model during reinforcement learning and encodes causal relationships between variables of interest. This model is then used to generate explanations of behaviour based on counterfactual analysis of the causal model. We report on a study with 120 participants who observe agents playing a real-time strategy game (Starcraft II) and then receive explanations of the agents' behaviour. We investigated: 1) participants' understanding gained by explanations through task prediction; 2) explanation satisfaction and 3) trust. Our results show that causal model explanations perform better on these measures compared to two other baseline explanation models.

【25】 Naive probability

Authors: Zalan Gyenis, Andras Kornai

Link: https://arxiv.org/abs/1905.10924

Abstract: We describe a rational, but low resolution model of probability.

【26】 Applying Abstract Argumentation Theory to Cooperative Game Theory

Authors: Anthony P. Young, Josh Murphy

Link: https://arxiv.org/abs/1905.10922

Abstract: We apply ideas from abstract argumentation theory to study cooperative game theory. Building on Dung's results in his seminal paper, we further the correspondence between Dung's four argumentation semantics and solution concepts in cooperative game theory by showing that complete extensions (the grounded extension) correspond to Roth's subsolutions (respectively, the supercore). We then investigate the relationship between well-founded argumentation frameworks and convex games, where in each case the semantics (respectively, solution concepts) coincide; we prove that three-player convex games do not in general have well-founded argumentation frameworks.

【27】 Adaptive Learning Material Recommendation in Online Language Education

标题：在线语言教育中的自适应学习材料推荐

作者： Shuhan Wang, Erik Andersen

备注：The short version of this paper is published at AIED 2019

链接：https://arxiv.org/abs/1905.10893

摘要： Recommending personalized learning materials for online language learning is challenging because we typically lack data about the students' ability and the relative difficulty of learning materials. This makes it hard to recommend appropriate content that matches the student's prior knowledge. In this paper, we propose a refined hierarchical knowledge structure to model vocabulary knowledge, which enables us to automatically organize the authentic and up-to-date learning materials collected from the internet. Based on this knowledge structure, we then introduce a hybrid approach to recommend learning materials that adapts to a student's language level. We evaluate our work with an online Japanese learning tool and the results suggest adding adaptivity into material recommendation significantly increases student engagement.

【28】 Simple and Effective Curriculum Pointer-Generator Networks for Reading Comprehension over Long Narratives

标题：简单有效的课程指针生成器网络阅读理解长篇叙事

作者： Yi Tay, Aston Zhang

备注：Accepted to ACL 2019

链接：https://arxiv.org/abs/1905.10847

摘要： This paper tackles the problem of reading comprehension over long narratives where documents easily span over thousands of tokens. We propose a curriculum learning (CL) based Pointer-Generator framework for reading/sampling over large documents, enabling diverse training of the neural model based on the notion of alternating contextual difficulty. This can be interpreted as a form of domain randomization and/or generative pretraining during training. To this end, the usage of the Pointer-Generator softens the requirement of having the answer within the context, enabling us to construct diverse training samples for learning. Additionally, we propose a new Introspective Alignment Layer (IAL), which reasons over decomposed alignments using block-based self-attention. We evaluate our proposed method on the NarrativeQA reading comprehension benchmark, achieving state-of-the-art performance, improving existing baselines by $51\%$ relative improvement on BLEU-4 and $17\%$ relative improvement on Rouge-L. Extensive ablations confirm the effectiveness of our proposed IAL and CL components.

【29】 Learning to Optimize Computational Resources: Frugal Training with Generalization Guarantees

标题：学会优化计算资源：广义保障的节俭培训

作者： Maria-Florina Balcan, Ellen Vitercik

链接：https://arxiv.org/abs/1905.10819

摘要： Algorithms typically come with tunable parameters that have a considerable impact on the computational resources they consume. Too often, practitioners must hand-tune the parameters, a tedious and error-prone task. A recent line of research provides algorithms that return nearly-optimal parameters from within a finite set. These algorithms can be used when the parameter space is infinite by providing as input a random sample of parameters. This data-independent discretization, however, might miss pockets of nearly-optimal parameters: prior research has presented scenarios where the only viable parameters lie within an arbitrarily small region. We provide an algorithm that learns a finite set of promising parameters from within an infinite set. Our algorithm can help compile a configuration portfolio, or it can be used to select the input to a configuration algorithm for finite parameter spaces. Our approach applies to any configuration problem that satisfies a simple yet ubiquitous structure: the algorithm's performance is a piecewise constant function of its parameters. Prior research has exhibited this structure in domains from integer programming to clustering. For these types of combinatorial problems, this is the first configuration algorithm beyond exhaustive search whose output provably competes with the best parameters from an infinite space.

【30】 Unsupervised Intuitive Physics from Past Experiences

标题：过去经验中的无监督直觉物理学

作者： Sébastien Ehrhardt, Andrea Vedaldi

链接：https://arxiv.org/abs/1905.10793

摘要： We are interested in learning models of intuitive physics similar to the ones that animals use for navigation, manipulation and planning. In addition to learning general physical principles, however, we are also interested in learning "on the fly", from a few experiences, physical properties specific to new environments. We do all this in an unsupervised manner, using a meta-learning formulation where the goal is to predict videos containing demonstrations of physical phenomena, such as objects moving and colliding with a complex background. We introduce the idea of summarizing past experiences in a very compact manner, in our case using dynamic images, and show that this can be used to solve the problem well and efficiently. Empirically, we show via extensive experiments and ablation studies that our model learns to perform physical predictions that generalize well in time and space, as well as to a variable number of interacting physical objects.

【31】 Dual Averaging Method for Online Graph-structured Sparsity

标题：在线图形结构稀疏性的双平均法

作者： Baojian Zhou, Yiming Ying

链接：https://arxiv.org/abs/1905.10714

摘要： Online learning algorithms update models via one sample per iteration, thus efficient to process large-scale datasets and useful to detect malicious events for social benefits, such as disease outbreak and traffic congestion on the fly. However, existing algorithms for graph-structured models focused on the offline setting and the least square loss, incapable for online setting, while methods designed for online setting cannot be directly applied to the problem of complex (usually non-convex) graph-structured sparsity model. To address these limitations, in this paper we propose a new algorithm for graph-structured sparsity constraint problems under online setting, which we call GraphDA. The key part in GraphDA is to project both averaging gradient (in dual space) and primal variables (in primal space) onto lower dimensional subspaces, thus capturing the graph-structured sparsity effectively. Furthermore, the objective functions assumed here are generally convex so as to handle different losses for online learning settings. To the best of our knowledge, GraphDA is the first online learning algorithm for graph-structure constrained optimization problems. To validate our method, we conduct extensive experiments on both benchmark graph and real-world graph datasets. Our experiment results show that, compared to other baseline methods, GraphDA not only improves classification performance, but also successfully captures graph-structured features more effectively, hence stronger interpretability.

【32】 A Lipschitz-constrained anomaly discriminator framework

标题：Lipschitz约束的异常鉴别器框架

作者： Alexander Tong, Smita Krishnaswamy

链接：https://arxiv.org/abs/1905.10710

摘要： Anomaly detection is a problem of great interest in medicine, finance, and other fields where error and fraud need to be detected and corrected. Most deep anomaly detection methods rely on autoencoder reconstruction error. However, we show that this approach has limited value. First, this approach starts to perform poorly when either noise or anomalies contaminate training data, even to a small extent. Second, this approach cannot detect anomalous but simple to reconstruct points. This can be seen even in relatively simple examples, such as feeding a black image to detectors trained on MNIST digits. Here, we introduce a new discriminator-based unsupervised Lipschitz anomaly detector (LAD). We train a Wasserstein discriminator, similar to the ones used in GANs, to detect the difference between the training data and corruptions of the training data. We show that this procedure successfully detects unseen anomalies with guarantees on those that have a certain Wasserstein distance from the data or corrupted training set. Finally, we show results of this system in an electronic medical record dataset of HIV-positive veterans from the Veterans Aging Cohort Study (VACS) to establish usability in a medical setting.

【33】 Tell Me About Yourself: Using an AI-Powered Chatbot to Conduct Conversational Surveys

标题：告诉我你自己：使用AI-Powered聊天机器人进行对话调查

作者： Ziang Xiao, Huahai Yang

链接：https://arxiv.org/abs/1905.10700

摘要： The rise of increasingly more powerful chatbots offers a new way to collect information through conversational surveys, where a chatbot asks open-ended questions, interprets a user's free-text responses, and probes answers when needed. To investigate the effectiveness and limitations of such a chatbot in conducting surveys, we conducted a field study involving about 600 participants. In this study, half of the participants took a typical online survey on Qualtrics and the other half interacted with an AI-powered chatbot to complete a conversational survey. Our detailed analysis of over 5200 free-text responses revealed that the chatbot drove a significantly higher level of participant engagement and elicited significantly better quality responses in terms of relevance, depth, and readability. Based on our results, we discuss design implications for creating AI-powered chatbots to conduct effective surveys and beyond.

【34】 HINT: Hierarchical Invertible Neural Transport for General and Sequential Bayesian inference

标题：提示：一般和顺序贝叶斯推断的分层可逆神经传输

作者： Gianluca Detommaso, Robert Scheichl

链接：https://arxiv.org/abs/1905.10687

摘要： In this paper, we introduce Hierarchical Invertible Neural Transport (HINT), an algorithm that merges Invertible Neural Networks and optimal transport to sample from a posterior distribution in a Bayesian framework. This method exploits a hierarchical architecture to construct a Knothe-Rosenblatt transport map between an arbitrary density and the joint density of hidden variables and observations. After training the map, samples from the posterior can be immediately recovered for any contingent observation. Any underlying model evaluation can be performed fully offline from training without the need of a model-gradient. Furthermore, no analytical evaluation of the prior is necessary, which makes HINT an ideal candidate for sequential Bayesian inference. We demonstrate the efficacy of HINT on two numerical experiments.

【35】 Composing Ensembles of Policies with Deep Reinforcement Learning

标题：用深度强化学习编写政策集合

作者： Ahmed H. Qureshi, Michael C. Yip

链接：https://arxiv.org/abs/1905.10681

摘要： Composition of elementary skills into complex behaviors to solve challenging problems is one of the key elements toward building intelligent machines. To date, there has been plenty of work on learning new policies or skills but almost no focus on composing them to perform complex decision-making. In this paper, we propose a policy ensemble composition framework that takes the robot's primitive policies and learns to compose them concurrently or sequentially through reinforcement learning. We evaluate our method in problems where traditional approaches either fail or exhibit high sample complexity to find a solution. We show that our method not only solves the problems that require both task and motion planning but also exhibits high data efficiency, which is currently one of the main limitations of reinforcement learning.

【36】 Compositional Fairness Constraints for Graph Embeddings

标题：图嵌入的组合公平约束

作者： Avishek Joey Bose, William Hamilton

备注：Proceedings of the 36th International Conference on Machine Learning, Long Beach, California, PMLR 97, 2019

链接：https://arxiv.org/abs/1905.10674

摘要： Learning high-quality node embeddings is a key building block for machine learning models that operate on graph data, such as social networks and recommender systems. However, existing graph embedding techniques are unable to cope with fairness constraints, e.g., ensuring that the learned representations do not correlate with certain attributes, such as age or gender. Here, we introduce an adversarial framework to enforce fairness constraints on graph embeddings. Our approach is compositional, meaning that it can flexibly accommodate different combinations of fairness constraints during inference. For instance, in the context of social recommendations, our framework would allow one user to request that their recommendations are invariant to both their age and gender, while also allowing another user to request invariance to just their age. Experiments on standard knowledge graph and recommender system benchmarks highlight the utility of our proposed framework.

【37】 DIANet: Dense-and-Implicit Attention Network

标题：DIANet：密集和隐含的注意力网络

作者： Zhongzhan Huang, Haizhao Yang

链接：https://arxiv.org/abs/1905.10671

摘要： Attention-based deep neural networks (DNNs) that emphasize the informative information in a local receptive field of an input image have successfully boosted the performance of deep learning in various challenging problems. In this paper, we propose a Dense-and-Implicit-Attention (DIA) unit that can be applied universally to different network architectures and enhance their generalization capacity by repeatedly fusing the information throughout different network layers. The communication of information between different layers is carried out via a modified Long Short Term Memory (LSTM) module within the DIA unit that is in parallel with the DNN. The sharing DIA unit links multi-scale features from different depth levels of the network implicitly and densely. Experiments on benchmark datasets show that the DIA unit is capable of emphasizing channel-wise feature interrelation and leads to significant improvement of image classification accuracy. We further empirically show that the DIA unit is a nonlocal normalization tool that enhances the Batch Normalization. The code is released at this https URL.

【38】 ESA: Entity Summarization with Attention

标题：ESA：注意的实体摘要

作者： Dongjun Wei, Yaxin Liu

链接：https://arxiv.org/abs/1905.10625

摘要： Entity summarization aims at creating brief but informative descriptions of entities from knowledge graphs. While previous work mostly focused on traditional techniques such as clustering algorithms and graph models, we ask how to apply deep learning methods into this task. In this paper we propose ESA, a neural network with supervised attention mechanisms for entity summarization. Specifically, we calculate attention weights for facts in each entity, and rank facts to generate reliable summaries. We explore techniques to solve difficult learning problems presented by the ESA, and demonstrate the effectiveness of our model in comparison with the state-of-the-art methods. Experimental results show that our model improves the quality of the entity summaries in both F-measure and MAP.

【39】 Adversarial Policies: Attacking Deep Reinforcement Learning

标题：对抗政策：攻击深层强化学习

作者： Adam Gleave, Stuart Russell

备注：Under review at NeurIPS 2019

链接：https://arxiv.org/abs/1905.10615

摘要： Deep reinforcement learning (RL) policies are known to be vulnerable to adversarial perturbations to their observations, similar to adversarial examples for classifiers. However, an attacker is not usually able to directly modify another agent's observations. This might lead one to wonder: is it possible to attack an RL agent simply by choosing an adversarial policy acting in a multi-agent environment so as to create natural observations that are adversarial? We demonstrate the existence of adversarial policies in zero-sum games between simulated humanoid robots with proprioceptive observations, against state-of-the-art victims trained via self-play to be robust to opponents. The adversarial policies reliably win against the victims but generate seemingly random and uncoordinated behavior. We find that these policies are more successful in high-dimensional environments, and induce substantially different activations in the victim policy network than when the victim plays against a normal opponent. Videos are available at this http URL.

【40】 ASPIRE: Automated Security Policy Implementation Using Reinforcement Learning

标题：ASPIRE：使用强化学习实现自动安全策略

作者： Yoni Birman, Asaf Shabtai

链接：https://arxiv.org/abs/1905.10517

摘要： Malware detection is an ever-present challenge for all organizational gatekeepers. Organizations often deploy numerous different malware detection tools, and then combine their output to produce a final classification for an inspected file. This approach has two significant drawbacks. First, it requires large amounts of computing resources and time since every incoming file needs to be analyzed by all detectors. Secondly, it is difficult to accurately and dynamically enforce a predefined security policy that comports with the needs of each organization (e.g., how tolerant is the organization to false negatives and false positives). In this study we propose ASPIRE, a reinforcement learning (RL)-based method for malware detection. Our approach receives the organizational policy (defined solely by the perceived costs of correct/incorrect classifications and of computing resources) and then dynamically assigns detection tools and sets the detection threshold for each inspected file. We demonstrate the effectiveness and robustness of our approach by conducting an extensive evaluation on multiple organizational policies. ASPIRE performed well in all scenarios, even achieving near-optimal accuracy of 96.21% (compared to an optimum of 96.86%) at approximately 20% of the running time of this baseline.

【41】 Resisting Adversarial Attacks by $k$-Winners-Take-All

标题：以$ k $ -Winners-Take-All抵抗对抗性攻击

作者： Chang Xiao, Changxi Zheng

链接：https://arxiv.org/abs/1905.10510

摘要： We propose a simple change to the current neural network structure for defending against gradient-based adversarial attacks. Instead of using popular activation functions (such as ReLU), we advocate the use of $k$-Winners-Take-All ($k$-WTA) activation, a $C^0$ discontinuous function that purposely invalidates the neural network model's gradient at densely distributed input data points. Our proposal is theoretically rationalized. We show why the discontinuities in $k$-WTA networks can largely prevent gradient-based search of adversarial examples and why they at the same time remain innocuous to the network training. This understanding is also empirically backed. Even without notoriously expensive adversarial training, the robustness performance of our networks is comparable to conventional ReLU networks optimized by adversarial training. Furthermore, after also optimized through adversarial training, our networks outperform the state-of-the-art methods under white-box attacks on various datasets that we experimented with.
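As a rough illustration of the $k$-WTA idea described in this abstract, the sketch below keeps the k largest activations of a layer and zeroes the rest. It is a minimal pure-Python sketch, not the authors' implementation: the function name `k_winners_take_all`, the list-based interface, and the tie-breaking rule (earlier index wins) are all assumptions made for this example.

```python
def k_winners_take_all(values, k):
    """Keep the k largest entries of `values` and set all others to 0.0.

    Hypothetical helper for illustration only; ties are broken in favour
    of the earlier index (an assumption, not taken from the paper).
    """
    if not 0 < k <= len(values):
        raise ValueError("k must be between 1 and len(values)")
    # Indices ordered by activation, largest first; the first k are the winners.
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    winners = set(order[:k])
    return [v if i in winners else 0.0 for i, v in enumerate(values)]


# A tiny change in the input can switch which unit wins, so the output
# jumps abruptly: this is the discontinuity the abstract refers to.
before = k_winners_take_all([1.00, 1.01, 0.20], k=1)
after = k_winners_take_all([1.01, 1.00, 0.20], k=1)
```

Here `before` keeps only the second unit and `after` keeps only the first, even though the two inputs differ by 0.01 per coordinate.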

【42】 Learning to Reason in Large Theories without Imitation

标题：在没有模仿的大理论中学习理性

作者： Kshitij Bansal, Christian Szegedy

链接：https://arxiv.org/abs/1905.10501

摘要： Automated theorem proving in large theories can be learned via reinforcement learning over an indefinitely growing action space. In order to select actions, one performs nearest neighbor lookups in the knowledge base to find premises to be applied. Here we address the exploration for reinforcement learning in this space. Approaches (like epsilon-greedy strategy) that sample actions uniformly do not scale to this scenario as most actions lead to dead ends and unsuccessful proofs which are not useful for training our models. In this paper, we compare approaches that select premises using randomly initialized similarity measures and mixing them with the proposals of the learned model. We evaluate these on the HOList benchmark for tactics based higher order theorem proving. We implement an automated theorem prover named DeepHOL-Zero that does not use any of the human proofs and show that our improved exploration method manages to expand the training set continuously. DeepHOL-Zero outperforms the best theorem prover trained by imitation learning alone.

【43】 Finding new routes for integrating Multi-Agent Systems using Apache Camel

标题：使用Apache Camel查找集成多代理系统的新路由

作者： Cleber Jorge Amaral, Maicon Rafael Zatelli

链接：https://arxiv.org/abs/1905.10490

摘要： In Multi-Agent Systems (MAS) there are two main models of interaction: among agents, and between agents and the environment. Although there are studies considering these models, there is no practical tool to afford the interaction with external entities with both models. This paper presents a proposal for such a tool based on the Apache Camel framework by designing two new components, namely camel-jason and camel-artifact. By means of these components, an external entity is modelled according to its nature, i.e., whether it is autonomous or non-autonomous, interacting with the MAS respectively as an agent or an artifact. It models coherently external entities whereas Camel provides interoperability with several communication protocols.

【44】 Human vs. Muppet: A Conservative Estimate of Human Performance on the GLUE Benchmark

标题：Human vs. Muppet：GLUE基准上对人类表现的保守估计

作者： Nikita Nangia, Samuel R. Bowman

链接：https://arxiv.org/abs/1905.10425

摘要： The GLUE benchmark (Wang et al., 2019b) is a suite of language understanding tasks which has seen dramatic progress in the past year, with average performance moving from 70.0 at launch to 83.9, state of the art at the time of writing (May 24, 2019). Here, we measure human performance on the benchmark, in order to learn whether significant headroom remains for further progress. We provide a conservative estimate of human performance on the benchmark through crowdsourcing: Our annotators are non-experts who must learn each task from a brief set of instructions and 20 examples. In spite of limited training, these annotators robustly outperform the state of the art on six of the nine GLUE tasks and achieve an average score of 87.1. Given the fast pace of progress, however, the headroom we observe is quite limited. To reproduce the data-poor setting that our annotators must learn in, we also train the BERT model (Devlin et al., 2019) in limited-data regimes, and conclude that low-resource sentence classification remains a challenge for modern neural network approaches to text understanding.

【45】 Differentiable Representations For Multihop Inference Rules

标题：多跳推理规则的可区分表示

作者： William W. Cohen, Matthew Siegler

链接：https://arxiv.org/abs/1905.10417

摘要： We present efficient differentiable implementations of second-order multi-hop reasoning using a large symbolic knowledge base (KB). We introduce a new operation which can be used to compositionally construct second-order multi-hop templates in a neural model, and evaluate a number of alternative implementations, with different time and memory trade-offs. These techniques scale to KBs with millions of entities and tens of millions of triples, and lead to simple models with competitive performance on several learning tasks requiring multi-hop reasoning.

【46】 Using Deep Networks and Transfer Learning to Address Disinformation

标题：使用深度网络和转移学习来解决信息

作者： Numa Dhamani, Jonathon Morgan

备注：AI for Social Good Workshop at the International Conference on Machine Learning, Long Beach, United States (2019)

链接：https://arxiv.org/abs/1905.10412

摘要： We apply an ensemble pipeline composed of a character-level convolutional neural network (CNN) and a long short-term memory (LSTM) as a general tool for addressing a range of disinformation problems. We also demonstrate the ability to use this architecture to transfer knowledge from labeled data in one domain to related (supervised and unsupervised) tasks. Character-level neural networks and transfer learning are particularly valuable tools in the disinformation space because of the messy nature of social media, lack of labeled data, and the multi-channel tactics of influence campaigns. We demonstrate their effectiveness in several tasks relevant for detecting disinformation: spam emails, review bombing, political sentiment, and conversation clustering.

【47】 InfoRL: Interpretable Reinforcement Learning using Information Maximization

标题：InfoRL：使用信息最大化进行可解释的强化学习

作者： Aadil Hayat, Vinay P. Namboodiri

链接：https://arxiv.org/abs/1905.10404

摘要： Recent advances in reinforcement learning have proved that given an environment we can learn to perform a task in that environment if we have access to some form of a reward function (dense, sparse or derived from IRL). But most of the algorithms focus on learning a single best policy to perform a given set of tasks. In this paper, we focus on an algorithm that learns not just to perform a task but different ways to perform the same task. As we know, when the environment is complex enough there always exist multiple ways to perform a task. We show that using the concept of information maximization it is possible to learn latent codes for discovering multiple ways to perform any given task in an environment.

翻译：谷歌翻译
