Past talks

Miron Livny
University of Wisconsin-Madison

CHTC – Translational Computer Science at Work

The UW-Madison Center for High Throughput Computing (CHTC) is the home of the HTCondor Software Suite (HTCSS). Located in the Computer Sciences department, the center was established more than 15 years ago on the foundation of a research methodology that brings together innovation in distributed computing and services to scientists. Evaluation of new technologies under real-life conditions by engaged users has advanced scientific discovery and guided the center's subsequent research and development activities.

The HTCSS has served for almost four decades as a means to place new capabilities in the hands of researchers who can benefit from advances in Throughput Computing. Having a capable and dependable suite of software tools has enabled the adoption of advanced Throughput Computing methodologies by an international community of users. Abramson and Parashar, in their recent formalization of Translational Research in Computer Science (TCS), identified such an engaged community as one of the three pillars of the TCS workflow. These diverse adopters also provide the second pillar of the TCS workflow, the locale: the place where the new technology is deployed and evaluated.

The talk will present the different elements that facilitate the translational work of the CHTC. These include a campus-wide and a national research computing environment and a sustained sequence of software releases. The challenges of sustaining such a center in an academic institution will be discussed, and the opportunities for innovation triggered by ever-evolving research computing environments at all scales will be reviewed.

Bio:

Miron Livny received a B.Sc. degree in Physics and Mathematics in 1975 from the Hebrew University and M.Sc. and Ph.D. degrees in Computer Science from the Weizmann Institute of Science in 1978 and 1984, respectively. Since 1983 he has been on the Computer Sciences Department faculty at the University of Wisconsin-Madison, where he is currently the John P. Morgridge Professor of Computer Science. He serves as the director of the Center for High Throughput Computing (CHTC), leads the HTCondor Software Suite effort, and serves as the technical director of the OSG. He is a member of the scientific leadership team of the Morgridge Institute for Research, where he leads the Research Computing theme.

Dr. Livny’s research focuses on distributed processing and data management systems and involves close collaboration with researchers from a wide spectrum of disciplines. He pioneered the area of High Throughput Computing (HTC) and developed frameworks and software tools that have been widely adopted by academic and commercial organizations around the world.

Livny is the recipient of the 2006 ACM SIGMOD Test of Time Award, the 2013 HPDC Achievement Award, and the 2020 IEEE TCDP Outstanding Technical Achievement Award.

Wednesday, 24.05.23, 11:30
Taub 401
Professor Jason Cong
UCLA
Can We Automate Integrated Circuit Design?
As we near the end of Moore's Law scaling, there is a great need to design various kinds of customized accelerators for much higher performance and energy efficiency. But integrated circuit (IC) design remains a black art to many. Given the success of deep learning in many domains in recent years, a natural question is whether it is possible to fully automate IC design. In this talk, we present our latest research on this topic. By coupling high-level synthesis with a set of deep learning techniques, such as graph-based neural networks and meta-learning, as well as microarchitecture-guided optimization for regular structures, such as systolic arrays and stencil computation, we show that it is possible to automate IC design so that most software programmers can design their own customized ICs for a wide range of applications.
Bio:
Jason Cong is the Volgenau Chair for Engineering Excellence Professor at the UCLA Computer Science Department (and a former department chair), with a joint appointment in the Electrical and Computer Engineering Department. He is the director of the Center for Domain-Specific Computing (CDSC) and the director of the VLSI Architecture, Synthesis, and Technology (VAST) Laboratory. Dr. Cong's research interests include novel architectures and compilation for customizable computing, synthesis of VLSI circuits and systems, and quantum computing. He has over 500 publications in these areas, including 16 best paper awards and 4 papers in the FPGA and Reconfigurable Computing Hall of Fame. He and his former students co-founded AutoESL, which developed the most widely used high-level synthesis tool for FPGAs (renamed Vivado HLS after Xilinx's acquisition). He was elected an IEEE Fellow in 2000 and an ACM Fellow in 2008, and was elected to the National Academy of Engineering in 2017 and the National Academy of Inventors in 2020. He is the recipient of the 2022 IEEE Robert Noyce Medal for fundamental contributions to electronic design automation and FPGA design methods.
10/05/2023 at 11:30, Meyer 861

Yoav Levi

Quantized Consistent Hashing
Consistent hashing (CH) is a crucial building block for load-balancers. It enables packets of the same flow to keep being mapped to the same server whenever possible. However, both the literature and leading load-balancers rely on heuristic algorithms to implement heterogeneous CH, without any ability to provide stability guarantees. This thesis presents a new framework for heterogeneous CH. The framework relies on quantization using virtual servers, and on a min-max fairness-based mapping algorithm. The main result of this thesis is the establishment of necessary and sufficient conditions for guaranteeing stability for arbitrary heterogeneous server service rates. The thesis further demonstrates how limiting the set of allowed service rates enables attaining the maximal stability region for heterogeneous CH. Practical evaluations show that existing solutions suffer from significant stability issues, while the offered solution can leverage its stability guarantees. Throughout the thesis, we show that the offered solution has better scalability properties, including a faster key lookup rate and a lower memory footprint.
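
For intuition, here is a minimal Python sketch of the quantization idea: each server is assigned a number of virtual servers proportional to its service rate, so a hash ring maps flows to servers in proportion to capacity. This illustrates only the virtual-server device; the names are hypothetical, and the thesis's min-max fairness-based mapping algorithm and stability analysis are not reproduced here.

```python
import hashlib
from bisect import bisect_right

class QuantizedCH:
    """Consistent hashing with virtual servers (illustrative sketch only)."""

    def __init__(self, rates, quantum):
        # rates: {server: service_rate}; each server gets rate/quantum
        # virtual servers, so its share of keys tracks its service rate.
        self.ring = sorted(
            (self._h(f"{srv}#{i}"), srv)
            for srv, rate in rates.items()
            for i in range(max(1, round(rate / quantum)))
        )
        self.points = [p for p, _ in self.ring]

    @staticmethod
    def _h(s):
        return int.from_bytes(hashlib.sha1(s.encode()).digest()[:8], "big")

    def lookup(self, flow_key):
        # First virtual server clockwise from the flow's hash point.
        i = bisect_right(self.points, self._h(flow_key)) % len(self.ring)
        return self.ring[i][1]

ch = QuantizedCH({"a": 1.0, "b": 2.0, "c": 1.0}, quantum=0.25)
print(ch.lookup("flow-1234"))  # the same flow always maps to the same server
```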

Yoav is an MSc student under the supervision of Prof. Isaac Keslassy.

Wednesday, 29.03.23, 11:30
Meyer 861

Arad Kotzer

Braess Paradox in Layer-2 Blockchain Payment Networks
Layer-2 is a popular approach to dealing with the scalability limitations of blockchain networks. It allows users to execute transactions without committing them to the blockchain by relying on predefined payment channels. Users, together with the payment channels, form a graph known as the off-chain network topology. Transactions between pairs of users without a connecting channel are also supported through a path of multiple channels. Serving such transactions involves fees paid to intermediate users. In this work, we uncover the potential existence of the Braess paradox in payment networks: sometimes, establishing a new payment channel can increase the fees paid for serving some fixed transactions. We study conditions for the paradox to appear and provide indications of its appearance based on real data from Bitcoin's Lightning, a popular layer-2 network. Last, we discuss methods to mitigate the paradox upon establishing a new payment channel.
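
To see how such a paradox can arise, consider the following toy instance (hypothetical, and under simplifying assumptions the paper does not necessarily make: unit amounts and greedy cheapest-path routing in a fixed order). Before a channel A-B exists, T1 pays a fee of 2 via intermediary D and T2 uses its own channel for free; once A-B is established, T1 grabs the cheaper path through B, exhausting the capacity of channel B-C and pushing T2 to a fee-3 detour, so the total fees grow from 2 to 4:

```python
def route(transactions, paths, capacity):
    """Greedily route each transaction on its cheapest feasible path.

    paths[t] lists (total_fee, channels) options for transaction t;
    capacity maps each channel to its remaining units (unit amounts)."""
    total_fees = 0
    for t in transactions:
        for fee, chans in sorted(paths[t]):
            if all(capacity[c] >= 1 for c in chans):
                for c in chans:
                    capacity[c] -= 1
                total_fees += fee
                break
    return total_fees

# Fixed transactions: T1: A->C, then T2: B->C. Intermediary fees:
# B charges 1, D charges 2, E charges 3 for forwarding one unit.
base = {("B", "C"): 1, ("A", "D"): 9, ("D", "C"): 9,
        ("B", "E"): 9, ("E", "C"): 9}          # channel -> capacity

paths = {"T1": [(2, [("A", "D"), ("D", "C")])],            # via D
         "T2": [(0, [("B", "C")]),                         # own channel
                (3, [("B", "E"), ("E", "C")])]}            # via E
print(route(["T1", "T2"], paths, dict(base)))  # total fees: 2

base[("A", "B")] = 1                           # establish the new channel
paths["T1"].append((1, [("A", "B"), ("B", "C")]))          # via B, fee 1
print(route(["T1", "T2"], paths, dict(base)))  # total fees: 4 (paradox)
```
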
Arad is an MSc student supervised by Prof. Ori Rottenstreich.
Wednesday, 22.03.23, 11:30

Taub 401

Shir Cohen
Distributed Services Under Attack

For my PhD thesis seminar, I will be presenting two of my works related to the security and reliability of distributed services in the face of Byzantine attacks. In the first work “Not a COINcidence: Sub-Quadratic Asynchronous Byzantine Agreement WHP” (DISC’20), I present a solution for binary Byzantine Agreement (BA) in asynchronous systems, using a shared coin algorithm based on a VRF and VRF-based committee sampling. My algorithms work against a delayed-adaptive adversary with a word complexity of Õ(n) and O(1) expected time, breaking the O(n²) bit barrier for asynchronous Byzantine Agreement. This work is then used to solve the multivalued version of the BA problem.

In the second work “Tame the Wild with Byzantine Linearizability: Reliable Broadcast, Snapshots, and Asset Transfer” (DISC’21), I introduce the concept of Byzantine linearizability and study Byzantine-tolerant emulations of various objects from registers, including reliable broadcast, atomic snapshot, and asset transfer. This work proves that there is an f-resilient implementation of such objects from registers with n processes for f<n/2. These works, along with my other papers, have broad implications for the security and reliability of distributed systems in the face of adversarial attacks.


Shir is a PhD student supervised by Prof. Idit Keidar.

Wednesday, 15.03.23, 11:30
Taub 601

Henriette Padberg

RWTH Aachen, Germany

Experimental Demonstration of Non-Stateful Logic with Silicon-Oxide-based VCM Cells

For processing-in-memory, the successful operation of logic functions is crucial. For compatibility with CMOS logic, non-stateful logic and Scouting logic with a 1T1R array, the configuration most commonly used in industry, are promising techniques. This work deals with the experimental characterization of the SiOx-based VCM cells used, as well as the realization of the two logic functions. The methods, experimental results, and challenges are presented.

Bio:

Henriette is an M.Sc. student in materials science at RWTH Aachen, Germany, with a focus on electronic materials and nanotechnology. She completed her Bachelor's thesis on stateful logic with ZrOx-based VCM cells in Prof. Rainer Waser's group. Since October 2022 she has been a research intern at the Technion, hosted by Prof. Shahar Kvatinsky.

Wednesday, 08.03.23, 11:30
Meyer 861

Oded Naor

Scaling Byzantine Blockchains and Consensus Protocols

In this talk, I will present a line of different works, all aimed at improving the performance of blockchains and Byzantine consensus protocols.

The first part of the talk will present the Cogsworth (CES '20) and NK'20 (DISC '20) view synchronization protocols. View synchronization is the task of bringing all validators participating in a consensus protocol to the same view for sufficiently long to drive progress in the protocol. It is required by various consensus protocols, such as HotStuff, and is the performance bottleneck in improving their overall message complexity.
Cogsworth makes it possible, for the first time, to solve multi-shot consensus with benign faults at expected linear message complexity. NK'20 achieves the same linear performance even in runs with Byzantine failures. Combining NK'20 and HotStuff allows, for the first time, solving consensus with expected linear message complexity in the eventual synchrony model.

In the second part of the talk, I will present DAG-Rider (PODC '21) and Cordial Miners. DAG-Rider is a blockchain designed for asynchronous networks and is based on validators jointly building a distributed DAG structure. It has optimal amortized linear message complexity and expected constant latency, and is post-quantum safe. DAG-Rider is designed to scale blockchains with many transactions and to increase the throughput and the work balance among the validators.
Cordial Miners is a family of consensus protocols for asynchrony and eventual synchrony. It reduces the latency of comparable DAG-based protocols (including DAG-Rider) by almost half. It does so by forgoing the use of reliable broadcast as a building block and using an alternative data structure we name the "blocklace". Reliable broadcast eliminates equivocation by Byzantine validators in the DAG but has rather high latency. Not using reliable broadcast reduces the latency but requires handling equivocations in the blocklace itself.

Oded is a PhD student supervised by Prof. Idit Keidar.

Wednesday, 22.02.23, 11:30
Meyer 861

Roei Reshef
Verification of Neural Networks’ Privacy

Many verifiers can prove the robustness of neural networks to adversarial or poisoning attacks, as well as fairness safety. However, no verifier proves robustness to privacy attacks. We propose a new safety property for privacy attacks, inspired by differential privacy. Our property is satisfied by networks that satisfy at least r⋅100% of a given set of properties P, regardless of whether they were trained with the full dataset or with any single entry omitted. An optimal algorithm is highly impractical because it involves training a very large number of networks and verifying the properties in P separately for every network. We propose Sphynx, an algorithm that computes an abstraction of all these networks, with high probability, from a small set of trained networks, and verifies the properties directly on the abstract network. The challenge is twofold: network parameters do not adhere to a known probability distribution, making it difficult to predict an abstraction, and predicting too large an abstraction harms verification. Our key idea is to transform the parameters into a distribution given by KDE, allowing the over-approximation error to be kept small. To verify properties on the abstract network, we extend a MILP verifier to reason about abstract networks. Our verifier replaces equality constraints with inequalities, leading to over-approximation. To mitigate it, we propose two approaches. Our evaluation shows that Sphynx (1) computes abstract networks obtaining 97.8% coverage by training only 1% of the networks, (2) verifies 63% of the properties verified by the optimal algorithm, and (3) reduces the analysis time from 2.5 days to 20 minutes.
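
As a rough illustration of the abstraction idea (not Sphynx's actual algorithm: the KDE transformation and the MILP encoding are omitted, and all names here are hypothetical), one can join a few trained networks into per-parameter intervals and then propagate intervals through a layer with interval arithmetic:

```python
import numpy as np

def abstract_parameters(weight_sets, eps=0.0):
    """Join a few sampled networks into per-parameter [lo, hi] intervals.

    Sphynx predicts an abstraction covering unseen networks with high
    probability; the plain min/max over samples shown here is the naive
    variant that its KDE-based transformation improves upon."""
    w = np.stack(weight_sets)
    return w.min(axis=0) - eps, w.max(axis=0) + eps

def interval_matvec(W_lo, W_hi, x_lo, x_hi):
    """Sound output bounds for W @ x when both have interval entries.

    Each scalar product attains its extremes at a corner of the two
    intervals, so we take elementwise corner products and sum them."""
    c = [W_lo * x_lo, W_lo * x_hi, W_hi * x_lo, W_hi * x_hi]
    lo = np.minimum.reduce(c).sum(axis=1)
    hi = np.maximum.reduce(c).sum(axis=1)
    return lo, hi

# Three "trained" 2x3 weight matrices stand in for the sampled networks.
nets = [np.random.randn(2, 3) for _ in range(3)]
W_lo, W_hi = abstract_parameters(nets)
lo, hi = interval_matvec(W_lo, W_hi, np.zeros(3), np.ones(3))
print(lo, hi)  # bounds valid for every network inside the abstraction
```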

Roei is an MSc student under the supervision of Dr. Dana Drachsler-Cohen.

Wednesday, 15.02.23, 11:30
Meyer 861
Dr. Nadav Amit
VMware Research

Reexamining the Foundations: Improving Basic OS Memory Management Techniques

Despite significant advancements in operating system memory management, our understanding of the desired behavior of fundamental techniques introduced decades ago is sometimes incomplete or not well-defined. This can result in correctness issues that might cause the system to crash or be compromised, as well as missed opportunities for optimizations. In this talk, I will present two specific examples of this: (1) the inefficiencies in synchronizing the memory view across different CPU cores, and (2) the undefined behavior of the interactions between two common memory management mechanisms, copy-on-write and pinning.

Bio:
Nadav Amit is a senior researcher at VMware Research. He received his PhD in 2014 from the Technion, Israel Institute of Technology, for his work on alleviating virtualization bottlenecks. He is a recipient of the SPEC Distinguished Dissertation Award and an IBM Fellowship Award, and received an honorable mention for the Dennis M. Ritchie Doctoral Dissertation Award. His current research revolves around operating systems and virtualization and focuses on memory management.

Wednesday, 08.02.23, 11:30
Meyer 861

Matan Yechieli

Low-Latency Blockchains with DAG Holography

Classical Proof-of-Work blockchains like Bitcoin implement a decentralized ledger in which anyone can participate. They aggregate transactions from system users into blocks and decide each block's position in the ledger. They require the block at each position to accrue votes until the probability of a decision change, due to chance or malice, is negligible. To allow consumer usage of such systems, low latency on the order of seconds is necessary, whereas in classical blockchain systems latency is on the order of hours. Recent protocols use parallel voting and reach low latency, but require a prohibitively high overhead bandwidth.

We present Holograph, a blockchain protocol that achieves a latency of seconds under practical bandwidth limitations. To achieve this we introduce a novel technique called DAG (Directed Acyclic Graph) Holography. Holograph participants form a single physical block DAG. Each physical block manifests as multiple virtual blocks in multiple virtual DAGs that serve as parallel voting structures. By viewing the same physical DAG as many virtual ones, like a holographic image viewed from different angles, we obtain more votes with the same bandwidth.

We analyze Holograph's latency as a function of the overhead bandwidth limit and compare it against prior art. To the best of our knowledge, this is the first such analysis of blockchain protocols. Our simulation shows that Holograph reaches the lowest latency across the tested bandwidth range, improving latency by about 5x compared to the state-of-the-art when bandwidth is limited. We run a prototype implementation in an emulated network, reaching a latency of 7 seconds with 10 kbps overhead bandwidth, which is 45% lower than the state-of-the-art at an order-of-magnitude lower, practical overhead throughput.

Matan is an MSc student supervised by Prof. Ittay Eyal.

Wednesday, 01.02.23, 11:30
Taub 401

Mohammad Nassar
CFTO: Communication-aware Fairness in Blockchain Transaction Ordering
Blockchain leader-based protocols elect leaders for proposing the next block of transactions. Proposed blocks need to pass a validation routine in order to be added to the blockchain. Proposers may prioritize certain transactions based on their fees or accounts. A fair block selection follows a random selection of pending transactions among those that a proposer is aware of. Validators may have only partial knowledge of the network's transactions, making it challenging to validate the random selection. We propose CFTO, a protocol that aims to encourage fair block selection in a leader-based blockchain network. CFTO offers two main contributions. First, it provides incentives for acting honestly, diminishing the influence of malicious and dishonest nodes. To accomplish this, we use a reputation system, whereby each node is given a reputation score based on its actions (i.e., its latest proposals and evaluations). Second, it consists of an algorithm that evaluates the proposed blocks based on the zone structure of the network. Furthermore, we adapt the evaluation algorithm to fit the additional order constraints implied by Ethereum transaction ordering. We demonstrate the improved accuracy of CFTO in detecting fair blocks through experiments based on an Ethereum network, comparing it with Helix, a previously proposed consensus algorithm for fair block selection. As part of our experiments, we also compare certain features with those of previous studies.
Mohammad is an MSc student supervised by Prof. Ariel Orda and Prof. Ori Rottenstreich.
Monday, 16.01.23, 13:30
Meyer 861

Ben Perach

Bulk-Bitwise Processing-In-Memory Architecture using Emerging Memory Technologies

In a modern computer system, the discrepancy between the processing unit performance and the main-memory performance, known as the “memory wall,” is a source of significant inefficiency. A high percentage of the system run time and energy is spent on data transactions between the memory and the CPU rather than computation. Different attempts to move the computation physically closer to the memory have been proposed to alleviate this problem. This approach is called processing-in-memory (PIM).

In this seminar, we are interested in a PIM technique named bulk-bitwise PIM, where the memory arrays are processing units operating on the data they hold, thus substantially reducing the data movement. This kind of PIM can perform bit logic operations (e.g., NOR) between memory cell values and write the result directly to a memory cell. Furthermore, due to the regular structure of the memory array, these logic bit operations can be performed in parallel on many rows or columns, making a memory array a vector bitwise processing unit.

This topic has previously been researched mainly at the memory technology level or as a processing unit in an accelerator architecture (i.e., a single chip with a single function, such as matrix multiplication or DNN inference). In this seminar, we show our investigation of a bulk-bitwise PIM architecture as a computing system's main memory, finding the strengths and weaknesses of such architectures, what kinds of operations they are suitable for, and how they should be used. To achieve this goal, the research suggests solutions for various architectural subjects, such as programming models, virtual memory support, coherency, and consistency. The research focuses on emerging memory technologies and suggests a micro-architecture for the PIM memory module. The work also takes a test-case application, relational database analytical processing, to investigate how bulk-bitwise PIM can be used.
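
The core primitive is easy to picture in software. The toy simulation below (hypothetical, not the seminar's micro-architecture) treats a memory array as rows of bits and applies a NOR between two rows across all columns at once; since NOR is functionally complete, such row operations compose into arbitrary logic:

```python
import numpy as np

# Rows are memory words; one in-array operation NORs two source rows into a
# destination row, across all 16 bit positions in parallel.
mem = np.random.randint(0, 2, size=(4, 16), dtype=np.uint8)

def pim_nor(mem, src1, src2, dst):
    """Bulk-bitwise NOR: one 'array cycle' computes 16 gates at once."""
    mem[dst] = 1 - (mem[src1] | mem[src2])

pim_nor(mem, 0, 1, 2)                       # row2 = NOR(row0, row1)
pim_nor(mem, 2, 2, 3)                       # NOR(x, x) = NOT x
assert (mem[3] == (mem[0] | mem[1])).all()  # NOT(NOR(a, b)) = OR(a, b)
```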

Ben is a PhD student supervised by Prof. Shahar Kvatinsky.
Wednesday, 11.01.23, 11:30
Meyer 861

Igor De-Paula
NeuroLPM – Scaling Longest Prefix Match Hardware with Neural Networks
Longest Prefix Match engines (LPM) are broadly used in computer systems and especially in modern network devices such as Network Interface Cards, switches and routers. However, existing LPM hardware fails to scale to millions of rules required by modern applications, is often optimized for specific applications, and thus is performance-sensitive to the structure of LPM rules. 
We describe NeuroLPM, a new architecture for multi-purpose LPM hardware that replaces lookups in traditional memory-intensive trie- and hash-table data structures with inference in a lightweight neural-network-based model, called RQRMI. NeuroLPM supports scaling to millions of rules under a small on-die SRAM budget and achieves stable, input-agnostic performance, allowing its use in various applications. We solve several unique challenges in implementing RQRMI inference in hardware, including the elimination of floating-point computations while maintaining query correctness, and scaling to larger inputs while ensuring small, deterministic off-chip memory bandwidth.
We prototype NeuroLPM in SystemVerilog and evaluate it on real-world packet forwarding rule-sets and network traces. We demonstrate that NeuroLPM offers substantial scalability benefits. For example, it efficiently serves a 950K-rule ruleset with only 2MB of SRAM, and reduces the DRAM bandwidth per query, the dominant performance factor, by up to 9× and 3× compared to the state-of-the-art Tree Bitmap and SAIL, even though it is general and lacks the network-forwarding-specific optimizations that they have.
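
For reference, the classical baseline that NeuroLPM replaces is a trie such as the one sketched below: a lookup walks the address bits, remembering the longest prefix whose rule it has passed. The pointer-chasing per bit is what makes tries memory-intensive at millions of rules (illustrative sketch; names are hypothetical).

```python
class TrieLPM:
    """Minimal binary trie for longest-prefix match, the memory-intensive
    baseline that NeuroLPM replaces with neural-network inference."""
    def __init__(self):
        self.root = {}

    def insert(self, prefix_bits, rule):
        node = self.root
        for b in prefix_bits:
            node = node.setdefault(b, {})
        node["rule"] = rule

    def lookup(self, addr_bits):
        node, best = self.root, None
        for b in addr_bits:
            if "rule" in node:
                best = node["rule"]      # remember the longest match so far
            node = node.get(b)
            if node is None:
                break
        else:
            best = node.get("rule", best)
        return best

t = TrieLPM()
t.insert("10", "rule A")       # prefix 10/2
t.insert("1011", "rule B")     # prefix 1011/4
print(t.lookup("10110001"))    # rule B: the longest matching prefix wins
```
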
Igor is an MSc student supervised by Prof. Mark Silberstein.
Wednesday, 04.01.23, 11:30
Meyer 861

Ben Nassi
BGU

Finding Darkness in the Light: Recovering Speech and Cryptographic Keys from Light Emitted from Power LEDs and Light Bulbs

In this talk, I will present a journey that started three years ago at the intersection of light leakage and information confidentiality. In the first part of the talk, I will present the topic of electro-optical speech eavesdropping, which is based on three methods we developed to recover speech from the light emitted by light bulbs (Lamphone, USENIX Security 22), from the power LEDs of speakers (Glowworm Attack, CCS 21), and from light reflected from shiny ornaments and objects (The Little Seal Bug, BH Asia 22). We will discuss the threat model and its significance with respect to related works, and hear speech recoveries from light measurements (obtained by a photodiode) 25-35 meters away.

In the second part of the talk, I will present visual cryptanalysis, a new method to recover secret keys of three different cryptosystems (RSA, ECDSA, and SIKE) by obtaining optical traces from the power LEDs of various devices (Galaxy S8, card reader, Raspberry Pi) that run common cryptographic libraries (GnuPG, Libgcrypt, and PQCrypto-SIDH), using a photodiode located 25 meters away. We will discuss the new threat model and its significance with respect to related works, and understand the origin of the vulnerability.

At the end of the talk, I will present an ECDSA key recovery of a smart card using a video taken by a security camera placed 2.5 meters from the power LED of a card reader by exploiting the rolling shutter of video cameras.

Bio: Dr. Ben Nassi is a postdoctoral researcher in the Software and Information Systems Engineering department at the Ben-Gurion University of the Negev and soon to be a postdoctoral researcher at Cornell Tech. He is interested in building robust systems and investigating the security and privacy of cyber-physical systems (drones, semi-autonomous cars) and hardware/devices (microcontrollers, smartphones, smart cards) in the topics of side-channel attacks, AI robustness, and applied cryptography, using signal processing and machine learning techniques. His research has been presented at top academic conferences (S&P, CCS, USENIX Security, UbiComp), published in journals (IEEE TIFS, MDPI Sensors) and magazines (IEEE Computer, Communications of the ACM), and covered by international media (Wired, Ars Technica, Motherboard, Forbes, The Washington Post, Bloomberg, Business Insider). Ben has spoken at prestigious industrial conferences (Black Hat Asia and USA, RSAC USA, AI Week, CodeBlue, SecTor, and CyberTech) and serves as a PC member in ACM CCS (22 and 23) and BlackHat (22 and 23). His research earned him the BGU Dean's Award for excellence in Ph.D. studies and two nominations for the Pwnie Award.

Wednesday, 28.12.22, 11:30

Meyer 861

Ramy Fakhoury

Nova: safe off-heap memory allocation and reclamation
In recent years, we have begun to see Java-based systems embrace off-heap allocation for their big data demands. As of today, these systems rely on simple ad-hoc garbage-collection solutions, which restrict the usage of off-heap data. This work introduces the abstraction of safe off-heap memory allocation and reclamation (SOMAR), a thread-safe memory allocation and reclamation scheme for off-heap data in managed environments. SOMAR allows multi-threaded Java programs to use off-heap memory seamlessly. To realize this abstraction, we present Nova, a Novel Off-heap Versioned Allocator, a lock-free SOMAR implementation. Our experiments show that Nova can be used to store off-heap data in Java data structures with better performance than ones managed by Java's automatic GC. We further integrate Nova into the open-source Oak concurrent map library, allowing it to reclaim keys while the data structure is being accessed.
The talk will be given in Hebrew.
Ramy is an MSc student supervised by Prof. Idit Keidar.
Wednesday, 14.12.22, 11:30
Meyer 861

George Varghese, UCLA

Formal Methods for a Robust Network Ecosystem

Network verification, applying formal methods to verify properties of router configurations, is already mainstream, with startups like Forward and Veriflow Networks and divisions in established companies such as Amazon's ARG and Cisco's Candid. In this talk, I will survey what remains to be done, including extending formal methods to implementations and to other parts of the network ecosystem besides routing. I will illustrate these points with recent work we have done on improving the robustness of the Domain Name Service (DNS). Notable errors in DNS have rendered popular services such as GitHub, Twitter, and HBO inaccessible for extended periods.

First, I will present GRoot (SIGCOMM 2020, Best Student Paper), a new verification tool that performs exhaustive and proactive static analysis of DNS configuration files (zone files), based on an efficient algorithm for determining the equivalence classes of all possible queries, to guarantee key correctness properties. Next, I will describe a new technique, SCALE (NSDI 2022), for finding RFC compliance errors in DNS nameserver implementations via symbolic execution of a DNS formal model to jointly generate test queries and zone files. Using SCALE, we identified 30 new bugs in 8 popular open-source DNS implementations such as BIND, PowerDNS, KNOT, and NSD, including 3 previously unknown critical security vulnerabilities.

This talk is based on Siva Kakarla's Ph.D. work at UCLA and is joint work with Ryan Beckett and Behnaz Arzani at MSR, and Todd Millstein at UCLA.
Biography: George Varghese is the Jonathan B. Postel Professor of Networking in the Computer Science Department at UCLA. He was elected to the National Academy of Engineering in 2017, the National Academy of Inventors in 2020, the Internet Hall of Fame in 2021, and the American Academy of Arts and Sciences in 2022.
Monday, December 12th, 14:30

Meyer 1003
YouTube link

Yoav Etsion

Accelerating Big Data Analytics with the Speedata Analytics Processing Unit (APU)

At a time when Moore's Law is reaching its end, the volume of data created, curated, and consumed worldwide grows exponentially. As a result, general-purpose processors are struggling to keep up with the throughput demand of big data analytics workloads.

In this talk I will present Speedata and the Analytics Processing Unit (APU), the first hardware accelerator for big data analytics. Speedata's APU accelerates existing software frameworks (e.g., Apache Spark, Apache Presto) and improves performance/cost by orders of magnitude while maintaining users' existing code base.
I will present how the APU's programmable architecture supports high-bandwidth processing of complex analytics queries. And after the unavoidable technical part, I will try to share some of my insights on the crazy journey of starting a semiconductor company.
Wednesday, 07.12.22, 11:30
Meyer 861

Issa Salameh

Superconductive Logic Circuits With Half Flux Quantum Pulses

Conventional CMOS technology has experienced exponential growth over the past several decades. The scaling of the transistor has enabled improvements in performance while maintaining low energy dissipation. However, transistor scaling is reaching a fundamental physical limitation, encouraging post-CMOS technologies to replace CMOS in certain areas and applications. Single flux quantum (SFQ) logic based on Josephson junctions (JJs) is about five orders of magnitude more energy efficient than CMOS logic while operating at tens to hundreds of GHz. This technology is still far behind CMOS in terms of integration density and complexity, and one major factor limiting its integration density is the limited scalability of inductors.

In this seminar, two works are presented. The first work introduces a novel device with a data representation of half flux quantum instead of single flux quantum, which leads to completely inductor-less logic gates. We design two functionally complete logic gates using this data representation and demonstrate that they can be faster and more energy efficient than conventional rapid single flux quantum (RSFQ) circuits, as well as more scalable.

In the second work, we present a design methodology for a logic technique that contains the novel JJ, and a mapping algorithm that automatically converts an RSFQ device netlist into a device netlist containing the novel JJ presented in the first work, with completely inductor-less logic gates.

Issa is an MSc student supervised by Prof. Shahar Kvatinsky.
Tuesday, 06.12.22, 10:30
Meyer 861

Raissa Nataf

Null Messages, Information and Coordination.
This work investigates the transfer of information in fault-prone synchronous systems using null messages. The notion of an f-resilient message block is defined to capture the fundamental communication pattern for knowledge transfer. This pattern may involve null messages in addition to explicit messages; hence, it provides a fault-tolerant extension of the classic notion of a message chain. Based on the above, we provide tight necessary and sufficient characterizations of the generalized communication patterns, including actual messages and null messages, that can serve to solve the distributed tasks of (nice-run) Signaling and Ordered Response.
Raissa is an MSc student supervised by Prof. Yoram Moses.
Wednesday, 23.11.22, 11:30
Taub 401

Jongeun Lee

Ulsan National Institute of Science and Technology (UNIST), Ulsan, Korea.

Challenges with Analog In-Memory Computing Arrays for Efficient Deep Learning Acceleration

ReRAM (Resistive Random-Access Memory) crossbar arrays have the potential to provide extremely efficient matrix-vector multiplication (MVM) operations, which are the cornerstone of many DNN (Deep Neural Network) applications. However, several challenges must be addressed for ReRAM crossbar arrays (RCAs) to be useful for accelerating large-scale DNN applications. In this talk we discuss two of those challenges. The first is the distortion in the MVM output of RCAs due to nonidealities such as wire resistance (also known as IR drop) and I-V nonlinearity (i.e., voltage-dependent conductance). While it may be very difficult to completely eliminate RCA nonidealities, a fast method for system architects to accurately predict the output of RCAs under nonidealities would be highly desirable. We discuss recent methods in this direction and possible use cases. The second challenge is the high peripheral circuit overhead of RCAs. In particular, ADCs (Analog-to-Digital Converters) can account for the lion's share of both the area and the power dissipation of RCAs. I will present a quantization-based method which can reduce the ADC overhead of ReRAM crossbar arrays significantly (32x compared with ISAAC for ResNet on the ImageNet dataset) at minimal accuracy loss (0.24% compared with the same).
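
In the ideal (nonideality-free) case, the crossbar's physics directly implements the MVM: each cell's conductance multiplies its row voltage (Ohm's law) and the column wire sums the currents (Kirchhoff's current law). A minimal numeric sketch of that ideal model:

```python
import numpy as np

# Conductance matrix G: G[i, j] is the conductance of the cell at
# row i, column j (the stored matrix entry, in arbitrary units).
G = np.array([[1.0, 0.5],
              [0.2, 0.8],
              [0.6, 0.1]])
V = np.array([0.3, 1.0, 0.5])   # input vector applied as row voltages

# Column j's output current sums G[i, j] * V[i] over rows: I = G^T V.
I_ideal = G.T @ V
print(I_ideal)                  # what nonidealities like IR drop perturb
```
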
Bio:
Jongeun Lee received his B.Sc. and M.Sc. degrees in electrical engineering, and his Ph.D. in electrical engineering and computer science, all from Seoul National University, Korea. In 2009, he joined Ulsan National Institute of Science and Technology (UNIST), Ulsan, Korea, where he is now a Professor at the Department of Electrical Engineering. Prior to joining UNIST, he worked as a postdoctoral research associate at Arizona State University, Tempe, Arizona, USA, and for Samsung Electronics, Korea.
He has published more than 80 peer-reviewed journal and conference papers and has been on the technical program committees and organizing committees of several workshops and conferences in the areas of computer-aided design and reconfigurable computing. His research interests are in the general area of hardware/software co-design, and include reconfigurable architectures, deep learning, and computer-aided design for emerging technologies.

 

YouTube link

Moshe (Mickey) Gabel

York University

On:

Starlight: Fast Container Provisioning on the Edge and over the WAN

Containers, originally designed for cloud environments, are increasingly popular for provisioning workers outside the cloud, for example in mobile and edge computing. These settings, however, bring new challenges: high latency links, limited bandwidth, and resource-constrained workers. The result is longer provisioning times when deploying new workers or updating existing ones, much of it due to network traffic.

Our analysis shows that current piecemeal approaches to reducing provisioning time are not always sufficient, and can even make things worse as round-trip times grow. Rather, we find that the very same layer-based structure that makes containers easy to develop and use also makes it more difficult to optimize deployment. Addressing this issue thus requires rethinking the container deployment pipeline as a whole.

Based on our findings, I will present Starlight: an accelerator for container provisioning. Starlight decouples provisioning from development by redesigning the container deployment protocol, filesystem, and image storage format. Our evaluation using 21 popular containers shows that, on average, Starlight deploys and starts containers 3x faster than the current industry standard implementation while incurring no runtime overhead and negligible storage overhead. Finally, it requires no changes to the deployed application, is backwards compatible with existing workers, and uses standard container registries.

Starlight is open source and available at https://github.com/mc256/starlight.

Bio:

Moshe (Mickey) Gabel is an assistant professor in the Department of Electrical Engineering and Computer Science at York University. Before joining York, he spent four years as a limited-term assistant professor in the Department of Computer Science at the University of Toronto. He earned his PhD in Computer Science from the Technion – Israel Institute of Technology, where he also received his MSc and BSc.

Moshe's research lies at the intersection of distributed algorithms, systems, and machine learning. His current research interest is edge computing, specifically making geo-distributed data analysis more practical and accessible to typical software developers. He has also worked extensively on machine learning applications in pervasive health monitoring and in computer systems. Moshe's work has appeared in top venues for systems and data science, including SIGMOD, NSDI, SIGKDD, VLDB, and ICML.

Marcel Khalifa

On:

Acceleration of Bioinformatics Applications Using Memristive-based PIM Architecture

Genomics, the study of genes, is making it possible to predict, diagnose, and treat diseases more precisely and personally than ever. Genome projects typically involve three main phases: DNA sequencing, assembly of the DNA to represent the original chromosome, and analysis of the representation. Recent advances in technology, which allow high-throughput genomic sequencing to be undertaken quickly and relatively cheaply, have propelled the work of genome assembly and genome analysis forward.

Both DNA sequence assembly and analysis are data-intensive applications. Thus, the regular von Neumann architecture, in which the memory and the computation units are separated, limits the performance of such applications, as it demands massive data traffic between the memory and the CPU. To reduce data movement and eliminate this limitation, processing-in-memory (PIM) platforms have been considered.

This talk presents two memristive PIM accelerators: (1) for DNA sequence assembly, specifically the read alignment task, we present the first PIM-based accelerator for the base-count filter; and (2) for DNA sequence analysis, specifically the sequence classification task, we introduce the first scalable PIM-based accelerator for edit-distance-tolerant classification, which is also the first taxonomic classification solution based on memristive memory. Both accelerators utilize the reduction in data transfer, together with the parallel computation the PIM platform offers, to accelerate the algorithms.

Marcel is an MSc student supervised by Prof. Shahar Kvatinsky.

YouTube link

Shai Bergman

On:

Memory Management for Emerging Memory Technologies in Future Datacenters

Datacenter operators are faced with the formidable challenge of hosting data-intensive online services in a cost-effective manner. In addition to the monetary costs, the increasingly large working sets of applications severely strain memory address translation in virtual machines, leading to poor performance. Emerging memory technologies, such as disaggregated memory and low-latency ZNS flash SSDs, show great promise for delivering high memory capacity at a lower cost. However, utilizing these new memory paradigms efficiently poses several memory management challenges due to their inherent differences from traditional system memory. In the context of virtualization, emerging technologies originally designed to bolster security, such as Intel's TDX and AMD's SEV-SNP, provide the basic hardware mechanisms to circumvent memory virtualization overheads. However, to fully eliminate memory virtualization translation overheads, and thus achieve near bare-metal performance, a new memory virtualization model must be adopted. This talk presents new system solutions to support the aforementioned emerging memory technologies: first, we revisit traditional OS memory management optimizations in the presence of disaggregated memory and introduce HotBox, a new OS memory management subsystem for disaggregated memory. Second, we devise a novel high-performance OS swap subsystem optimized for ZNS SSDs. Finally, we introduce a new virtualization model, translation pass-through, that utilizes new CPU technologies to eliminate memory translation overheads.

Shai is a PhD student under the supervision of Prof. Mark Silberstein.

Shaked Elias-Zada

On:

Quancurrent: A Concurrent Quantiles Sketch

Sketches are a family of streaming algorithms widely used in the world of big data to perform fast, real-time analytics. A popular sketch type is Quantiles, which estimates the data distribution of a large input stream.

We present Quancurrent, a highly scalable concurrent Quantiles sketch. Quancurrent’s throughput increases linearly with the number of available threads, and with 32 threads, it reaches an update speedup of 12x and a query speedup of 30x over a sequential sketch. Quancurrent allows queries to occur concurrently with updates and achieves an order of magnitude better query freshness than existing scalable solutions.
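
To make the object concrete, the toy (sequential) quantile estimator below keeps a uniform reservoir sample and answers rank queries from it. Real Quantiles sketches achieve provable error bounds with far less space, and Quancurrent's contribution is making updates and queries safely concurrent; this is an illustrative sketch only, not Quancurrent's algorithm.

```python
import random

class SampleQuantiles:
    """Naive reservoir-sampling quantile estimator (illustration only)."""
    def __init__(self, k, seed=42):
        self.k, self.n = k, 0
        self.sample = []
        self.rng = random.Random(seed)

    def update(self, x):
        self.n += 1
        if len(self.sample) < self.k:
            self.sample.append(x)
        else:  # classic reservoir sampling: keep each item w.p. k/n
            j = self.rng.randrange(self.n)
            if j < self.k:
                self.sample[j] = x

    def query(self, q):
        """Estimate the q-quantile of everything seen so far."""
        s = sorted(self.sample)
        return s[min(int(q * len(s)), len(s) - 1)]

sk = SampleQuantiles(k=1024)
for x in range(100_000):
    sk.update(x)
print(sk.query(0.5))  # close to 50_000
```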

YouTube Link

*Based on joint work with Arik Rinberg and Idit Keidar.

Lev Rosenblit

On:

GPUknock:  GPU-Timing Covert Channel

GPUs are increasingly utilized as a computational resource on personal machines and cloud infrastructure. We present GPUknock, a GPU-only covert channel that allows malicious logic to stealthily utilize a communication channel which is undetectable, runs entirely on GPUs, and communicates directly from the GPU with remote machines. GPUknock exploits unique and inherent properties of GPU hardware architecture schedulers, as well as the kernel execution scheduler of the software stack, to provide a robust bi-directional channel. It exploits the predictability of non-preemptive GPU schedulers, as well as the internal scheduling policies used by block schedulers, to construct the covert channel between the victim GPU, via WebGL in the web browser, and a malicious remote web server.
The GPU characteristics leveraged by GPUknock are pervasively available in Linux, Windows, iOS, and Android, as well as on mobile phones and desktops with discrete and integrated GPUs from most major hardware vendors. In this research we describe and analyze the covert channel, model the different scheduler hardware, and use the defined policies to design two implementations: one that prefers bandwidth but is less resilient to noise, and another that is resilient to noise at the expense of bandwidth; we then combine both designs into a single covert channel.
We implement a few real samples of GPUknock. For example, we augment an existing GPU-based key logger to exfiltrate keystrokes to a remote server in real time, leaving no traces in GPU memory. Our work shows that GPUs can be exploited to dramatically increase the stealthiness of malicious code and enable long-running malware that remains undetected. We thus urge GPU vendors and anti-malware developers to take practical steps to mitigate these real threats in GPU scheduler logic.
Lev is a Master's student under the supervision of Prof. Mark Silberstein.

Micha Dery

The Andrew and Erna Viterbi Faculty of Electrical & Computer Engineering

On:

QueuePilot: Reviving Small Buffers With a Learned AQM Policy

Internet router buffers help deal with congestion and traffic variability. Reducing buffer sizes may be one of the main outstanding open problems in networking, as small buffers would provide lower delays for users and free capacity for vendors.

Unfortunately, with small buffers, a passive policy has an excessive loss rate, and existing AQM (active queue management) policies, which signal developing congestion, can be unreliable.

In this talk, we introduce QueuePilot, a reinforcement learning-based AQM that enables small buffers in backbone routers, trading off high utilization with low loss rate and short delay. QueuePilot automatically tunes the probability of marking packets to signal congestion.

After training once offline with a variety of settings, QueuePilot produces a single lightweight policy that can be applied online without further learning. We evaluate QueuePilot on real networks with hundreds of TCP flows, and show how its performance in small buffers exceeds that of existing algorithms, and even exceeds their performance with larger buffers.
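
For context, the classic RED rule below is the kind of hand-tuned AQM heuristic that QueuePilot replaces with a learned policy: it marks packets with a probability that grows with an exponentially weighted average of the queue length, and its fixed thresholds are precisely what becomes unreliable when buffers are very small. This is an illustrative sketch of RED, not of QueuePilot's policy.

```python
def red_mark_probability(avg_q, min_th, max_th, max_p):
    """Classic RED: marking probability rises linearly between thresholds."""
    if avg_q < min_th:
        return 0.0
    if avg_q >= max_th:
        return 1.0
    return max_p * (avg_q - min_th) / (max_th - min_th)

avg = 0.0
for q in [2, 8, 14, 20, 26]:    # instantaneous queue lengths (packets)
    avg = 0.5 * avg + 0.5 * q   # EWMA of the queue length
    p = red_mark_probability(avg, min_th=5, max_th=15, max_p=0.1)
    print(f"avg={avg:.1f}  mark_prob={p:.3f}")
```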

Arik Rinberg

Technion

On:

Concurrent Sketching: Theory and Practice

Data sketching algorithms have become an indispensable tool for high-speed computations over massive datasets, answering queries (e.g., how many unique elements there are) using limited memory, at the cost of giving approximations rather than exact answers. Existing libraries provide highly optimized sketches, but do not allow parallelism for creating sketches using multiple threads, or querying them while they are being built. In this talk, I will first present a generic approach to parallelizing data sketches efficiently, such that they can be both built concurrently and queried in real-time, while bounding the error that such parallelism introduces. The algorithm’s correctness is based upon relaxed semantics and the notion of strong linearizability. However, depending on the sketch, the algorithm’s performance can be sub-optimal, as linearizability can be restrictive. I will then propose a new correctness criterion, Intermediate Value Linearizability (IVL), which is a relaxation of linearizability that is a natural fit for data sketches. I will show a straightforward concurrent implementation of a common sketch and show that such a straightforward implementation is not linearizable. Finally, I will discuss IVL’s other aspects, and show why IVL is a natural fit for parallelizing data sketches.

*PhD student under the supervision of Prof. Idit Keidar.

YouTube link

Ido Yehezkel

Technion

On:

Hybrid Routing with KNERL: Key Node Enhanced Reinforcement Learning

The growth of communication networks, as well as their inherent dynamics, call for scalable routing solutions that can deal with varying traffic. In this work, we introduce Key Node Enhanced Reinforcement Learning (KNERL), a hybrid routing approach that combines RL optimization for destination-based routing with “classical” optimization that is applied on source-destination routing. KNERL employs RL through a destination-based routing scheme over the vast majority of network nodes, whereas source-destination routing is employed only within a small selected set of “key” nodes. KNERL is, therefore, able to strike a balance between the technical budget, as captured by the number of key nodes, and performance in the data-driven setting. In this talk, we will overview the theoretical motivation for hybrid routing solutions. We will demonstrate that, with just a modest number of key nodes, we significantly outperform both “vanilla” RL as well as oblivious routing in various traffic distributions and network topologies.

*MSc student under the supervision of Prof. Ariel Orda and Dr. Aviv Tamar.

Kfir Toledo

Technion

On:

Dynamic routing: from the cloud to SD-WAN

In this talk, I present two topics: First, I introduce CloudPilot, a Kubernetes-based system that measures communication parameters across different cloud regions, and deploys TCP-split proxies in optimized locations on multiple cloud providers. Then, I present 2SYN, a lightweight algorithm that can dynamically select a preferred path in SD-WAN (software-defined WAN), for example to choose whether it is better to browse using DSL or LTE.

*MSc student under the supervision of Prof. Isaac Keslassy.

Click here to watch the recorded talk.

Oskar Mencer

Maxeler Technologies

On:

Conflict and Technology, AI Chip Wars from the inside out

The microprocessor is 50 years old. 50 years ago, a single ALU at kHz speeds had to be shared by multiple tasks, multiple applications, multiple users and multiple organizations. Due to transistor scaling we can now have 1M ALUs on a chip at GHz speeds. In this talk I will swap the cause and effect equation, instead of talking about solutions to problems, I will talk about problems created by solutions. Today data movement dominates compute time. I will describe how by optimally solving the data flow challenge, we are introducing a conflict with the past, the present and the future; conflicts with science, technology, legal and business structures; conflicts with FLOPS, complexity theory, and supply chains. Most recently, AI chips started forming the backbone of a concerted effort around the world, to break the dead end position of the classical computing industry, and show multiple ways forward.

Bio: Oskar Mencer is CEO of Maxeler Technologies, which has been recently acquired by Groq, an AI chip company, to join forces in building the future of computing. Oskar is a Technion graduate who grew up in the Parallel Systems Lab, class of ‘94, went on to Stanford, the Unix group at Bell Labs, Imperial College London, and more recently, became a member of Academia Europaea, and the Technion Board of Governors.

On 8/06/2022 at 11:30

Click here to watch the recorded talk.

Guy Goren

Technion

On:

Non-explicit Information Exchange in Distributed Computing

Distributed systems are becoming more and more ubiquitous. These systems entail different instances of distributed computing problems, each with unique characteristics. In this talk, I will discuss my Ph.D. research on theoretical and algorithmic aspects of distributed computing. Specifically, I will focus on two works that represent different aspects of my research on non-explicit information exchange in distributed systems. I will start with a fundamental work on concepts of information transfer in a distributed fault-prone environment and present tight bounds for the Atomic Commit problem using the concept of a silent choir. Then, I will discuss my work on load balancing in the parallel server model with multiple dispatchers and present new solutions. These solutions are designed from a distributed computing perspective without explicit communication; instead, they utilize stochastic coordination.

*PhD student under the supervision of Prof. Yoram Moses.

On 18/05/2022 at 11:30

Click here to watch the recorded talk.

Itay Tsabary

Technion

On:

Design of Incentive-Based Blockchain Systems

Bitcoin, Ethereum, and other blockchain-based systems have accumulated a market cap of billions of dollars. These systems' security relies on successfully incentivizing their participants to act as intended; failing to do so might result in safety and liveness violations. Our research focuses on the design and analysis of these incentive mechanisms. In this talk, I will discuss several of our research results. I will focus on a vulnerability of a prevalent smart contract and its security implications, and present our secure alternative. I will then discuss an incentive misalignment arising from inconstant rewards, and will conclude with a novel blockchain design that significantly reduces the system's wastefulness.

*PhD student under the supervision of Dr. Ittay Eyal.

On 27/04/2022 at 11:30

Click here to watch the recorded talk.

Yossi Gilad

The Hebrew University of Jerusalem

On:

Twilight: A Differentially Private Payment Channel Network

Payment channel networks (PCNs) provide a faster and cheaper alternative to transactions recorded on the blockchain. Clients can trustlessly establish payment channels with relays by locking coins and then send signed payments that shift coin balances over the network’s channels. Although payments are never published, anyone can track a client’s payment by monitoring changes in coin balances over the network’s channels. We present Twilight, the first PCN that provides a rigorous differential privacy guarantee to its users. Relays in Twilight run a noisy payment processing mechanism that hides the payments they carry. This mechanism increases the relay’s cost, so Twilight combats selfish relays that wish to avoid it using a trusted execution environment (TEE) that ensures they follow its protocol. The TEE does not store the channel’s state, which minimizes the trusted computing base. Crucially, Twilight ensures that even if a relay breaks the TEE’s security, it cannot break the integrity of the PCN. We analyze Twilight in terms of privacy and cost and study the trade-off between them. We implement Twilight using Intel’s SGX framework and evaluate its performance using relays deployed on two continents. We show that a route consisting of 4 relays handles 820 payments/sec.

Bio: Yossi Gilad is the Harry & Abe Sherman senior lecturer at the School of Computer Science and Engineering, the Hebrew University of Jerusalem. Prior to the Hebrew University, he was a postdoctoral researcher at the Massachusetts Institute of Technology and Boston University. His research focuses on designing, building, and analyzing secure and scalable protocols and networked systems.

On 25/04/2022 at 11:30

Preview of three NSDI’22 papers from Accelerated Computing Systems Lab

1) Alon Rashelbach – Scaling Open vSwitch with a Computational Cache

Open vSwitch (OVS) is a widely used open-source virtual switch implementation. In this work, we seek to scale up OVS to support hundreds of thousands of OpenFlow rules by accelerating the core component of its data path: the packet classification mechanism. To do so we use NuevoMatch, a recent algorithm that uses neural network inference to match packets and promises significant scalability and performance benefits. We overcome the primary algorithmic challenge of the slow rule update rate in vanilla NuevoMatch, speeding it up by over three orders of magnitude. This improvement enables two design options for integrating NuevoMatch with OVS: (1) using it as an extra caching layer in front of OVS's megaflow cache, and (2) using it to completely replace OVS's data path while performing classification directly on OpenFlow rules, obviating control-path upcalls. Our comprehensive evaluation on real-world packet traces and between 1K and 500K ClassBench rules demonstrates geometric mean speedups of 1.9x and 12.3x for the first and second designs, respectively, with the latter also supporting up to 60K OpenFlow rule updates per second, by far exceeding the original OVS.

*Joint work with Ori Rottenstreich and Mark Silberstein

2) Lior Zeno – SwiSh: Distributed Shared State Abstractions for Programmable Switches

We design and evaluate SwiSh, a distributed shared state management layer for data-plane P4 programs. SwiSh enables running scalable stateful distributed network functions on programmable switches entirely in the data plane. We explore several schemes to build a shared variable abstraction, which differ in consistency, performance, and in-switch implementation complexity. We introduce the novel Strong Delayed-Writes (SDW) protocol, which offers consistent snapshots of shared data-plane objects with semantics known as strong r-relaxed linearizability, enabling the implementation of distributed concurrent sketches with precise error bounds. We implement strong, eventual, and SDW consistency protocols in Tofino switches, and compare their performance in microbenchmarks and three realistic network functions: NAT, a DDoS detector, and a rate limiter. Our results demonstrate that general distributed state management in the data plane is practical, and outperforms any centralized solution by up to four orders of magnitude in update throughput and replication latency.

*Joint work with Dan Ports, Jacob Nelson, Daehyeok Kim, Shir Landau Feibish, Idit Keidar, Arik Rinberg, Alon Rashelbach, Igor De-Paula and Mark Silberstein

3) Haggai Eran – An edge-queued datagram service for all data-center traffic

Modern datacenters support a wide range of protocols and in-network switch enhancements aimed at improving performance. Unfortunately, most new and legacy protocols and enhancements often don’t coexist gracefully because they inevitably interact via queuing in the network. In this paper we describe EQDS, a new datagram service for datacenters that moves almost all of the queuing out of the core network and into the sending host. This enables it to support multiple (conflicting) higher layer protocols, while only sending packets into the network when decided by a receiver-driven credit scheme. EQDS can speed-up legacy TCP and RDMA stacks and enable transport protocol evolution, while benefiting from future switch enhancements without needing to modify higher layer stacks. We show through simulation and multiple implementations that EQDS can reduce FCT of legacy TCP by 2x, improve the NVMeOF throughput by 30%, and safely run TCP alongside RDMA on the same network.

*Joint work with Costin Raiciu, Vladimir Olteanu, Adrian Popa, Mark Handley, Mark Silberstein, Dragos Dumitrescu, Cristi Baciu, Georgios Nikolaidis

On 30/03/2022 at 11:30

Click here to watch the recorded talk.

 

Natan Peled

Technion

On:

A Flexible Mapping Scheme for Wear-Leveling in Persistent Memories

New non-volatile memory technologies show great promise for extending system memory beyond the existing architectures, but have limited endurance that needs to be mitigated toward their reliable use as main-memory extensions. Wear-leveling is a common technique for prolonging the life of endurance-limited memory, by preventing a situation where some parts of the memory reach their endurance limit while other parts remain much less used for writing. Existing wear-leveling approaches either employ costly full-indirection mapping between logical and physical addresses, or choose simple mappings that cannot cope with extremely unbalanced write workloads. In this work, we propose a new wear-leveling architecture that can level even the most unbalanced adversarial workloads, while enjoying low mapping complexity compared to full indirection. The key idea is to use a family of efficiently computable mapping functions that allow selectively remapping heavily written addresses, while keeping the associated mapping costs well below those of existing flexible mappings.
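
To make the key idea concrete, here is a minimal, hypothetical sketch in Python (not the paper's actual scheme): translation is computed from a short list of remap rules rather than stored per address, so mapping state grows with remap events, not with memory size. The region-plus-XOR-key construction is our own illustrative choice.

class ComputedWearLeveler:
    def __init__(self, num_lines):
        assert num_lines & (num_lines - 1) == 0, "power-of-two address space"
        self.num_lines = num_lines
        self.rules = []  # (start, size, key): size a power of two, 0 < key < size

    def remap_region(self, start, size, key):
        # Selectively remap a write-hot region by appending one computable rule.
        assert size & (size - 1) == 0 and 0 < key < size
        self.rules.append((start, size, key))

    def physical(self, logical):
        # O(#rules) computation, no per-line indirection table.
        addr = logical
        for start, size, key in self.rules:  # rules compose in order
            if start <= addr < start + size:
                addr = start + ((addr - start) ^ key)  # XOR permutes the region
        return addr

wl = ComputedWearLeveler(64)
wl.remap_region(0, 8, 5)  # lines 0..7 turned write-hot: spread them around
print([wl.physical(a) for a in range(8)])  # [5, 4, 7, 6, 1, 0, 3, 2]

Because each rule is a permutation of its region and translation is pure computation, remapping a hot region costs a constant amount of state instead of one table entry per address.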

*MSc student under supervision of Prof. Yuval Cassuto

On 23/03/2022 at 11:30

 

 

Idan Raz

Technion

On:

Model-Based Simulation for SMT Cores

Studies that evaluate new architectural designs of virtual memory typically employ a “model-based” methodology that relies on simulations of the translation lookaside buffer (TLB) coupled with empirical performance models. We observe that this methodology is limited in that each simulated thread of execution has its own dedicated TLB, whereas modern processors share a single TLB among multiple threads through “simultaneous multithreading” (SMT). Existing model-based research is thus unable to explore virtual memory designs in an SMT context. We address this problem: (1) by showing that the behavior of different multiprogrammed thread combinations varies over time nontrivially, and by introducing a systematic approach for measuring this behavior with bounded error; (2) by developing a TLB simulator capable of realistically combining multiple memory-reference streams (of the SMT threads) into one; (3) by validating the simulator’s accuracy against real (Intel) processors to ensure the correctness of our approach, which required us to reverse engineer their TLB eviction policy; and (4) by showing how to build empirical models that predict runtimes of different SMT combinations from their combined simulated TLB miss rate. We demonstrate our methodology’s usefulness by evaluating a new TLB partitioning mechanism for SMT processor cores.

*MSc student under supervision of Prof. Dan Tsafrir

On 10/03/2022 at 14:30

 

Maxim Barsky

Technion

On:

TLB Partitioning for SMT Systems

Simultaneous multithreading (SMT) increases the cost of memory address translation due to sharing of the translation lookaside buffer (TLB) among multiple threads. Current x86 processors use a “competitively-shared” TLB, in which low-locality threads might needlessly waste TLB resources and thus degrade the performance of neighboring high-locality threads. To address this problem, we introduce PartTLB, a new mechanism that: (1) samples the TLB requests of the competing threads, (2) predicts their miss rate with different TLB sizes, and (3) partitions the TLB accordingly to minimize the overall miss rate. PartTLB requires small additional hardware (about 1% of the TLB size) and no software involvement. For a 2-way SMT, PartTLB improves performance by up to 37% for one thread and worsens performance by up to 3% for the other thread.
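
A minimal sketch of what step (3) might look like, assuming steps (1)-(2) already produced a predicted miss-rate curve per thread. The curves, 8-entry granularity, and access counts below are illustrative, not PartTLB's actual predictor.

def best_partition(total_entries, miss_rate_a, miss_rate_b, accesses_a, accesses_b):
    # Choose the split (entries for A, rest for B) minimizing predicted misses.
    best = None
    for size_a in range(0, total_entries + 1, 8):  # hypothetical 8-entry granularity
        size_b = total_entries - size_a
        misses = (accesses_a * miss_rate_a[size_a] +
                  accesses_b * miss_rate_b[size_b])
        if best is None or misses < best[0]:
            best = (misses, size_a, size_b)
    return best

# Toy curves: thread A has high locality, thread B streams with low locality.
sizes = range(0, 129, 8)
curve_a = {s: max(0.01, 0.5 - 0.01 * s) for s in sizes}
curve_b = {s: 0.4 for s in sizes}  # extra entries barely help the streaming thread
print(best_partition(128, curve_a, curve_b, accesses_a=1e6, accesses_b=1e6))

In this toy instance the search hands nearly all entries to the high-locality thread, mirroring the asymmetric effect reported above: the low-locality thread loses little because extra TLB entries were not helping it anyway.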

*MSc student under supervision of Prof. Dan Tsafrir

On 10/03/2022 at 15:30

Dr. Farhad Merchant

RWTH Aachen University

On:

Embedded Systems and Security

Protecting intellectual properties from untrusted design houses and foundries has become highly challenging. In this talk, I will focus on the security aspects of embedded systems. First, I will discuss the logic locking tools and attack methods developed at the Institute for Communication Technologies and Embedded Systems, RWTH Aachen University. In the second part of the talk, I will focus on developing low-power, high-performance embedded systems based on emerging non-volatile memory technologies. These methods incorporate the non-Von Neumann computing model, where the computations are performed inside memory to reduce the data movements. Finally, I will touch upon the security aspects of emerging technologies.

Bio: Farhad Merchant received his Ph.D. from the Indian Institute of Science, Bangalore (India), in 2016. His Ph.D. thesis title was “Algorithm-Architecture Co-design for Dense Linear Algebra Computations.” He received the DAAD fellowship during his Ph.D. He worked as a postdoctoral research fellow at Nanyang Technological University (NTU), Singapore, from March 2016 to December 2016. In December 2016, he moved to Corporate Research at Robert Bosch in Bangalore as a Researcher, where he worked on numerical methods for ordinary differential equations. He joined the Institute for Communication Technologies and Embedded Systems, RWTH Aachen University, in December 2017 as a postdoctoral research fellow in the Chair for Software for Systems on Silicon. Farhad is the recipient of the HiPEAC technology transfer award in 2019. His research interests are hardware security and neuromorphic computing.

On 02/03/2022 at 11:30

Click here to watch the recorded talk.

 

 

Haggai Eran

Technion

On:

SmartNIC Inline Processing

The inline processing technique enables data transformation as a system transfers data to or from a processing node. It is used to offload computations and accelerate data-intensive communication tasks, reducing latency and power due to data movement and improving throughput by using the best processing core for the job. However, inline processing poses several challenges: it breaks existing operating system and network stack layers and makes it difficult to reuse previous software and hardware. This talk presents new operating system abstractions to allow application layer inline processing on SmartNICs and new SmartNIC designs, enabling fine-grain hardware virtualization and reusing existing ASIC NIC functionality with FPGA-based SmartNICs. In addition, I will present a complementary line of research that uses SmartNICs to accelerate data-center networks transparently and enable sharing of competing network transports.

*PhD student under supervision of Prof. Mark Silberstein

On 26/01/2022 at 11:30

Click here to watch the recorded talk.

 

 

Avishai Wool

Tel-Aviv University

On:

Securing legacy communication buses: industrial control systems, in-vehicle and in-aircraft networks

Many important networking systems were designed decades ago, with a “closed environment” as a fundamental invariant: the networking infrastructure in a moving car, a flying aircraft, or a fenced power plant was implicitly assumed to be isolated. As a result, the communication bus protocols were designed to function well despite natural phenomena such as noise, interference, radiation and so forth. No defenses against malicious adversaries were designed in.

Once these isolated systems are connected to the Internet, the old design choices are exposed and become easy attack surfaces. And then we find that the legacy networks are victims of their own success: replacing the networking technology with secure alternatives is extremely expensive and slow. Power plants, cars and aircraft are not cellphones: they continue to function for 25-50 years! So there is a need to retrofit security mechanisms into the old insecure designs.

In this talk I will survey the leading communication bus protocols, with a focus on their inherent security vulnerabilities: Modbus and the Siemens protocols in industrial control systems, CAN bus in automotive in-vehicle networks, and ARINC429 in civilian aircraft networks. I will then highlight what can be done in the areas of anomaly detection and intrusion prevention. Somewhat surprisingly, these specialized networks sometimes have unique features that can be repurposed to achieve defensive goals.

On 22/12/2021 at 11:30

Click here to watch the recorded talk.

 

 

Naama Ben-David

Postdoctoral Researcher at VMware

On:

Theoretical Foundations for Emerging Multiprocessor Hardware

Due to the end of Moore’s law, hardware has been developing more rapidly in recent years than it has at any point since the early days of computing. These hardware developments are trending toward multiprocessor settings, which have the potential to deliver the speedups that CPU frequency scaling can no longer support. In this talk, I will discuss my work on building theoretical foundations for emerging multiprocessor technologies. I will focus on one line of work that concerns a data center communication primitive called Remote Direct Memory Access (RDMA). RDMA allows accessing the memory of a remote machine without involving its CPU, and has become widely adopted in recent years due to its performance advantages. I’ll present the first theoretical model that captures RDMA’s capabilities, and use it to show that RDMA is more powerful than previous communication technology. I’ll then present the design of a state machine replication system based on those theoretical insights that improves previous state-of-the-art latency both in failure-free executions (by over 2x) and in failure recovery (by over 10x).

Bio: Naama Ben-David is a postdoctoral researcher at VMware. She completed her PhD at Carnegie Mellon University, where her dissertation received an honorable mention for the CMU SCS Dissertation Award. Her primary research interests are in the intersection of theory and practice in distributed and concurrent computing. More specifically, Naama strives to theoretically explain phenomena seen in modern machines, and to use obtained insights to design and analyze practical algorithms for multiprocessor settings. Naama is the recipient of an NSERC postgraduate scholarship and a Microsoft Research PhD Fellowship.

On 21/12/2021 at 10:30

Click here to watch the recorded talk.

 

Michael Schapira

The Hebrew University of Jerusalem

On:

A Tale of Two Internet (In)Security Problems

Despite Herculean efforts, the Internet’s communications infrastructure remains alarmingly vulnerable to attacks. In this talk, I will exemplify the Internet’s security holes through two prominent examples: Internet routing and time synchronization. I will discuss what I view as the main obstacles to progress on securing the Internet and how I believe that these can be overcome.

Bio: Michael Schapira is a professor of Computer Science at Hebrew University (https://www.michaelschapira.com/). His research focuses on the design and analysis of (Inter)network architectures and protocols. Prior to joining Hebrew U, he worked at Google NYC’s Infrastructure Networking Group and was a postdoctoral researcher at UC Berkeley, Yale University, and Princeton University. He is a recipient of the Allon Fellowship, the Wolf Foundation’s Krill Prize, an ERC Starting Grant, faculty awards from Microsoft, Google, and Facebook, IETF/IRTF Applied Networking Research Prizes, and the IEEE Communications Society William R. Bennett Prize.

On 15/12/2021 at 11:30

Click here to watch the recorded talk.

 

Amit Klein

The Hebrew University of Jerusalem

On:

From IP ID to Device ID and KASLR Bypass

IPv4 headers include a 16-bit ID field. Our work examines the generation of this field in Windows, Linux and Android, and shows that the IP ID field enables remote servers to assign a unique ID to each device and thus be able to identify subsequent transmissions sent from that device. This identification works across all browsers and over network changes, including VPNs and browser privacy modes. In modern Linux and Android versions, this field leaks a kernel address, thus we also break KASLR, which is a major part of the kernel security defenses.
Our work includes reverse-engineering of the Windows IP ID generation code, and a cryptanalysis of this code and of the Linux kernel IP ID generation code. It provides practical techniques to partially extract the key used by each of these algorithms. We deployed a demo (for Windows) showing that key extraction and machine fingerprinting works in the wild, and tested it from networks around the world.
We disclosed our findings to Microsoft and Linux, and they subsequently deployed patches to mitigate our attacks.

Bio: Amit is currently a post-doc researcher at the Hebrew University. He did his PhD studies at Bar Ilan University. Previously he was the CTO of Trusteer (now an IBM company), and the chief scientist of Cyota (acquired by RSA Security). His research is in the intersection between network protocols and operating system security. His research won the first prize in the CSAW Israel applied research competitions in 2019 and 2020. Aside from academic conferences, his work was frequently presented at top industrial cyber security conferences.

On 8/12/2021 at 11:30

Click here to watch the recorded talk.

Itai Mesery

On:

Towards Physical, Imperceptible Adversarial Attacks via Adversarial Programs

Adversarial examples were originally defined as imperceptible perturbations which cause a deep neural network to misclassify. However, the majority of imperceptible perturbation attacks require perturbing a large number of pixels across the image and are thus hard to execute in the physical world. Existing physical attacks rely on physical objects, such as patches/stickers or 3D-printed objects. Producing adversarial patches is arguably easier than 3D-printing, but these attacks incur highly visible perturbations. This raises the question: is it possible to generate adversarial examples with imperceptible patches? In this work, we consider adversarial multi-patch attacks, where the goal is to compute a targeted attack consisting of up to K patches with minimal L2 distortion. Each patch is associated with dimensions, position, and perturbation parameters. We leverage ideas from program synthesis and numerical optimization to search in this large, discrete space and obtain attacks that are competitive with the C&W attack but have at least 3x and up to 10x fewer perturbed pixels. We evaluate our approach on MNIST, Fashion-MNIST, CIFAR-10, and ImageNet and obtain a success rate of at least 92% and up to 100% with at most ten patches. For MNIST, Fashion-MNIST, and CIFAR-10, the average L2 distortion is greater than that of the C&W attack by at most 1.2.
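
In LaTeX notation, the search problem described above can be stated roughly as follows (the formalization and symbols are ours, not necessarily the paper's):

% Given classifier f, input x, target label t, and patch budget K, find
\begin{equation*}
\min_{\delta,\,P_1,\dots,P_K}\ \|\delta\|_2
\quad \text{s.t.} \quad
f(x+\delta) = t,
\qquad
\mathrm{supp}(\delta) \subseteq \bigcup_{k=1}^{K} P_k ,
\end{equation*}
% where each P_k is a rectangular patch whose position and dimensions are
% discrete parameters searched jointly with the perturbation values.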

*MSc student under supervision of Dr. Dana Drachsler-Cohen

On 3/11/2021 at 11:30

Click here to watch the recorded talk.

 

Ariel Livshitz

On:

Probable Approximate Coordination

We study the problem of how to coordinate the actions of independent agents in an asynchronous distributed system where message arrival times are unbounded, but are determined by an exponential probability distribution. Specifically, we focus on the task of ordering actions performed by different nodes in a linear temporal order – a problem known in the literature as Ordered Response. Ensuring such an ordering, even with probability 1, requires the construction of a message chain that passes through each acting site, in order. As a result, any protocol that solves Ordered Response with such guarantees must require linear time in expectation. We show, however, that relaxing the specification slightly allows for a significant saving in time. Namely, if Ordered Response should be guaranteed with high probability (smaller than 1), it is often possible to significantly shorten the expected execution time of the protocol.

 

*MSc student under supervision of Prof. Yoram Moses.

On 13/10/2021 at 11:30

Click here to watch the recorded talk.

 

 

Israel Lenchner

On:

Task Oriented Programming – a Static Program Checker

Task-oriented programming (ToP) is a programming model whereby a program comprises a set of serial tasks along with a specification of dependencies among them (a task map). Sometimes there are also duplicable (multi-instance) tasks, wherein multiple instances of the same code (on different data) may run concurrently on multiple cores. If task B depends on A, it may only be dispatched upon A’s completion. A task may furthermore depend on the return code of another task or upon multiple tasks (e.g., D depends on the completion of A AND (B OR C)). This intuitive and convenient model facilitates the expression and discovery of permissible parallelism. Ramon.Space’s RC64 chip is a 64-core processor with shared on-chip memory that uses this model, including duplicable tasks and some customizations. It moreover features a hardware scheduler that receives the task map, keeps track of core and task status and, once a core becomes available, dispatches a runnable task to it within a few clock cycles. The extremely fast dispatching enables efficient exploitation of fine-grain parallelism that is often abundant but, due to dispatching overhead, otherwise useless. While attractive and convenient, ToP gives rise to new types of programmer errors. For example, if a programmer neglects to specify a dependence of task B on task A, the two tasks may be executed in any order, possibly resulting in a data race and non-deterministic program results.

In this work, we present a static program-checking tool that focuses on ToP-related errors. Its main components are: 1) determining which pairs of tasks are independent, i.e., are allowed by the programmer to run in arbitrary order; 2) determining each task’s memory footprint (read and write accesses); and 3) checking whether any two independent tasks access a common memory address with at least one of them writing to it (a data race). Supporting algorithms were developed and embedded in the tool, e.g., checking whether two independent tasks may ever both run in the same execution of the program; identifying mutually exclusive conditional accesses of two tasks to the same address (which may lead to a false race); and efficiently checking for races among members of a set of mutually-independent tasks. Several important extensions to the ToP model have been proposed, most notably enabling coordinated concurrent execution of multiple duplicable tasks; the tool addresses these extensions, but also has an RC64-specific mode. As such, it is general and forward-looking, yet applicable to an actual product.
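
As a minimal sketch of the checker's third component, assuming the first two have already produced the independent task pairs and per-task footprints (the task names and addresses below are illustrative):

def races(independent_pairs, reads, writes):
    # Report address conflicts between tasks the programmer left unordered.
    # A race needs a shared address with at least one of the two tasks writing it.
    found = []
    for a, b in independent_pairs:
        conflict = (writes[a] & writes[b]) | (writes[a] & reads[b]) | (reads[a] & writes[b])
        if conflict:
            found.append((a, b, sorted(conflict)))
    return found

reads = {"A": {0x100}, "B": {0x100, 0x200}, "C": set()}
writes = {"A": {0x200}, "B": set(), "C": {0x300}}
# A and B were declared independent, but B reads 0x200, which A writes: a race.
print(races([("A", "B"), ("A", "C")], reads, writes))  # [('A', 'B', [512])]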

*MSc student under supervision of Prof. Yitzhak Birk

On 18/8/2021 at 11:30

 

Meni Orenbach

On:

Operating Systems Abstractions for Trusted Execution Environments

Trusted execution environments such as secure enclaves are now available in several popular CPUs, and supported in public clouds. Enclaves can be used to efficiently shield applications against privileged adversaries, and secure sensitive data processed by them through strong isolation backed by the hardware. Yet, enclaves are not a silver bullet: they are vulnerable to unique side-channel attacks, they exhibit poor performance when system calls are invoked and when page faults occur, they lack a secure variant of software abstractions such as page fault handlers, and finally, the hardware does not protect against Iago attacks.
In our work, we tackle the aforementioned shortcomings of existing enclaves with system abstractions, practical hardware modifications, and tools to support them. In this talk, we provide a high-level overview of our approach followed by presenting TEEProtect, a framework for thwarting Iago attacks.

*PhD student under supervision of Prof. Mark Silberstein

On 30/6/2021 at 11:30

Click here to watch the recorded talk.

Ameer Haj Ali

On:

Machine Learning in Compiler Optimization

The end of Moore’s law is driving the search for new techniques to improve system performance as applications continue to evolve rapidly and computing power demands continue to rise. One promising technique is to build more intelligent compilers.

Compilers map high-level programs to lower-level primitives that run on hardware. During this process, compilers perform many complex optimizations to boost the performance of the generated code. These optimizations often require solving NP-Hard problems and dealing with an enormous search space. To overcome these challenges, compilers currently use hand-engineered heuristics that can achieve good but often far-from-optimal performance. Alternatively, software engineers resort to manually writing the optimizations for every section in the code, a burdensome process that requires prior experience and significantly increases the development time.

In this work, novel approaches for automatically handling complex compiler optimization tasks are explored. End-to-end solutions using deep reinforcement learning and other machine learning algorithms are proposed. These solutions dramatically reduce the search time while capturing the code structure, different instructions, dependencies, and data structures to enable learning a sophisticated model that can better predict the actual performance cost and determine superior compiler optimizations. The proposed techniques can outperform existing state-of-the-art solutions while requiring shorter search time. Furthermore, unlike existing solutions, the deep reinforcement learning solutions are shown to generalize well to real benchmarks.

Bio:

Ameer Haj-Ali completed his Ph.D. in Electrical Engineering and Computer Science at UC Berkeley in two years, where he was advised by Professors Ion Stoica (RISE Lab) and Krste Asanovic (ADEPT Lab). At UC Berkeley, Ameer helped bring up and lead many projects spanning machine learning in compiler optimization and hardware-software codesign, including Gemmini, AutoPhase, NeuroVectorizer, ProTuner, Ansor, AutoCkt, and RLDRM (which received a best paper award). Before attending UC Berkeley, Ameer finished his M.Sc. studies (summa cum laude, as valedictorian) at the Technion in 2018, where he worked with Professor Shahar Kvatinsky on using emerging memory technologies to enhance the performance of modern computer systems.

On 12/5/2021 at 11:30

Click here to watch the recorded talk.

Noam Bloch

VP HW architecture, NVIDIA

On:

The Architecture of NVIDIA Data Center Infrastructure Processing Unit

In modern data centers, resources are usually virtualized. Applications running on those data centers are distributed over many virtual machines. For those applications, the data centers provide software-defined infrastructure services for networking, storage, and security. When software-defined services run within the same CPU as the applications, they consume CPU resources at the expense of the applications. Moreover, the data center’s security can be jeopardized.

The NVIDIA Data Center Infrastructure Processing Unit (DPU) allows migrating all the infrastructure services away from the application CPU. This new architecture isolates the data center infrastructure from the application domain running on the CPUs or GPUs, making this sensitive code independent of any application. The NVIDIA DPU accelerates and abstracts the storage, networking, and security services to enable applications to get the most out of the CPUs and GPUs.

In this presentation, the fundamentals of the NVIDIA DPU architecture will be presented, along with the NVIDIA DOCA SDK programming model for the DPU, the system use cases, and the impact of this architecture on next-generation data centers.

On 5/5/2021 at 11:30

Eyal Ronen

TAU

On:

Dragonblood: Analyzing the Dragonfly Handshake of WPA3 and EAP-pwd

The WPA3 certification aims to secure home networks, while EAP-pwd is used by certain enterprise WiFi networks to authenticate users. Both use the Dragonfly handshake to provide forward secrecy and resistance to dictionary attacks. In this paper, we systematically evaluate Dragonfly’s security. First, we audit implementations, and present timing leaks and authentication bypasses in EAP-pwd and WPA3 daemons. We then study Dragonfly’s design and discuss downgrade and denial-of-service attacks. Our next and main results are side-channel attacks against Dragonfly’s password encoding method (e.g., hash-to-curve).

We believe that these side-channel leaks are inherent to Dragonfly. For example, after our initial disclosure, patched software was still affected by a novel side-channel leak. We also analyze the complexity of using the leaked information to brute-force the password. For instance, brute-forcing a dictionary of size 10^10 requires less than $1 in Amazon EC2 instances. These results are also of general interest due to ongoing standardization efforts on Dragonfly as a TLS handshake, Password-Authenticated Key Exchanges (PAKEs), and hash-to-curve. Finally, we discuss backwards-compatible defenses, and propose protocol fixes that prevent attacks. Our work resulted in a new draft of the protocols incorporating our proposed design changes.

*Joint work by Mathy Vanhoef (New York University Abu Dhabi) and Eyal Ronen (TAU)

Short bio:

I’m a faculty member at Tel Aviv University’s School of Computer Science. My research interests are in cybersecurity and applied cryptography. I am interested in analyzing and designing real-world implementations of cryptographic and security protocols and primitives (both in software and in hardware). In particular, my research spans side-channel attacks (e.g., power analysis and cache attacks), cryptographic protocols (e.g., TLS and WPA3), cryptanalysis of cryptographic primitives (e.g., AES), IoT security, and password-based authentication.

On 10/03/2021 at 11:30

Click here to watch the recorded talk.

Ori Lahav

TAU

On:

Designing a Programming Language Shared-Memory Concurrency Semantics

A concurrency semantics (aka a memory model) for a programming language defines the allowed behaviors of multithreaded programs. For programmers, sequential consistency (i.e., standard interleaving-based semantics) is considered the most intuitive model. However, it is too costly to implement. Designing a satisfactory substitute is highly challenging, as it requires carefully balancing the conflicting desires of programmers, compilers, and hardware. In this talk I will introduce this challenge and the key ideas behind the prototypical example, the C/C++ concurrency model from 2011. Then, I will demonstrate the drawback of the C/C++ approach, notoriously known as the “out-of-thin-air” problem. I will conclude by describing the “promising semantics” solution, and the remaining challenges.
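
The “out-of-thin-air” problem is usually illustrated with the following two-thread litmus test. The Python rendering below is only for shape, since the problematic behavior concerns relaxed C/C++ atomics, not Python:

import threading

x = y = 0
r1 = r2 = 0

def thread1():
    global r1, y
    r1 = x   # in the C/C++ version: relaxed atomic load of x
    y = r1   # relaxed store of the loaded value to y

def thread2():
    global r2, x
    r2 = y   # relaxed load of y
    x = r2   # relaxed store of the loaded value to x

t1 = threading.Thread(target=thread1)
t2 = threading.Thread(target=thread2)
t1.start(); t2.start(); t1.join(); t2.join()

# Under any interleaving both loads can only observe 0, so r1 == r2 == 0.
# Yet the C/C++11 axioms for relaxed atomics fail to forbid executions in
# which r1 == r2 == 42: each store "justifies" the other load, and the value
# 42 appears out of thin air. Ruling such executions out without penalizing
# compilers or hardware is the heart of the problem.
print(r1, r2)  # "0 0"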

Short bio:
Ori is a faculty member in the School of Computer Science at Tel Aviv University. He did his PhD at Tel Aviv University under the supervision of Arnon Avron in the field of logic for computer science. In 2014, he was a postdoctoral researcher at Tel Aviv University hosted by Mooly Sagiv. After that, until September 2017, he was a postdoctoral researcher at MPI-SWS in Germany hosted by Viktor Vafeiadis and Derek Dreyer. Ori’s current main areas of research are programming languages and verification. Recently, Ori was awarded an ERC Starting Grant for the project titled “Verification-Aware Programming Language Concurrency Semantics”.

On 17/02/2021 at 11:30
Click here to watch the recorded talk.

Adam Morrison

TAU

On:

Comprehensive Protection for Speculatively-Accessed Data

Speculative execution attacks present an enormous security threat, capable of reading arbitrary program data under malicious speculation and later exfiltrating that data over microarchitectural covert channels. This talk will describe a comprehensive hardware protection from speculative execution attacks.
We will first describe Speculative Taint Tracking (STT). STT delays the execution of instructions that create covert channels until their operands are proven to be a function of non-speculative data.
STT builds on a comprehensive characterization of covert channels on speculative microarchitectures and employs a novel microarchitecture for efficiently detecting when operands become non-speculative and disabling protection at that time.
We will then describe Speculative Data-Oblivious Execution (SDO), which improves STT’s performance by executing covert-channel-creating instructions in a data-oblivious manner, i.e., so that their execution does not leak their operands. Data-oblivious execution usually implies doing the worst-case work all the time. SDO sidesteps this problem by using safe prediction to predict the work needed to satisfy the common case and subsequently perform it, all without leaking privacy.
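
For intuition, here is the textbook bounds-check-bypass shape that motivates this line of work, rendered in Python purely for illustration (Python code does not itself speculate; the comments describe what out-of-order hardware would do):

def victim(index, array1, array2):
    # In out-of-order hardware the branch may be predicted taken even when
    # index is out of bounds, so array1[index] can speculatively read a
    # secret byte the program would never architecturally access.
    if index < len(array1):
        v = array1[index]     # the speculative "access" instruction
        _ = array2[v * 512]   # the "transmit" instruction: its address depends
                              # on v, leaving a secret-dependent cache footprint

# STT delays the transmit load until v is proven non-speculative; SDO instead
# executes it in a data-oblivious, operand-independent way.
victim(1, [1, 2, 3], bytearray(256 * 512))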

On 27/01/2021 at 11:30
Click here to watch the recorded talk.

Julia Khamis

Technion

On:

Demand-Aware Optimization in Offchain Networks

Offchain networks are dominant as a solution to the scalability problem of blockchain systems, allowing users to perform payments without recording them on the chain by relying on predefined payment channels. Users together with the offchain channels form a graph, known as the offchain network topology. A pair of users can make a payment even without a direct channel, through a path of channels involving other intermediate users. The offchain topology and payment characteristics affect network performance metrics such as latency and fees.

We study basic demand-aware problems in offchain network design: efficiently mapping users to an offchain topology of a known structure, as well as constructing a topology with a bounded number of channels that can serve typical payments well. Likewise, we suggest an approach for jointly serving multiple payments by finding an equivalent set of payments that has the same impact on user balances but can be served efficiently in a given topology.
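
A toy sketch of the "equivalent payment set" idea: the pairwise netting below has the same per-user balance impact as the original payments, though the talk's construction for a given topology is more general than this simple reduction.

from collections import defaultdict

def net_payments(payments):
    # payments: iterable of (payer, payee, amount). Returns a netted list
    # whose effect on every user's balance is identical to the original.
    net = defaultdict(int)
    for u, v, amt in payments:
        if u <= v:
            net[(u, v)] += amt
        else:
            net[(v, u)] -= amt
    return [(u, v, a) if a > 0 else (v, u, -a)
            for (u, v), a in net.items() if a != 0]

# Alice->Bob 5 and Bob->Alice 3 have the same balance impact as Alice->Bob 2,
# which is cheaper to route through the channel graph.
print(net_payments([("alice", "bob", 5), ("bob", "alice", 3)]))
# [('alice', 'bob', 2)]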

*MSc student under supervision of Prof. Ori Rottenstreich

On 20/01/2021 at 11:30
Click here to watch the recorded talk.

Edward Bortnikov

VP Technology, Pliops

On:

The brave new world of storage processors

Storage processors are a new type of acceleration device that offloads centerpiece storage stack functions from legacy software to high-performance hardware. The legacy software paradigms waste flash capacity and performance, pushing the users to deploy expensive and overprovisioned SSDs. For example, the high write amplification induced by inefficient software limits the applicability of the next-generation low-cost flash technologies like QLC and PLC, as well as ZNS SSDs. Storage processors are purpose-built to exploit flash devices wisely and deliver premium performance at a reduced cost. Last but not least, they relieve the CPU from storage-specific processing, enabling higher utilization by customer applications.

In this talk, we present the Pliops Storage Processor (PSP) – a novel accelerator device that decisively improves flash storage performance for a variety of applications and workloads. PSP capitalizes on groundbreaking hardware-supported on-flash key-value (KV) store technology, and features accelerator engines for sorting, compression, data protection, and more. PSP provides a standard block device interface, hence data applications are accelerated transparently, with no need for code changes. For advanced applications, PSP also provides a high-performance KV API which enables SSDs to closely match in-memory performance. We demonstrate the PSP performance advantages for a broad range of applications – block I/O, SQL and NoSQL databases.

On 23/12/2020 at 11:30

Ehud Shapiro

Weizmann Institute

On:

Digital Social Contracts: A Foundation for an Egalitarian and Just Digital Society

Almost two centuries ago Pierre-Joseph Proudhon proposed social contracts — voluntary agreements among free people — as a foundation from which an egalitarian and just society can emerge. A digital social contract is the novel incarnation of this concept for the digital age. Digital social contracts are: conceptually, voluntary agreements among genuinely-identified people, specified, undertaken, and fulfilled in the digital realm; mathematically, a novel computational model specifying concurrent, non-deterministic asynchronous agents updating a monotonic shared ledger with digital speech acts; and computationally, distributed programs realizing this abstract model of computation, where people resolve the nondeterministic choices of the agents they operate.

We will present a formal definition of digital social contracts; outline an abstract design for a social contracts programming language; show, via programming examples, that key application areas, including social community, simple sharing-economy applications, egalitarian and just currency networks, and democratic community governance, can all be expressed elegantly and efficiently as digital social contracts; and outline a distributed-ledger implementation of digital social contracts.

Joint work with Luca Cardelli, Liav Orgad, Gal Shahaf, and Nimrod Talmon

Reference: arXiv:2005.06261 

Bio:

Ehud Shapiro is an interdisciplinary scientist, entrepreneur, artist and, most recently, a politician. He got his B.A./B.Sc. in Mathematics and Philosophy from Tel Aviv University in 1979, a PhD in Computer Science from Yale University in 1982, and has been with the Weizmann Institute ever since.  His PhD research on inductive logic programming and on algorithmic debugging has served as a foundation for these two disciplines. At Weizmann he developed the discipline of concurrent logic programming and the programming language Concurrent Prolog. In 1993 he founded Ubique Ltd., the first Internet social networking software startup.  In 1995 Ubique launched Virtual Places 1.0, a social networking application providing instant-messaging, chat rooms, joint and guided web-surfing, online events and games, and integrated voice-over-IP, 12 years before Facebook.  He sold Ubique to America Online, performed a management-buy-out, and then sold it again to IBM, where its technology was the basis of SameTime, IBM’s successful corporate network communication and collaboration product.

Upon return to Weizmann in 1998 Ehud switched to Biology, and led research projects on biomolecular computers, synthetic biology, and human cell lineage analysis, for which he received two ERC Advanced Grants.  In the last several years he began working on digital democracy, initially with Nimrod Talmon and later also with collaborators and a research team he formed at Weizmann.  Ehud is a bass singer and the founder and artistic director of the Ba-Rock Band. More recently, he founded a new political party, democratit.

On 9/12/2020 at 11:30

Eylon Yogev

Tel Aviv University and Boston University

On:

Adversarially Robust Streaming Algorithms

A streaming algorithm is given a long sequence of items and seeks to compute or approximate some function of this sequence using a small amount of memory. A body of work has been developed over the last two decades, resulting in optimal streaming algorithms for a wide range of problems.
While randomized streaming algorithms are well-studied, the vast majority of them are defined and analyzed in the static setting, where the stream is assumed to be fixed in advance, and only then the randomness of the algorithm is chosen. In many scenarios, this assumption is unrealistic, making the algorithms prone to adversarial attacks, and unfit for real-life applications.
I will investigate the adversarial robustness of streaming algorithms. An algorithm is considered robust if its performance guarantees hold even if the stream is chosen adaptively by an adversary that observes the outputs of the algorithm along the stream and can react in an online manner. I will describe general-purpose methods we have developed to transform standard streaming algorithms to be secure in the adversarial model. These are the first streaming algorithms to work in this model.
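
One of the general-purpose transformations in this line of work is often described as "sketch switching"; the minimal sketch below (class shape and switching rule are our illustrative reading, not the paper's exact construction) conveys the idea of running several independent copies of a static estimator and retiring a copy once the adversary has seen its output move, so no copy's randomness is adaptively exploited for long.

class RobustWrapper:
    # make_estimator is a hypothetical factory for objects exposing
    # update(item) and query(); any static randomized sketch fits.
    def __init__(self, make_estimator, copies, eps):
        self.copies = [make_estimator() for _ in range(copies)]
        self.cur = 0                 # index of the copy currently exposed
        self.eps = eps
        self.published = None        # frozen output the adversary observes

    def update(self, item):
        for c in self.copies:        # every copy processes the whole stream
            c.update(item)

    def query(self):
        est = self.copies[self.cur].query()
        if self.published is None:
            self.published = est
        elif abs(est - self.published) > self.eps * max(abs(self.published), 1):
            # The output moved: retire the exposed copy and switch to a fresh
            # one whose randomness the adversary has never seen reflected.
            self.cur = min(self.cur + 1, len(self.copies) - 1)
            self.published = self.copies[self.cur].query()
        return self.published

Roughly, the number of times the published output can move bounds how many copies are needed, which is what keeps the memory overhead modest for insertion-only streams.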

Bio:
Dr. Eylon Yogev is a postdoc at Tel Aviv University hosted by Omer Paneth and Nir Bitansky and co-affiliated with Boston University hosted by Ran Canetti. Prior to that, he was a research fellow at the Simons Institute and a postdoc at the Technion hosted by Yuval Ishai. He received his Ph.D. and M.Sc. from the Weizmann Institute of Science under the supervision of Moni Naor and completed his B.Sc. in Computer Science at Tel Aviv University. Eylon is broadly interested in cryptography, with a focus on its interplay with the area of algorithm design and analysis.

On 2/12/2020 at 11:30
Click here to watch the recorded talk.

Yatish Turakhia

UC Santa Cruz

On:

Accelerating Biology and Medicine with Hardware Specialization

Genome sequencing data is rising exponentially at a rate (125%/year) that is far higher than our current transistor performance scaling (currently 3%/year). New medical and comparative genomics applications have emerged that ensure that the demand for more sequencing will continue to rise, which threatens to overwhelm our current compute capacities. Domain-specific acceleration (DSA), i.e. using specialized hardware for accelerating a narrow domain of algorithms, will enable us to tap the vast potential of this data by providing massive gains in performance efficiency.
In this talk, I will present the designs of our past co-processors as case-studies for how domain-specific hardware can provide massive speedup (1,000-10,000x) for genomic applications and the various sources of increased efficiency. In the second half of my talk, I will show how this approach would be indispensable to solve a multitude of emerging problems in biology and medicine (including our fight against the current and future pandemics), many of which we will be tackling at my future lab at UC San Diego (UCSD).

Bio:
Dr. Yatish Turakhia is a postdoctoral researcher at the Genomics Institute, UC Santa Cruz, where he is jointly advised by Prof. David Haussler and Prof. Benedict Paten. He will be joining the University of California San Diego (UCSD) next year as an Assistant Professor in the Department of Electrical and Computer Engineering (ECE). His research interests are in developing hardware accelerators and algorithms for faster and cheaper genomic analysis. Dr. Turakhia obtained his PhD from Stanford University in 2019, where he was jointly advised by Prof. Bill Dally and Prof. Gill Bejerano. His work has won the best paper award at ASPLOS 2018 and the IEEE Micro Top Picks award 2018. He is also a past recipient of the NVIDIA Graduate Fellowship.

On 24/11/2020 at 16:30 (unusual time and day)

Clara Shikhelman

On:

Lightning Network Economics: Cost-minimal channels and their implications for network structure

A channel in the Lightning Network allows two parties to secure bitcoin payments and escrow holdings between them. Designed to increase transaction immediacy and reduce blockchain congestion, the Lightning Network has the potential to solve many issues associated with Bitcoin.
In this talk, we study the economics of the Lightning Network. We present conditions under which two parties optimally establish a channel and give explicit formulas for channels’ costs. Using these, we derive implications for the network’s structure under cooperation assumptions among small sets of users. We show both local implications, such as the wastefulness of certain structures, and global implications, such as a (low) upper bound on the Lightning Network’s average degree.

On 11/11/2020 at 11:30

Click here to watch the recorded talk.

Loai Danial

On:

Neuromorphic Data Converters using Memristors

Data converters are ubiquitous in mixed-signal systems, and they have become the computational bottleneck both in traditional data acquisition and in emerging brain-inspired neuromorphic systems, which aim to accelerate artificial intelligence algorithms. Unfortunately, conventional Nyquist data converters trade off speed, power, and accuracy. Therefore, they are exhaustively customized for special-purpose applications. Furthermore, intrinsic real-time and post-silicon variations dramatically degrade their performance as CMOS technology scales down.

This seminar will present novel neuromorphic analog-to-digital (ADC) and digital-to-analog (DAC) converters that are trained using the stochastic gradient descent algorithm to autonomously adapt to different design specifications, including multiple full-scale voltages, numbers of resolution bits, and sampling frequencies. We experimentally demonstrate the feasibility of our converters using commercial integrated memory arrays, comprising two-terminal floating-gate memristive devices fabricated in a standard 180nm CMOS process and operated in the sub-threshold regime. This memristive array efficiently implements, at the analog level, the multiply, accumulate, and update operations essential for trainable artificial synapses in neuromorphic systems. Theoretical analysis, as well as simulation results, shows the collective resilient properties of our converters in application reconfiguration, mismatch calibration, noise tolerance, and power optimization. We believe that our findings will spur the development of very large-scale integrated memristive neuromorphic systems.

*PhD student under supervision of Prof. Shahar Kvatinsky

On 4/11/2020 at 11:30

Roi Ben-Haim

On:

Reliable and Time-efficient Virtualized Function Placement

Reliability and time-efficiency are two key elements to consider in network design. Commonly, each is measured both per service – the availability probability and the latency of a specific service – and system-wide – the average reliability and average latency, weighted by the demand for every service. Intuitively, minimizing latency requires minimizing the number of network elements a service makes use of. In a non-redundant environment, this would also guarantee the maximal reliability of a service, as reliability degrades when the number of elements required to implement the service increases. However, reliability is often guaranteed by allocating redundant, backup resources. We explain that such redundancy, or the joint support for multiple services, can impose a trade-off between the reliability and time-efficiency criteria. We show in which cases an optimal solution to the latency problem is also an optimal solution to the reliability problem, and vice versa. For the cases in which the trade-off exists, we examine whether we can bound the distance to an optimal solution of a given criterion (latency/reliability) using the optimal solution of the other criterion. Our analytical model considers networks following the virtualized paradigm – network functions are implemented on commodity servers and a network service is implemented by chaining functions.

*M.Sc student under supervision of Prof. Ori Rottenstreich

On 14/10/2020 at 11:30

Click here to watch the recorded talk.

Ido Zoref

On:

Efficient Algorithm for Finding Sets of Optimal Routes

In several important routing contexts it is required to identify a set of routes, each of which optimizes a different criterion. For instance, in the context of vehicle routing, one route would minimize the total distance traveled, while other routes would also consider the total travel time or the total incurred cost, or combinations thereof. In general, providing such a set of diverse routes is obtained by finding optimal routes with respect to different sets of weights on the network edges. This can be simply achieved by consecutively executing a standard shortest path algorithm. However, in the case of a large number of weight sets, this may require an excessively large number of executions of such an algorithm, thus incurring a prohibitively large running time.

We indicate that, quite often, the different edge weights reflect different combinations of some “raw” performance metrics (e.g., delay, cost). In such cases, there is an inherent dependency among the different weights of the same edge. This may well result in some similarity among the shortest routes, each of which is optimal with respect to a specific set of weights. In this study, we aim to exploit such similarity in order to improve the performance of the solution scheme.

Specifically, we contemplate edge weights that are obtained through different linear combinations of some (“raw”) edge performance metrics. We establish and validate a novel algorithm that efficiently computes a shortest path for each set of edge weights. We demonstrate that, under reasonable assumptions, the algorithm significantly outperforms the standard approach. Similarly to the standard approach, the algorithm iteratively searches for routes, one per set of edge weights; however, instead of executing each iteration independently, it reduces the average running time by skillfully sharing information among the iterations.
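
For reference, a minimal sketch of the standard baseline the algorithm improves on: one independent Dijkstra run per weight vector, with each edge weight computed as a linear combination of the raw metrics. The graph and tradeoff vectors below are illustrative; the information sharing across iterations, which is the contribution here, is not sketched.

import heapq

def dijkstra(adj, src, alpha):
    # adj: node -> list of (neighbor, raw-metrics tuple); alpha: weight vector.
    dist = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue
        for v, metrics in adj.get(u, []):
            w = sum(a * m for a, m in zip(alpha, metrics))  # combined weight
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

# Two raw metrics per edge, (delay, cost). Three tradeoffs -> three full runs.
adj = {"s": [("a", (1, 10)), ("b", (4, 1))], "a": [("t", (1, 10))], "b": [("t", (4, 1))]}
for alpha in [(1, 0), (0, 1), (0.5, 0.5)]:
    print(alpha, dijkstra(adj, "s", alpha))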

*M.Sc student under supervision of Prof. Ariel Orda

On 23/09/2020 at 11:30

Roi Bar Zur

On:

Efficient MDP Analysis for Selfish-Mining in Blockchains

A proof of work (PoW) blockchain protocol distributes rewards to its participants, called miners, according to their share of the total computational power. Sufficiently large miners can perform selfish mining – deviate from the protocol to gain more than their fair share. Such systems are thus secure if all miners are smaller than a threshold size, so that their best response is to follow the protocol.

To find the threshold, one has to identify the optimal strategy for miners of different sizes, i.e., solve a Markov Decision Process (MDP). However, because of the PoW difficulty adjustment mechanism, the miners’ utility is a non-linear ratio function. We therefore call this an Average Reward Ratio (ARR) MDP. Sapirshtein et al. were the first to solve ARR MDPs by solving a series of standard MDPs that converge to the ARR MDP solution.

We present a novel technique for solving an ARR MDP by solving a single standard MDP. The crux of our approach is to augment the MDP such that it terminates randomly, within an expected number of rounds. We call this Probabilistic Termination Optimization (PTO), and the technique applies to any MDP whose utility is a ratio function. We bound the approximation error of PTO – it is inversely proportional to the expected number of rounds before termination, a parameter that we control. Empirically, PTO’s complexity is an order of magnitude lower than the state of the art.
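
As a toy illustration of probabilistic termination (not the paper's exact construction), one can wrap an MDP's step function so that every step also terminates with probability p, which yields an expected horizon of 1/p rounds; smaller p means a longer expected horizon and, per the bound above, a smaller approximation error.

import random

def with_termination(step, p):
    # step(state, action) -> (next_state, reward); returns a wrapped step
    # function for a standard terminating MDP.
    def wrapped(state, action):
        nxt, reward = step(state, action)
        if random.random() < p:
            return None, reward      # None marks the absorbing terminal state
        return nxt, reward
    return wrapped

# Sanity check: the expected number of rounds before termination is 1/p.
p = 0.01
rounds = []
for _ in range(2000):
    n = 1
    while random.random() >= p:
        n += 1
    rounds.append(n)
print(sum(rounds) / len(rounds))  # close to 100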

PTO can be easily applied to different blockchains. We use it to tighten the bound on the threshold for selfish mining in Ethereum.

*M.Sc student advised by Prof. Ittay Eyal and Prof. Aviv Tamar

On 02/09/2020 at 11:30

Shir Cohen

On:

Not a COINcidence: Sub-Quadratic Asynchronous Byzantine Agreement WHP

King and Saia were the first to break the quadratic word complexity bound for Byzantine Agreement in synchronous systems against an adaptive adversary, and Algorand broke this bound with near-optimal resilience (first in the synchronous model and then with eventual synchrony). Yet the question of asynchronous sub-quadratic Byzantine Agreement remained open. To the best of our knowledge, we are the first to answer this question in the affirmative. A key component of our solution is a shared coin algorithm based on a VRF. A second essential ingredient is VRF-based committee sampling, which we formalize and utilize in the asynchronous model for the first time. Our algorithms work against a delayed-adaptive adversary, which cannot perform after-the-fact removals but has full control of Byzantine processes and full information about communication in earlier rounds. Using committee sampling and our shared coin, we solve Byzantine Agreement with high probability, with a word complexity of Õ(n) and O(1) expected time, breaking the O(n^2) bit barrier for asynchronous Byzantine Agreement.
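
A hedged sketch of VRF-based committee sampling: each process evaluates a verifiable random function on the round's seed and self-selects if the output falls below a threshold, so the expected committee size is fixed while the adversary cannot predict membership in advance. A real VRF also produces a proof that any party can verify; the HMAC below is only a stand-in.

import hmac, hashlib

def in_committee(secret_key: bytes, seed: bytes, committee_size: int, n: int) -> bool:
    digest = hmac.new(secret_key, seed, hashlib.sha256).digest()
    value = int.from_bytes(digest, "big") / 2**256   # pseudo-uniform in [0, 1)
    return value < committee_size / n                # E[committee size] holds

# With n = 1000 and target size 30, each process self-selects independently;
# per-round messages then only involve the ~30 sampled processes.
hits = sum(in_committee(bytes([i % 256, i // 256]), b"round-7", 30, 1000)
           for i in range(1000))
print(hits)  # about 30 in expectation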

*M.Sc student advised by Prof. Idit Keidar

On 10/08/2020 at 11:00

Yael Doweck

On:

Multi-Party Timed Commitments

The problem of obtaining secret commitments from multiple parties and revealing them after a certain time is useful for sealed-bid auctions, games, and other applications. Existing solutions, dating back to Rivest, Shamir and Wagner, either do not scale or rely on synchrony for the commitment phase and trust of t/n parties. We formalize the problem of implementing such commitments with a probabilistic delay and without the aforementioned assumptions as Multi-Party Timed Commitments (MPTC) and present a solution — the Time-Capsule protocol. Like previous approaches, Time Capsule forms a puzzle whose solution reveals the committed values. But unlike previous solutions, no party has an advantage in solving the puzzle, and individual commitments cannot be revealed before the entire set is committed. A particular application of MPTC realizes an advancement in the study of decentralized systems. The state of the art in decentralized systems is manifested in blockchain systems that utilize Proof of Work to achieve censorship resistance. However, they are still vulnerable to frontrunning, an issue that is plaguing operational systems. By adapting Time Capsule, we allow it to be used for Proof of Work, preventing frontrunning by system operators and tuning the puzzle difficulty using the blockchain mechanism.
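
For context, the classic Rivest-Shamir-Wagner construction referenced above builds a puzzle from repeated squaring. The minimal sketch below shows both the solver's forced sequential work and the creator's trapdoor shortcut (the very advantage that, per the abstract, Time Capsule eliminates). The small Mersenne primes are for illustration only.

p, q = 2**31 - 1, 2**61 - 1       # well-known primes; real use needs random ones
n, phi = p * q, (p - 1) * (q - 1)
t = 100_000                       # number of forced sequential squarings
x = 2

# Creator: fast, using the trapdoor phi(n) to reduce the exponent first.
mask = pow(x, pow(2, t, phi), n)

# Solver: t sequential squarings, with no known parallel shortcut.
y = x
for _ in range(t):
    y = y * y % n
assert y == mask

secret = 424242
capsule = secret ^ mask           # commit: hide the value under the mask
print(capsule ^ y)                # reveal after the delay: 424242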

*M.Sc student advised by Prof. Ittay Eyal

On 24/06/2020 at 11:30

Click here to watch the recorded talk.

Mahmood Sharif

On:

The Security of Machine Learning in the Real World and Machine Learning for Personalized Security

In the first part of this two-part talk, I will describe my work in adversarial machine learning (ML). Prior work has demonstrated that ML algorithms are vulnerable to evasion attacks at inference time by so-called adversarial examples. However, the implications of attacks in practice, under real-world constraints, remained largely unknown. To fill the gap, we propose evasion attacks that satisfy multiple objectives (e.g., smoothness and robustness against changes in imaging conditions), and show that these attacks pose a practical threat to ML-based systems. For example, we demonstrate that attackers can realize eyeglasses they can don to evade face recognition or impersonate other individuals. To produce attacks systematically, we develop a framework that can accommodate a wide range of desired objectives, including ones that elude precise specification (e.g., attacks’ inconspicuousness). In a complementary line of work, we propose n-ML—a defense that trains an ensemble of n classifiers to classify inputs by a vote. Unlike other ensemble-based approaches, n-ML trains each classifier to classify adversarial examples differently than other classifiers, thus rendering it difficult for adversarial examples to obtain enough votes to be misclassified. We show that n-ML outperforms state-of-the-art defenses in several settings.

In the second part of the talk, I will describe my work in human factors in security and privacy. Understanding users (their behaviors, perceptions, preferences, …) is helpful for developing user-centered defenses that would be widely adopted and used as intended, thus enhancing users’ security and privacy. In my research, I employ methods from data science and social sciences to draw insights from datasets of various sizes and inform how to improve defenses. In particular, I seek to develop more effective, less intrusive defenses by personalizing them to individual users. I present two proof-of-concept defenses that leverage users’ behavior, preferences, and available context for personalization—one to predict whether users browsing the Internet would be exposed to malicious content, and another to determine users’ comfort with online tracking.

On 17/06/2020 at 10:30

Click here to watch the recorded talk.

Mark D. Hill

University of Wisconsin-Madison

On:

Accelerator-level Parallelism

Computer system performance has improved due to creatively using more transistors (Moore’s Law) in parallel via bit-, instruction-, thread-, and data-level parallelism. With the slowing of technology scaling, a way to further improve computer system performance under energy constraints is to employ hardware accelerators. Each accelerator is a hardware component that executes a targeted computation class faster and usually with (much) less energy. Already today, many chips in mobile, edge and cloud computing concurrently employ multiple accelerators in what we call accelerator-level parallelism (ALP). 

This talk develops our hypothesis that ALP will spread to computer systems more broadly. ALP is a promising way to dramatically improve power-performance to enable broad, future use of deep AI, virtual reality, self-driving cars, etc. To this end, we review past parallelism levels and the ALP already present in mobile systems on a chip (SoCs). We then aid understanding of ALP with the Gables model and charge computer science researchers to develop better ALP “best practices” for: targeting accelerators, managing accelerator concurrency, choreographing inter-accelerator communication, and productively programming accelerators. This is joint work with Vijay Janapa Reddi of Harvard. See also: https://www.sigarch.org/accelerator-level-parallelism/

Bio:

Mark D. Hill (http://www.cs.wisc.edu/~markhill) is John P. Morgridge Professor and Gene M. Amdahl Professor of Computer Sciences at the University of Wisconsin-Madison, where he also has a courtesy appointment in Electrical and Computer Engineering. His research interests include parallel-computer system design, memory system design, and computer simulation. He received the 2019 Eckert-Mauchly Award and is a fellow of IEEE and the ACM. He serves as Chair of the Computer Community Consortium (2018-19) and served as Wisconsin Computer Sciences Department Chair 2014-2017. Hill has a PhD in computer science from the University of California, Berkeley.

On 03/06/2020 at 16:00

Click here or here to watch the recorded talk.

Prof. Emina Soljanin

Rutgers University

On:

Service Rates in Distributed Systems with Redundancy

Applications such as distributed learning and edge computing strive to maximize the number of service requests (e.g., for data access) that can be concurrently executed by the system. Redundancy, in the form of simple replication and erasure coding, has emerged as an efficient and robust way to enable simultaneous access of different data objects by many users competing for the system’s resources. Here, replication and coding of data affect the rates at which users can be simultaneously served. In this talk, we will first introduce the notion of the service rate region of a redundancy scheme, and present some examples where this region is known. We will then explain the recently recognized connections with batch and switch codes and combinatorial optimization on graphs. We will discuss some systems issues as well.

Emina Soljanin is a professor of Electrical and Computer Engineering at Rutgers. Before moving to Rutgers in January 2016, she was a (Distinguished) Member of Technical Staff for 21 years in various incarnations of the Mathematical Sciences Research Center of Bell Labs. Her interests and expertise are wide, currently ranging from distributed computing to quantum information science. She is an IEEE Fellow, an outstanding alumnus of the Texas A&M School of Engineering, the 2011 Padovani Lecturer, a 2016/17 Distinguished Lecturer, and the 2019 President of the IEEE Information Theory Society.

On 04/3/2020 at 11:30

in Taub 201, CS Building
Technion City

Or Sattath

Ben-Gurion University

On:

Quantum money and some of its extensions

One of the unique aspects of quantum mechanics is the no-cloning theorem. This theorem has direct applications in quantum cryptography: quantum money is very similar to the cash we use, but in addition, it is physically impossible to forge.

In this talk, I will discuss quantum money and some of its extensions (quantum coins, semi-quantum money, tokens for digital signatures, quantum copy protection, and quantum lightning). I will also describe a way to use quantum money in a system such as Bitcoin. Here, the main challenge is that the mint is not trusted – unlike the standard setting, where the mint is assumed to be trusted.

On 08/1/2020 at 11:30

in 861, Electrical Eng. Building
Technion City

Ameer Haj Ali

UC Berkeley

On:

Deep Reinforcement Learning in Compilers

Compilers are designed today to use fixed cost models that are based on heuristics to make different code optimizations. However, these models are unable to capture the data dependency, the computation graph, or the organization of instructions. Therefore, with these models compilers achieve suboptimal performance. In his talk, Ameer will show how to use deep reinforcement learning to overcome such hard compiler challenges. This includes the NP-hard compiler phase-ordering challenge (AutoPhase, FCCM 2019, SysML 2020) and automatically tuning compiler pragmas such as vectorization (NeuroVectorizer, CGO 2020).

 

On 08/1/2020 at 15:30

in 815, Electrical Eng. Building
Technion City

24/12/2019 at 14:30

Ran Ben Basat

Harvard University

On:

Leveraging Programmable Switches for In-network Computing

The network line rate is constantly on the rise to support the exploding amounts of data. This means that we have less time to process individual packets, despite a growing demand for better network telemetry. Moreover, CPU speeds are not rising at the same rate as we near the end of Moore’s law, making it harder to rely on software computations. Programmable hardware switches are an emerging technology that enables flexible packet processing while optimizing for throughput and latency.
In this talk, I will introduce algorithms that leverage programmable switches for accelerating database operations and for improving network telemetry. Switches are orders of magnitude more efficient than traditional hardware accelerators, exist in the datapath, and are ideal for computation offloading. For telemetry, we will discuss how switches can probabilistically encode information across multiple packets to provide fine-grained network visibility with minimal overheads.

Bio:

Ran is a postdoctoral fellow at Harvard University working on network monitoring and algorithms. He is partially sponsored by the Zuckerman and the Hiroshi Fujiwara Cyber Security Research Center postdoctoral fellowships. Ran received his B.Sc., M.Sc., and Ph.D. from the Computer Science Department at the Technion.

18/12/2019 at 11:30

Dr. Eyal Ronen

Tel Aviv University and Catholic University of Leuven

 

On:

 

Breaking (and Fixing) Real World Crypto

In recent years, new forms of communication between people and devices have revolutionized our daily lives. The Internet has become the leading platform for human interaction (e.g., social networks), commerce, information, and also control of physical devices (e.g., Internet of Things). This new connectivity creates new security and privacy risks for individual users and organizations. It also increases the complexity and diversity of the different security and cryptographic solutions we need to protect against increasingly sophisticated and motivated attackers. Designing and implementing a secure system is a very elusive process. One needs to clearly identify the security targets (e.g., maintaining the confidentiality of the messages or preventing access from non-authorized entities) as well as the adversarial capabilities. In this talk, I will show how we can combine cryptanalytic techniques with various side-channels to break the security guarantees of real-world implementations of cryptographic protocols, and how novel solutions can help mitigate the root causes of these vulnerabilities.

27/11/2019 at 11:30

Neta Rozen-Schiff

On:

Preventing (Network) Time Travel with Chronos

The Network Time Protocol (NTP) synchronizes time across computer systems over the Internet. Unfortunately, NTP is highly vulnerable to “time shifting attacks”, in which the attacker’s goal is to shift forward/backward the local time at an NTP client. NTP’s security vulnerabilities have severe implications for time-sensitive applications and for security mechanisms, including TLS certificates, DNS and DNSSEC, RPKI, Kerberos, BitCoin, and beyond. While technically NTP supports cryptographic authentication, it is very rarely used in practice and, worse yet, time-shifting attacks on NTP are possible even if all NTP communications are encrypted and authenticated.

We present Chronos, a new NTP client that achieves good synchronization even in the presence of powerful attackers who are in direct control of a large number of NTP servers. Importantly, Chronos is backwards compatible with legacy NTP and involves no changes whatsoever to NTP servers. Chronos leverages ideas from distributed computing literature on clock synchronization in the presence of adversarial (Byzantine) behavior. A Chronos client iteratively “crowdsources” time queries across multiple NTP servers and applies a provably secure algorithm for eliminating “suspicious” responses and averaging over the remaining responses. Chronos is carefully engineered to minimize communication overhead so as to avoid overloading NTP servers.
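As a rough illustration of the crowdsourcing step (a hedged sketch of our own, not the provably secure algorithm from the paper, which is more involved), a client might query a sample of servers, discard the most extreme responses as suspicious, and average the rest:

```python
import random

def crowdsourced_offset(servers, query, sample_size=15, trim=4):
    """Hedged sketch of Chronos-style time crowdsourcing (illustrative only).

    servers: list of NTP server identifiers; query(s) returns server s's
    reported offset from the local clock, in seconds. The `trim` smallest
    and `trim` largest responses are discarded as suspicious, and the
    surviving responses are averaged.
    """
    sampled = random.sample(servers, min(sample_size, len(servers)))
    offsets = sorted(query(s) for s in sampled)
    surviving = offsets[trim:len(offsets) - trim]
    return sum(surviving) / len(surviving)
```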

We evaluate Chronos’ security and network efficiency guarantees via a combination of theoretical analyses and experiments with a prototype implementation. Our results indicate that to succeed in shifting time at a Chronos client by over 100ms from UTC, even a powerful man-in-the-middle attacker requires over 20 years of effort in expectation.

This is a joint work with Omer Deutsch, Danny Dolev, Michael Schapira.

Bio:

Neta Rozen-Schiff is a research associate at the School of Computer Science and Engineering of the Hebrew University of Jerusalem, where she works with Prof. David Hay and Prof. Michael Schapira. Her research interests include optimization and control in datacenters, congestion control, network security, and software-defined networking. Prior to joining the Hebrew University, Neta spent a decade in industry, working as a software engineer and as an operations researcher. Neta holds a B.Sc. and an M.Sc. in Mathematics from Bar-Ilan University, and a Ph.D. from the Technion, awarded in 2004, 2005, and 2011, respectively.

20/11/2019 at 11:30

Noam Leiter

On:

Beresheet’s Journey to The Moon

SpaceIL made history when Beresheet, the first privately funded spacecraft, reached the Moon on a minute budget of $100M.

After eight years of planning, Beresheet’s journey started on February 22nd, 2019, when SpaceIL’s lunar lander was launched into space atop a Falcon 9 launcher. During its 49-day journey through space, Beresheet performed a series of successful autonomous maneuvers that brought it from Earth orbit to lunar orbit, and on April 11th, 2019, after orbiting the Moon for seven days, the landing sequence was initiated. A few minutes into the landing, a series of events ended in a main-thruster shutdown, and Beresheet crashed into the Moon, nevertheless adding Israel to the exclusive club of countries that have reached the Moon and inspiring millions across the globe to dream big.

This talk will unravel some of the algorithmic and software technological challenges of this fascinating journey and the important lessons learned.

Bio:

Noam Leiter received his B.Sc. in 2011 and M.Sc. in 2015, both in aerospace engineering from the Technion. He worked as a control engineer at SpaceIL from 2014 until 2019, developing spacecraft guidance and control algorithms. He is currently pursuing a Ph.D. in the Faculty of Aerospace Engineering at the Technion under the supervision of Prof. Dan Zelazo.

13/11/2019 at 11:30

Michael Gandelman

On:

Treeplication: An Erasure Code for Decentralized Storage Systems

In this talk we present a new (rateless) erasure code called Treeplication, which features the benefits of both coding and replication. From the system’s perspective, Treeplication allows recovering large distributed data items with better probability than replication, and with lower complexity than known erasure codes. From the coding perspective, a Treeplication code for k data fragments is defined on a binary tree with 2k-1 vertices, along with a distribution for selecting code fragments from the tree layers. The tree structure allows optimizing the recoverability of random subsets of code fragments, while at the same time behaving similarly to replication in recovering individual data fragments (in particular, most code symbols in the optimal distributions turn out to be systematic information symbols). In addition to optimizing Treeplication codes for recoverability, our results include analysis of their recovery communication cost, dynamic optimization of their redundancy, and an algorithm for decentralized maintenance of a Treeplication-coded system.
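As loose intuition only (a hedged sketch of our own; the actual Treeplication construction and its layer distributions are more refined), one can picture a code on a binary tree with 2k-1 vertices by placing the k data fragments at the leaves and letting each internal vertex hold the XOR of its two children, so any vertex is recoverable either from its two children or from its parent and sibling:

```python
# Hedged, illustrative sketch only: a heap-indexed binary tree of XOR
# fragments. Structure and names are our own assumptions, not the talk's.

def build_tree(fragments):
    """fragments: k equal-length byte strings, k a power of two.
    Returns t[1..2k-1]; leaves sit at t[k..2k-1], internal vertices
    hold the XOR of their two children."""
    k = len(fragments)
    t = [None] * (2 * k)                  # index 0 unused (heap indexing)
    t[k:] = list(fragments)
    for i in range(k - 1, 0, -1):
        t[i] = bytes(a ^ b for a, b in zip(t[2 * i], t[2 * i + 1]))
    return t

def recover(t, i):
    """Recover vertex i from its children, or from its parent + sibling."""
    k = len(t) // 2
    if i < k:                             # internal vertex: XOR of children
        return bytes(a ^ b for a, b in zip(t[2 * i], t[2 * i + 1]))
    # leaf: parent XOR sibling gives the leaf back
    return bytes(a ^ b for a, b in zip(t[i // 2], t[i ^ 1]))

t = build_tree([b"\x01", b"\x02", b"\x04", b"\x08"])
assert recover(t, 5) == t[5]              # a leaf rebuilt from relatives
```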

*M.Sc student advised by Prof. Yuval Cassuto

10/07/2019 at 11:30

Lilia Daniel

Viterbi Faculty of Electrical Engineering, Technion

On:

Preserving locality across virtual and physical address spaces to accelerate address translation for virtual machines

Applications’ memory consumption is growing rapidly, which increases their dependence on the memory system. Since each virtual-to-physical memory translation in modern systems requires cache lookups or multiple memory accesses, the run time overhead of virtual memory translation is on the rise. Meanwhile, the increasing prevalence of the cloud ecosystem and server consolidation moves more applications to run in virtual machines. This further aggravates the overhead of memory translation; in virtual machines, the processor needs to translate a virtual address in the guest to a physical address in the host, which quadruples the number of memory accesses required for translation. To alleviate the translation overhead, modern processors include additional levels of translation caches, such as 2nd level TLB and partial walk caches.

Our observations show that a careful construction of the page table can make better use of the added translation caches. We introduce a smart physical frame allocation algorithm as a complement to current mapping schemes in the virtual-machine memory subsystem. The main idea behind the algorithm is to map a contiguous region for each application in order to exhibit spatial locality. We show that careful allocation can improve performance by up to 50% for some applications.
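To illustrate the flavor of the idea (a minimal sketch of our own, not the thesis’s algorithm), a frame allocator can prefer handing each application a contiguous run of physical frames, so consecutive virtual pages map to consecutive physical frames and each cached translation covers a larger region:

```python
# Hedged sketch: first-fit allocation of contiguous physical frame runs.
# All names are illustrative; the actual allocator in the work is smarter.

class ContiguousFrameAllocator:
    def __init__(self, num_frames):
        self.free = [True] * num_frames    # True = frame is free

    def alloc_run(self, n):
        """Return the first index of n contiguous free frames, or None."""
        run_start = run_len = 0
        for i, is_free in enumerate(self.free):
            if not is_free:
                run_len = 0
                continue
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len == n:
                for j in range(run_start, run_start + n):
                    self.free[j] = False   # mark the whole run as allocated
                return run_start
        return None                        # no run found; caller may fall back

alloc = ContiguousFrameAllocator(1024)
base = alloc.alloc_run(64)                 # 64 consecutive frames for one app
```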

*     M.Sc. seminar under supervision of Prof. Yoav Etsion

3/07/2019 at 11:30

Haggai Eran, Meni Orenbach and Tanya Brokhman

The Accelerated Computing Systems Lab, Technion

On:

NICA: An Infrastructure for Inline Acceleration of Network Applications

Haggai Eran, Technion—Israel Institute of Technology & Mellanox Technologies; Lior Zeno, Maroun Tork, Gabi Malka, and Mark Silberstein, Technion—Israel Institute of Technology

With rising network rates, cloud vendors increasingly deploy FPGA-based SmartNICs (F-NICs), leveraging their inline processing capabilities to offload hypervisor networking infrastructure. However, the use of F-NICs for accelerating general-purpose server applications in clouds has been limited.

NICA is a hardware-software co-designed framework for inline acceleration of the application data plane on F-NICs in multi-tenant systems. A new ikernel programming abstraction, tightly integrated with the network stack, enables application control of F-NIC computations that process application network traffic, with minimal code changes. In addition, NICA’s virtualization architecture supports fine-grain time-sharing of F-NIC logic and provides I/O path virtualization. Together these features enable cost-effective sharing of F-NICs across virtual machines with strict performance guarantees.

We prototype NICA on Mellanox F-NICs and integrate ikernels with the high-performance VMA network stack and the KVM hypervisor. We demonstrate significant acceleration of real-world applications in both bare-metal and virtualized environments, while requiring only minor code modifications to accelerate them on F-NICs. For example, a transparent key-value store cache ikernel added to the stock memcached server reaches 40 Gbps server throughput (99% line-rate) at 6 μs 99th-percentile latency for 16-byte key-value pairs, which is 21× the throughput of a 6-core CPU with a kernel-bypass network stack. The throughput scales linearly for up to 6 VMs running independent instances of memcached.

CoSMIX: A Compiler-based System for Secure Memory Instrumentation and Execution in Enclaves

Meni Orenbach, Technion; Yan Michalevsky, Anjuna Security; Christof Fetzer, TU Dresden; Mark Silberstein, Technion

Hardware secure enclaves are increasingly used to run complex applications. Unfortunately, existing and emerging enclave architectures do not allow secure and efficient implementation of custom page fault handlers. This limitation impedes in-enclave use of secure memory-mapped files and prevents extensions of the application memory layer commonly used in untrusted systems, such as transparent memory compression or access to remote memory.

CoSMIX is a Compiler-based system for Secure Memory Instrumentation and eXecution of applications in secure enclaves. A novel memory store abstraction allows implementation of application-level secure page fault handlers that are invoked by a lightweight enclave runtime. The CoSMIX compiler instruments the application memory accesses to use one or more memory stores, guided by a global instrumentation policy or code annotations without changing application code.

The CoSMIX prototype runs on Intel SGX and is compatible with popular SGX execution environments, including SCONE and Graphene. Our evaluation of several production applications shows how CoSMIX improves their security and performance by recompiling them with appropriate memory stores. For example, unmodified Redis and Memcached key-value stores achieve about 2× speedup by using a self-paging memory store while working on datasets up to 6× larger than the enclave’s secure memory. Similarly, annotating a single line of code in a biometric verification server changes it to store its sensitive data in Oblivious RAM and makes it resilient against SGX side-channel attacks.

GAIA: An OS Page Cache for Heterogeneous Systems

Tanya Brokhman, Pavel Lifshits, and Mark Silberstein, Technion

We propose a principled approach to integrating GPU memory with an OS page cache. We design GAIA, a weakly-consistent page cache that spans CPU and GPU memories. GAIA enables the standard mmap system call to map files into the GPU address space, thereby enabling data-dependent GPU accesses to large files and efficient write-sharing between the CPU and GPUs. Under the hood, GAIA (1) integrates lazy release consistency protocol into the OS page cache while maintaining backward compatibility with CPU processes and unmodified GPU kernels; (2) improves CPU I/O performance by using data cached in GPU memory, and (3) optimizes the readahead prefetcher to support accesses to files cached in GPUs. We prototype GAIA in Linux and evaluate it on NVIDIA Pascal GPUs. We show up to 3× speedup in CPU file I/O and up to 8× in unmodified realistic workloads such as Gunrock GPU-accelerated graph processing, image collage, and microscopy image stitching.

27/06/2019 at 13:30

Daniel Firestone

Azure Host Networking

On:

Accelerating Software Defined Networking in Microsoft Azure

Modern cloud architectures rely on each server running a software-defined networking stack to implement policy and virtualization. However, these networking stacks are becoming increasingly complex as features are added and as network speeds increase. Running these stacks on CPU cores takes away processing power from VMs, increasing the cost of running cloud services and adding latency and variability to network performance. This has led us to invest in technologies to accelerate SDN in NIC hardware without sacrificing software programmability. We’ll present Azure Accelerated Networking, which uses FPGA-based Azure SmartNICs to provide direct host-bypass networking to VMs while implementing SDN policy, and describe our evolution from soft switching to programmable SmartNICs, comparing what we can do in switches vs. hosts, and different SmartNIC approaches. We’ll also talk about our related work on Cloud DPDK, with which we’ve enabled accelerated NFV scenarios inside Azure VMs on top of SmartNICs.

Bio:

Daniel Firestone is the principal tech lead and group manager for the Azure Host Networking team. His team builds the Azure virtual switch, which serves as the datapath for Azure virtual networks, load balancers, security groups, QoS, and more, as well as SmartNIC, the Azure platform for offloading host network functions to reconfigurable FPGA hardware, and the RDMA accelerated transport layer behind Azure Storage. Before Azure, Daniel did his undergraduate studies at MIT.

24/06/2019 at 14:30

Prof. Rachit Agarwal

Cornell University

On:

Networks that Never Drop Packets

Datacenter congestion control remains a hard problem. The root of the problem is that the ability of end-hosts to react to congestion is fundamentally limited by the timeliness and precision of congestion signals from the network. Unfortunately, despite decades of research, the community is still in search of such timely and precise congestion signals. In this talk, I will discuss a possible new approach to the datacenter congestion control problem: designing network fabrics that never drop packets at network switches. This localizes congestion to the egress queues of the end-hosts, precisely where congestion control is implemented. I will discuss how this approach can not only achieve latency and throughput close to state-of-the-art protocols in the common-case scenario, but also provide worst-case performance bounds. I will also discuss how this approach opens up many interesting avenues of systems research.

Bio:

Rachit Agarwal is an assistant professor at Cornell University, where he leads a wonderful group of students working on systems, networking and security. Prior to coming to Cornell, Rachit was a workaholic postdoc at UC Berkeley, a lazy PhD student at UIUC and a mischievous undergrad at IIT Kanpur. His webpage has more details on his students and their projects.

05/06/2019 at 11:30

Maroun Tork

Viterbi Faculty of Electrical Engineering, Technion

On:

Lynx: a SmartNIC-driven accelerator-centric architecture for network servers

Today’s ever-rising demands for fast computing services and growing network rates are driving the deployment of a wide range of compute (GPUs, TPUs, FPGAs) and I/O (programmable SmartNICs) hardware accelerators in data centers.

This work explores new opportunities afforded by these technologies to improve the performance and efficiency of hardware-accelerated computing services.

To this end, we propose Lynx, an accelerator-centric network server architecture which offloads the server data and control plane to the SmartNIC and enables direct networking from accelerators via a lightweight hardware-friendly I/O mechanism. Lynx enables the design of hardware-accelerated network servers that run without CPU involvement, freeing CPU cores and improving performance isolation for accelerated services.

We implement and evaluate Lynx on two GPUs and two SmartNIC architectures, one with an FPGA and another with an 8-core ARM processor. Lynx achieves over 4× higher throughput for a GPU-centric face verification server and 25% higher throughput for a real-life neural network inference service compared to a traditional host-centric approach. For this workload, a SmartNIC-driven Lynx-based server is projected to scale up to 100 GPUs using a single SmartNIC without using the host CPU.

*     M.Sc. seminar under supervision of Prof. Mark Silberstein.

29/05/2019 at 11:30

Oded Green

Nvidia USA

On:

Memory Channels Are Needed for High Performance Data Analytics

Abstract:

Graph processing is typically considered to be a memory-bound rather than compute-bound problem. One common line of thought is that more available memory bandwidth corresponds to better graph processing performance. However, in this talk, I will show that this is not necessarily the case. I will demonstrate that the key factor in the utilization of the memory system for graph algorithms is not the raw bandwidth, or even the latency of memory requests, but instead the number of memory channels available to handle small data transfers with low locality.

This work was done in collaboration with James, Dr. Jeff Young, Dr. Jun Shirako, and Prof. David Bader.

Using several widely used graph frameworks, including Gunrock (on the GPU) and GAPBS & Ligra (for CPUs), we characterize two very distinct memory hierarchies with respect to key graph analytics kernels. Our results show that the differences in peak bandwidths of several of the Pascal-generation GPU memory subsystems aren’t reflected in the performance of various analytics. Furthermore, our experiments on CPU and Xeon Phi systems show that the number of memory channels utilized can be a decisive factor in performance across several different applications. For CPU systems with smaller thread counts, the memory channels can be underutilized while systems with high thread counts can oversaturate the memory subsystem, which leads to limited performance. Lastly, we model the performance of including more channels with narrower access widths than those found in existing memory subsystems, and we analyze the trade-offs in terms of the two most prominent types of memory accesses found in graph algorithms, streaming and random accesses.

Bio:

Dr. Oded Green is a Senior Graph Software Engineer with NVIDIA’s AI infrastructure. Oded is also currently an Adjunct Research Scientist at the Georgia Institute of Technology (Georgia Tech) in Computational Sciences and Engineering, where he also received his PhD. Oded received both his MSc in electrical engineering and his BSc in computer engineering from Technion – Israel Institute of Technology.

Oded’s research primarily focuses on improving the performance and scalability of large-scale data analytics, with an emphasis on static & dynamic graph analytics. In recent years, Oded has also worked on designing and implementing efficient sorting algorithms for a wide range of accelerators, including GPUs. Oded is also very interested in architecture-algorithm codesign.

15/05/2019 at 11:30

Alex Manuskin

Viterbi Faculty of Electrical Engineering, Technion

On:

Ostraka: Blockchain Scaling by Node Sharding

With their introduction in 2009, cryptocurrencies brought the promise of a global decentralized payment system. However, protocol limitations prevent them from managing the required throughput. Previous work overcame protocol limitations and used low-degree sharding to allow for parallelization. All these systems rely on efficient validation of the transactions, in each shard or globally.

We present Ostraka, an architecture for arbitrary scaling of blockchain nodes that does not affect the security properties of the underlying protocol. Ostraka shards the nodes themselves, embracing the fact that major actors have the resources to deploy multi-server nodes. Our novel sharding and communication scheme removes bottlenecks while avoiding exposure to Denial-of-Service attacks, and may be of independent interest.

We evaluate analytically and experimentally block propagation and processing in various settings. Ostraka achieves linear scaling when the network allows it. Nodes reach a rate of nearly 400,000 transactions per second with 64 shards.

*       M.Sc. seminar under supervision of Prof. Ittay Eyal.

01/05/2019 at 11:30

Oded Naor

Viterbi Faculty of Electrical Engineering, Technion

On:

Teechain: A Secure Asynchronous Blockchain Payment Network

In recent years, cryptocurrencies, and blockchain as their underlying technology, have emerged. They act as a decentralized ledger, i.e., an append-only log, where payments are recorded. One of the key limitations of almost all blockchains is the overall throughput; e.g., in Bitcoin, the first cryptocurrency, the overall throughput is about 7 transactions per second (TPS), whereas VISA’s maximum throughput is three to four orders of magnitude higher. One of the main reasons for this problem is that the state of the blockchain is replicated among all participants. This limitation is known as the scalability problem and is one of the most researched topics in this field.

One of the proposed solutions for tackling the scalability problem of cryptocurrencies is payment channel networks. In this approach, users interact with the blockchain to place a deposit transaction, and they use this deposit to form a payment channel with other users. Then, all the payments on the channel are made off-chain, i.e., not written on the blockchain at all, which saves space on the blockchain. When one of the parties using the channel wants to settle, she writes a settlement transaction, which redistributes the initial deposits according to the last state of the payment channel. This approach removes all the non-initial and settlement transactions from the blockchain, thus freeing up space on the blockchain for other transactions.

Teechain is a proposed payment channel network protocol that uses trusted execution environments (TEEs): secure hardware, found on almost all recent commodity CPUs, that offers strong guarantees for code running on it, such as confidentiality, integrity, and the ability to attest to the validity of that code.

In this research thesis, we explore the Teechain protocol, describe it, and show various evaluation results of the implementation of the protocol. This work focuses on the security of the protocol: it defines the key security properties any payment network should have, and rigorously analyzes the Teechain protocol to formally prove that it achieves the desired security guarantees. The proof is carried out in the Universal Composability (UC) framework. In a nutshell, a proof in the UC framework is done by describing an ideal functionality in an ideal world, which acts as a trusted third party that captures the desired characteristics of the protocol. The proof consists of showing that any execution in the real world, with parties interacting with the real protocol (i.e., Teechain), is indistinguishable from a simulated execution which interacts with the ideal functionality in the ideal world. Thus, every attack in the ideal world is indistinguishable from a similar attack in the real world, and if the ideal functionality is secure, then so is the real-world protocol.

17/04/2019 at 11:30

Vitaly Shmatikov

Cornell Tech

 On:

What Are Machine Learning Models Hiding?

Modern machine learning models exhibit super-human accuracy on tasks from image classification to natural-language processing, but accuracy does not tell the entire story of what these models have learned. Does a model memorize and leak its training data? Does it contain hidden backdoor functionality? In this talk, I will explain why common metrics of model quality may hide potential security and privacy vulnerabilities, and outline recent results and open problems at the junction of machine learning and privacy research.

Bio:

Vitaly Shmatikov is a professor of computer science at Cornell Tech, where he works on computer security and privacy. Vitaly’s research group received the Caspar Bowden PET Award for Outstanding Research in Privacy Enhancing Technologies in 2008, 2014, and 2018, and multiple best-paper awards. Prior to joining Cornell, Vitaly worked as a faculty member at the University of Texas at Austin and computer scientist at SRI International.

13/03/2019 at 11:30

János Tapolcai

Budapest University of Technology and Economics, Hungary

On:

Network Vulnerabilities against Disasters

Abstract:

We have witnessed worldwide network outages after large-scale natural disasters, such as earthquakes, hurricanes, and terrorist attacks. In the talk, I will discuss how to extend the widely implemented single-link failure models to regional failures. I will mainly focus on the related theoretical problems in the domain of combinatorics and computational geometry. First, I will show that operators need to prepare their network against only a small number of failure states. Second, I will introduce a stochastic model of geographically correlated link failures caused by disasters, which can estimate the hazards. With the proposed techniques, a more comprehensive service availability analysis can be performed to answer many related questions, such as where to place VMs in order to guarantee a certain SLA.

Bio:

János Tapolcai received the M.Sc. degree in technical informatics and the Ph.D. degree in computer science from the Budapest University of Technology and Economics (BME), Budapest, in 2000 and 2005, respectively, and the D.Sc. degree in engineering science from the Hungarian Academy of Sciences (MTA) in 2013. He is currently a Full Professor with the High-Speed Networks Laboratory, Department of Telecommunications and Media Informatics, BME. He has authored over 150 scientific publications. His current research interests include applied mathematics, combinatorial optimization, optical networks, and IP routing, addressing, and survivability. He was a recipient of several Best Paper Awards, including ICC’06, DRCN’11, HPSR’15, and NaNa’16. He is a winner of the MTA Lendület Program and the Google Faculty Award in 2012, and the Microsoft Azure Research Award in 2018. He was a TPC member of IEEE INFOCOM 2012-2017 and the general chair of ACM SIGCOMM 2018.

6/03/2019 11:30

Shay Vargaftik

Viterbi Faculty of Electrical Engineering, Technion

On:

Dis-aggregated programmable switching and distributed load balancing

Nowadays, as computer networks continuously increase in size and speed, making efficient use of compute and network resources becomes exceptionally challenging. At the heart of this challenge are scheduling and load balancing techniques that are expected to optimize resource usage and, in turn, minimize costs. In this talk we shall address two such emerging network challenges.
In the first part of the talk we shall present dRMT (disaggregated Reconfigurable Match-Action Table), a new architecture for programmable switches. dRMT overcomes two significant restrictions of RMT, the predominant pipeline-based architecture for programmable switches, namely: (i) table memory is local to an RMT pipeline stage, implying that memory not used by one stage cannot be reclaimed by another, and (ii) RMT is hard-wired to always sequentially execute matches followed by actions as packets traverse pipeline stages. dRMT resolves both issues by disaggregating the memory and compute resources of a programmable switch. At the heart of dRMT is a new scheduling technique that can execute programs at line rate with fewer processors compared to RMT, and it avoids performance cliffs when there are not enough processors to run a program at line rate.
In the second part of the talk we shall focus on load balancing. Specifically, we will introduce “Local-Shortest-Queue” (LSQ), a new family of load balancing algorithms. Roughly speaking, in LSQ each load balancer (a.k.a. dispatcher) keeps a local view of the server queue lengths and routes jobs to the shortest among them. Communication is used only to update the local views and make sure that they are not too far from the real queue lengths. We will show how LSQ policies significantly outperform well-known low-communication policies that have been considered for the multi-dispatcher scenario, such as JSQ(d) and JIQ, while using a similar communication budget.
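As a rough illustration of the LSQ idea (a hedged sketch of our own; the policies and view-update rules analyzed in the work are more general), each dispatcher routes to the shortest queue in its possibly stale local view and communicates only occasionally to refresh that view:

```python
import random

class LSQDispatcher:
    """Hedged sketch of a Local-Shortest-Queue dispatcher (illustrative only).

    servers: objects exposing enqueue(job) and queue_len()."""

    def __init__(self, servers):
        self.servers = servers
        self.view = [0] * len(servers)      # local (possibly stale) view

    def dispatch(self, job):
        i = self.view.index(min(self.view)) # shortest queue in the local view
        self.servers[i].enqueue(job)
        self.view[i] += 1                   # optimistic local update

    def sync(self):
        """Occasional communication: refresh one random server's entry."""
        i = random.randrange(len(self.servers))
        self.view[i] = self.servers[i].queue_len()
```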

* Ph.D. Advisors: Prof. Ariel Orda and Prof. Isaac Keslassy.

27/02/2019 11:30

Tanya Brokhman

Viterbi Faculty of Electrical Engineering, Technion

On:

GAIA: An OS Page Cache for Heterogeneous Systems

We propose a principled approach to integrating GPU memory with an OS page cache. We design GAIA, a weakly-consistent page cache that spans CPU and GPU memories. GAIA enables the standard mmap system call to map files into the GPU address space, thereby enabling data-dependent GPU accesses to large files and efficient write-sharing between the CPU and GPUs. Under the hood, GAIA (1) integrates a lazy release consistency protocol among physical memories into the OS page cache while maintaining backward compatibility with CPU processes and unmodified GPU kernels; (2) improves CPU I/O performance by using data cached in GPU memory; and (3) optimizes the readahead prefetcher to support accesses to files cached in GPUs.
We prototype GAIA in Linux and evaluate it on NVIDIA Pascal GPUs. We show up to 3× speedup in CPU file I/O and up to 8× in unmodified realistic workloads such as Gunrock GPU-accelerated graph processing, image collage, and microscopy image stitching.

Joint Work with Pavel Lifshits, Mark Silberstein.

12/12/2018 11:30, Meyer 861

Oak: Off Heap Allocated Keys for Big Data Analytics

Hagar Meir

Oak is a scalable concurrent key-value map designed for real-time big data analytics. Oak offloads the data from the virtual machine heap in managed-memory languages like Java, thereby scaling to order-of-magnitude larger volumes of data than the current state-of-the-art and reducing garbage collection overheads to essentially zero. Oak is optimized for big keys and values and for frequent incremental maintenance of existing values, as prevalent in streaming analytics use cases. To this end, it adopts a zero-copy approach to data update and retrieval, e.g., through concurrent update-in-place. Oak’s API is similar to that of Java’s ConcurrentNavigableMap with adjustments for efficient zero-copy implementation. It provides strong (atomic) semantics for read, write, and various read-modify-write operations, such as compute (in-situ update) and put-if-absent, as well as (non-atomic) ascending and descending iterators.

9/12/2018 11:30, Meyer 861

Network Analytics at Scale

Shir Landau-Feibish

The drastically growing scale of today’s networks makes managing them a significant challenge. Timely detection of and response to events such as congestion, failure, and attack are crucial for proper network operation and require analyzing voluminous traffic quickly and accurately. To do so, we must devise new techniques for network monitoring and control that identify and fix problems when and where they happen. In this talk I will present two results that address detection and mitigation of common network problems: the first is queue buildup and the second is zero-day attacks.

First, I will present a system for real-time detection of queue buildup in programmable switches. Short-lived traffic surges can cause periods of unexpectedly high queue utilization and may lead to packet loss. We will present a system that detects congestion as it forms, and identifies the flows causing queue buildup within the data plane using P4. We show that our system accurately targets the responsible flows at the sub-millisecond level. This is a joint work with Xiaoqi Chen, Yaron Koral, Jennifer Rexford and Ori Rottenstreich.

Second, I will present a system for automatic signature extraction for zero-day attacks.
Attack signatures, which include one or more strings (or regular expressions) common to packets in an attack, are usually generated a priori and then used in intrusion detection systems to identify certain content in future traffic. However, existing signatures cannot assist in detecting yet-unknown attacks. We present a system for automatic extraction of signatures for zero-day Distributed Denial of Service (DDoS) attacks. Our system finds popular strings of variable length in a set of packets, using the classic Space-Saving heavy-hitters algorithm as a building block, as sketched below. This is a joint work with Yehuda Afek and Anat Bremler-Barr.
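Since Space-Saving is the stated building block, a minimal sketch of that classic algorithm may help (our own illustration; the system’s variable-length string extraction built on top of it is more involved):

```python
def space_saving(stream, k):
    """Sketch of the classic Space-Saving heavy-hitters algorithm.

    Tracks at most k items; returns item -> estimated count. Estimates
    may overcount (by at most the evicted minimum), never undercount."""
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k:
            counters[item] = 1
        else:
            # Evict the minimum-count item and let the newcomer inherit
            # its counter, keeping the estimate an upper bound.
            victim = min(counters, key=counters.get)
            counters[item] = counters.pop(victim) + 1
    return counters

print(space_saving("aababcabad", k=2))  # 'a' (the true heavy hitter) survives
```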

28/11/2018 11:30, Meyer 861

New Paradigms for Cryptographic Hashing

Ilan Komargodski

Cryptographic hash functions are the basis of many important and far reaching results in cryptography, complexity theory, and beyond. In particular, hash functions are the primary building block of fundamental applications like digital signatures and verifiable computation, and they are the tool underlying the proofs-of-work which drive blockchains.

Because of the central role of cryptographic hash functions in both theory and practice, it is crucial to understand their security guarantees, toward basing applications on the minimal possible notion of security. Indeed, there are many ways to formalize the security requirement of a hash function; each way is sufficient for different applications. Also, there are many candidate hash functions, offering various trade-offs between security, efficiency, and other desirable properties.

In this talk, I will present an application [Komargodski-Naor-Yogev, FOCS 2017] of a relatively weak notion of hashing (collision resistance) that goes well beyond cryptography into a fundamental problem in the intersection of complexity theory and combinatorics — the Ramsey problem. This will lead us to new emerging aspects of hash functions, including relaxed security notions and their applications, addressing a recent attack on SHA-1. I will conclude with several exciting open problems and challenges.

21/11/2018 11:30, Meyer 861

Vulnerability, Security and Privacy at the Edge of Computing

Anupam Chattopadhyay

School of Computer Science and Engineering Nanyang Technological University, Singapore

The steady rise of intelligence and autonomy over a scale of distributed, connected and smart components is heralding the era of the Internet of Intelligence. Without adequate safeguards in place, such components can lead to terrible consequences. In this talk, I will discuss three aspects, namely vulnerability, security, and privacy, of such edge computing scenarios. The vulnerability aspect will show how a simple setup can derail sophisticated unmanned aerial vehicles. The security angle will be discussed through a novel countermeasure, which, via extremely fine-grained profiling and machine learning techniques, is able to thwart the latest ransomware attacks. Lastly, the privacy perspective will be presented through an innovative compression technique, realised on top of a video / image processing platform.

14/11/2018 11:30, Meyer 1061

Caribou: Smart Distributed Storage for the Datacenter

Zsolt Istvan

Assistant Research Professor at the IMDEA Software Institute Madrid, Spain

There is a widening gap in the datacenter between data growth and stagnating CPU performance. This gap limits our ability to solve more complex problems and prompts us to revisit both the architecture of servers and the ways we manage and process data. In my work, I aim to narrow this gap using specialization and HW/SW co-design. As a specific example, I will talk about building energy-efficient distributed storage for large-scale data processing applications. Most modern data-intensive applications are designed to run on tens or hundreds of nodes, often splitting them between “compute” and “storage” layers. This separation increases scalability but also introduces data movement bottlenecks. We show how these bottlenecks can be lifted without increasing the nodes’ power consumption by pushing down computation into storage nodes built using specialized hardware (FPGAs). Caribou implements data management, data processing, and replication for fault tolerance in a micro-server footprint. It delivers network-bound throughput and, even though it is based on specialized hardware, it can be efficiently shared by multiple tenants. Caribou is open-source and acts as a platform for exploring ideas related to, on the one hand, distributed data management using specialized hardware and, on the other hand, near-data processing in emerging data science workloads.

Bio:

Zsolt Istvan is an Assistant Research Professor at the IMDEA Software Institute in Madrid, Spain. Before that, he earned his PhD in the Systems Group at ETH Zurich, Switzerland, working with FPGAs and distributed storage. In his research, he explores ideas around specialization as a way of lifting bottlenecks in distributed systems and databases.

30/5/2018 11:30, Meyer 861

Space Bounds for Reliable Coded Storage, and Beyond

Alexander Spiegelman

EE Technion

The bulk of the talk will deal with space requirements of reliable storage algorithms in asynchronous distributed systems. A number of recent works have used codes in order to achieve a better storage cost than the well-known replication approach. However, a closer look reveals that they incur extra costs in certain scenarios. Specifically, if multiple clients access the storage concurrently, then existing asynchronous code-based algorithms may store a number of copies of the data that grows linearly with the number of concurrent clients. We will show that this is inherent. Given three parameters, (1) the data size – D bits, (2) the concurrency level – c, and (3) the number of storage node failures that need to be tolerated – f, we show a lower bound of Ω(min(f,c)·D) bits on the space complexity of asynchronous distributed storage algorithms. Intuitively, this implies that the asymptotic storage cost is either as high as with replication, namely O(f·D), or as high under concurrency as with the aforementioned code-based algorithms, i.e., O(c·D). We further present a technique for combining erasure codes with replication so as to obtain the best of both. We present an adaptive f-tolerant storage algorithm whose storage cost is O(min(f,c)·D). Together, the results show that the space complexity of providing reliable storage in asynchronous distributed systems is Θ(min(f,c)·D).

I will then briefly present some results on game-theoretic analysis of multi-miner, multi-blockchain settings. We consider a model with coins and rational miners, in which each coin divides a reward among the miners that mine for it. The payoff of each miner is proportional to the miner’s power, and miners are free to change coins in order to increase their payoffs. We show that every game in this setting has an equilibrium, and moreover, every better-response dynamic leads to an equilibrium. Then, we also give a reward mechanism design to move the system between equilibria.

Based on joint works with Yuval Cassuto, Gregory Chockler, Idit Keidar, and Moshe Tennenholtz.

Bio:
Alexander Spiegelman is a PhD candidate at the Electrical Engineering Faculty of the Technion, advised by Prof. Idit Keidar.

23/5/2018 11:30, Meyer 861

RISC-V – Why a new CPU Architecture?

Oded Lempel

Mellanox Technologies

Who needs a new Instruction Set Architecture (ISA)? Architectures have reached an unspoken truce through market-segment dominance (IA64 in the PC and server market, ARM in the mobile market). IA and ARM have ruled the PC/server and mobile markets, respectively, for years and have prevailed against aggressive competitive assaults. Some markets are still pursuing a standard (ARM and various DSPs in the IoT market; GPGPUs (Nvidia) and TPUs (Google) in AI/ML applications). The IoT and AI/ML segments have not yet matured, and no standard has been established; they are attracting a lot of innovation that may yield a dominant standard architecture. Where is there room for a new ISA? What would lead the industry and market to move to a new architecture? We will focus on RISC-V: what it brings that is not available in existing architectures, and in which markets this is enough to create an architecture change.

16/5/2018 11:30, Meyer 861

Security and Reliability Using the OS

Noam Shalev

EE Technion

Computer systems have developed tremendously over the past few years, and as a result new challenges arise in the fields of security and reliability. In this talk I will present two approaches that utilize the operating system to provide solutions to urgent challenges in these fields. First, I will speak about the Core Surprise Removal (CSR) algorithm, which explores the uncharted field of providing reliability in the face of hardware faults and makes a unique use of Hardware Transactional Memory. Next, I will introduce WatchIT, a new container-based approach for protecting organizations from privileged insider threats, such as the renowned Edward Snowden case. WatchIT leverages the operating system’s namespace subsystem and, together with a classification system and minor optional changes to the kernel, offers a comprehensive solution that was tested in a real organization.

Bio:
Noam Shalev is a PhD candidate at the Electrical Engineering Faculty of the Technion, advised by Prof. Idit Keidar.

9/5/2018 11:30, Meyer 861

But Why Does It Work? A “Rational Protocol Design” Treatment of Bitcoin

Juan Garay

Texas A&M University

As the first and most popular decentralized cryptocurrency to date, Bitcoin has ignited much excitement, not only for its novel realization of a central bank-free financial instrument, but also as an alternative approach to classical problems in distributed computing and cryptographic protocols, such as reaching consensus in the presence of misbehaving parties.

In this talk, after a brief introduction to the innovative and distributedly-maintained data structure known as the “blockchain,” we first present the first formalization of the Bitcoin core protocol, identifying its fundamental properties and showing how a distributed public ledger can be built “on top” of them. This rigorous cryptographic treatment shows that such a ledger is robust if and only if the majority of the mining power is honest. Which brings us to the second part of the talk: why, then, does Bitcoin work, given that the real-world system (the size of existing “mining pools,” in particular) does not necessarily adhere to this assumption? Using a mix of game theory and cryptography, a framework we call “Rational Protocol Design,” we show how natural incentives in combination with a high monetary value of Bitcoin can explain why Bitcoin continues to work even though majority coalitions are in fact possible.

This talk is based on joint works with C. Badertscher, A. Kiayias, N. Leonardos, U. Maurer, D. Tschudi and V. Zikas.

Bio:
Since the beginning of Fall ’17, Juan Garay has been a full professor in Texas A&M University’s Computer Science and Engineering Department. Previously, after receiving his PhD in Computer Science from Penn State, he was a postdoc at the Weizmann Institute of Science (Israel), and held research positions at the IBM T.J. Watson Research Center, Bell Labs, AT&T Labs-Research, and Yahoo Research. His research interests include both foundational and applied aspects of cryptography and information security. He has published extensively in the areas of cryptography, network security, distributed computing, and algorithms; has been involved in the design, analysis and implementation of a variety of secure systems; and is the recipient of over two dozen patents. Dr. Garay has served on the program committees of numerous conferences and international panels, including co-chairing Crypto 2013 and 2014, the discipline’s premier conference, and has just been appointed Fellow of the International Association for Cryptologic Research (IACR).

7/3/2018 11:30, 861

Cascading Denial-of-Service Attacks on Wi-Fi Networks

David Starobinski

Boston University

We unveil the existence of a vulnerability in Wi-Fi (802.11) networks, which allows an adversary to remotely launch a Denial-of-Service (DoS) attack that propagates both in time and space. This vulnerability stems from a coupling effect induced by hidden nodes. Cascading DoS attacks can congest an entire network and do not require the adversary to violate any protocol. We demonstrate the feasibility of such attacks through experiments with real Wi-Fi cards and extensive ns-3 simulations. To gain insight into the root causes of the attack, we model the network as a dynamical system and analyze its limiting behavior and stability. The model accurately predicts that a phase transition (and hence a cascading attack) is possible when the retry limit parameter of Wi-Fi is greater than or equal to 7. Last, we propose a counter-measure based on the idea of optimizing the duration of packet transmissions. Specifically, we show how to optimally set the packet duration so that, on the one hand, cascading DoS attacks are avoided and, on the other hand, throughput is maximized.

Joint work with Liangxiao Xin (Boston University) and Guevara Noubir (Northeastern University).

Bio:
David Starobinski is a Professor of Electrical and Computer Engineering, Systems Engineering, and Computer Science (by affiliation) at Boston University. He received his Ph.D. in Electrical Engineering from the Technion-Israel Institute of Technology in 1999. In 1999-2000, he was a post-doctoral researcher in the EECS department at UC Berkeley. In 2007-2008, he was an invited Professor at EPFL (Switzerland).

Dr. Starobinski received a CAREER award from the U.S. National Science Foundation (2002), an Early Career Principal Investigator (ECPI) award from the U.S. Department of Energy (2004), the 2010 BU ECE Faculty Teaching Award, and best paper awards at the WiOpt 2010 and IEEE CNS 2016 conferences. He is currently an Associate Editor of the IEEE Transactions on Information Forensics and Security and a Faculty Fellow at the U.S. DoT Volpe National Transportation Systems Center. His research interests are in wireless networking, network economics, and cybersecurity.

31/1/2018 11:30, Taub 301

Closing the Loop on Secure Operating System Design

Amit Levy

Stanford University

Secure system design should be guided by two principles: (1) system security should not impede third-party developers, who are often the main source of innovation, and (2) systems that secure third-party extensions also improve security by reducing the amount of specially-privileged first-party code.

Unfortunately, very few systems today adhere to these principles. This is not merely a result of poor system building. It is hard to design highly extensible systems that are both secure and useful. Moreover, the research community often fails to evaluate novel designs under real-world usage by actual practitioners. As a result, many promising research approaches remain difficult to adopt in practice.

I’ll describe Tock, an operating system for microcontrollers we designed with these principles in mind. I’ll discuss how we continuously evaluate Tock by engaging with practitioners, and how lessons from practitioners have fed back into the system’s design.

Bio:
Amit Levy is a PhD student at Stanford University, in the Stanford Secure Systems lab and the Stanford Information Networks Group. He holds M.Sc. degrees in Computer Science from Stanford and the University of Washington, and a B.Sc. in Computer Science and Economics from the University of Washington.

Amit works with David Mazieres in the Secure Computer Systems group and Phil Levis in the Stanford Information Networks Group. Amit’s research involves building secure operating systems, distributed security, and networks, often with the aid of programming languages. His work has been published in systems, security, and programming-language conferences including SOSP, OSDI, USENIX Security, and ICFP. Recently, he’s been working on a new secure operating system for low-memory microcontrollers, called Tock.

24/1/2018 11:30, Taub 301

Recent Developments in Linkography-Based Cyber Security

Robert Mitchell

Sandia National Laboratories

Cyber attacks on critical cyber systems are not decreasing in frequency or complexity. Aggressors choose the time and place of these engagements; protectors must identify, research and develop defensive techniques that provide an asymmetric advantage. A static, data-driven, preventative, automated defense is a losing strategy; an effective defense must be dynamic, behavioral, responsive, and capitalize on a human in the loop. We propose human- and machine-performed linkography to detect, correlate, attribute and predict attacker behavior and present a moving, deceptive target. Recently, our team generated a technology transfer strategy for linkography-based cyber security, proposed algorithms to extract and refine linkograph ontologies and to subsessionize our input stream, and completed our previous related machine learning work. Linkography has been in the literature for decades, and our investigation indicates it is an open, fertile topic for basic and applied cyber security research.

Bio:
Dr. Robert Mitchell is currently a member of technical staff at Sandia National Laboratories. He received his Ph.D., M.S., and B.S. from Virginia Tech. Robert served as a military officer for six years and has over 12 years of industry experience, having worked previously at Boeing, BAE Systems, Raytheon, and Nokia. His research interests include game theory, linkography, moving target defense, computer network operations, network security, intrusion detection, and cyber-physical systems. Robert has published 23 peer-reviewed articles.

17/1/2018 11:30, Meyer 861

Redesigning Bitcoin’s fee market

Ron Lavi

Dept. of Industrial Engineering and Management, Technion

The security of the Bitcoin system is based on having a large amount of computational power in the hands of honest miners. Such miners are incentivized to join the system and validate transactions by the payments issued by the protocol to anyone who creates blocks. As the creation rate of new bitcoins decreases (halving every 4 years), the revenue derived from transaction fees starts to play an increasingly important role. We argue that Bitcoin’s current fee market does not extract revenue well when blocks are not congested. This effect has implications for the scalability debate: revenue from transaction fees may decrease if the block size is increased.

The current mechanism is a “pay your bid” auction in which included transactions pay the amount they suggested. We propose two alternative auction mechanisms: the Monopolistic Price Mechanism and the Random Sampling Optimal Price Mechanism (due to Goldberg et al.). In the monopolistic price mechanism, the miner chooses the number of accepted transactions in the block, and all transactions pay exactly the smallest bid included in the block. The mechanism thus sets the block size dynamically (up to a bound required for fast block propagation and other security concerns). We show, using analysis and simulations, that this mechanism extracts revenue better from users, and that it is nearly incentive compatible: the profit due to strategic bidding relative to honest bidding decreases as the number of bidders grows. Users can then simply set their bids truthfully to exactly the amount they are willing to pay to transact, and do not need to utilize fee estimation mechanisms, do not resort to bid shading, and do not need to adjust transaction fees (via replace-by-fee mechanisms) if the mempool grows.
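As a quick illustration of the monopolistic price mechanism (a minimal sketch of our own; fee units and tie-breaking are arbitrary), the miner picks the cutoff that maximizes revenue when every included transaction pays the lowest included bid:

```python
def monopolistic_price(bids):
    """Hedged sketch: include the k highest bids, all paying the k-th
    highest bid, with k chosen to maximize miner revenue k * bids[k-1]."""
    bids = sorted(bids, reverse=True)
    k = max(range(1, len(bids) + 1), key=lambda k: k * bids[k - 1])
    return k, bids[k - 1]                 # (#included, uniform price per tx)

k, price = monopolistic_price([10, 6, 5, 5, 1])
print(k, price, k * price)                # 4 included at price 5, revenue 20
```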

We discuss these and other properties of our mechanisms, and explore various desired properties of fee market mechanisms for crypto-currencies.

Bio:
Ron Lavi is an associate professor at Technion’s Industrial Engineering and Management department. His research focuses on subjects on the border of economics and computation, mainly in algorithmic game theory, auction theory, and the efficient design of economic mechanisms. He has a PhD in computer science from the Hebrew University. In the past, he was a consultant and an academic visitor to Google, Microsoft research, and Yahoo! Labs.

10/1/2018 11:30, Meyer 861

Coded Caching for Content Distribution Networks

Barak Farbman

Dept. of Electrical Engineering, Technion

Caching is the most effective practice for reducing delays and congestion in networks aimed at content distribution. In this work we address the setup of multiple Internet service providers (ISPs) sharing a federated cache. When multiple ISPs with different demands share a cache, storage cost can be reduced when the cache is coded, that is, storing objects that are XORs of multiple content objects.

The talk will present our results on low-complexity methods for coded caching, where both computational complexity and communication cost are minimized. For the static-demand case we show how to lower the computational complexity to serve requests while guaranteeing optimal storage cost. For the dynamic-demand case (ISPs constantly changing their demands), we propose an architecture and algorithms to lower the communication cost required for updating the coded objects in the cache, while maintaining low storage cost.
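For intuition on why a coded cache saves storage (a toy sketch of our own, not the schemes from the talk), a single cached XOR can serve two ISPs whose demands differ, as long as each can obtain the other object:

```python
# Toy illustration: one coded slot (A xor B) replaces two plain slots.
def xor_bytes(x, y):
    """XOR two equal-length byte strings."""
    return bytes(a ^ b for a, b in zip(x, y))

A = b"object-A"
B = b"object-B"
coded = xor_bytes(A, B)           # the federated cache stores only this

assert xor_bytes(coded, B) == A   # an ISP holding B recovers A
assert xor_bytes(coded, A) == B   # an ISP holding A recovers B
```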

Bio:
Barak Farbman is an M.Sc. student in the Electrical Engineering Department of the Technion, under the supervision of Prof. Yuval Cassuto. Barak received B.Sc. degrees (cum laude) in Electrical Engineering and in Physics from the Technion in 2014. He is also an alum of the EMET program of the Technion’s EE Department.

3/1/2018 11:30, Meyer 861

Side Channel Attacks on Implementations of Curve25519

Daniel Genkin and Yuval Yarom

University of Pennsylvania and University of Maryland; University of Adelaide

In recent years, applications increasingly adopt security primitives designed from the start with built-in side channel protection. A concrete example is Curve25519, which has been recently standardized in RFC-7748. Doing away with obvious leakage sources such as key-dependent branches and memory accesses, RFC-7748 dictates that implementations should use a highly regular Montgomery ladder scalar-by-point multiplication, a unified, branchless double-and-add formula, and a constant-time argument swap within the ladder. Moreover, as Curve25519 provides innate protection from small subgroup attacks, it is recommended that implementations do not validate the inputs, completing the Diffie-Hellman protocol irrespective of the input’s mathematical properties.
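To illustrate the constant-time argument swap that the RFC mandates (a hedged sketch of our own; Python integers are arbitrary-precision, so this only demonstrates the branchless pattern real implementations apply to fixed-width words):

```python
def ct_cswap(swap, a, b):
    """Branchless conditional swap; `swap` must be 0 or 1.

    The same instructions execute regardless of the secret bit: a mask
    of all zeros or all ones selects whether the XOR actually swaps."""
    mask = -swap                 # 0 -> 0...0, 1 -> 1...1 (two's complement)
    dummy = mask & (a ^ b)
    return a ^ dummy, b ^ dummy

assert ct_cswap(0, 3, 5) == (3, 5)
assert ct_cswap(1, 3, 5) == (5, 3)
```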

In this talk we demonstrate that the failure to perform input validation, combined with the unique mathematical structure of Curve25519, can be used to amplify side channel leakage from several RFC-7748 compliant implementations. As case studies, we investigate two RFC-7748 compliant implementations: libgcrypt and the reference implementation provided as part of the NaCl library. We show effective attacks on both of these implementations, recovering secret key material from just a single leakage trace.

Joint work with Daniel Genkin, Niels Samwel, Luke Valenta, and Yuval Yarom.

31/12/2017 13:30, Meyer 1061

A Fast Interface for SGX Secure Enclaves

Ofir Weisse

University of Michigan

[Note unusual time and date – the talk is on a Sunday, 13:30!]

Secure execution technologies, such as Intel SGX, offer an attractive solution for protecting one’s private data in the public cloud environment. In this talk, we will explore how SGX mitigates various attack surfaces and what the caveats of naively using the technology to protect applications are. Specifically, we will discuss the performance implications of SGX on common applications and identify the new bottlenecks SGX creates, which may lead to a 5x performance degradation. We then describe an optimization mechanism, HotCalls, that provides a 13-27x speedup compared to the built-in mechanism supplied by the SGX SDK. Overcoming the performance bottlenecks is not enough to construct a useful secure distributed system. We will talk about the missing pieces in SGX for managing multiple entities securely, and how we can fill in the gap.
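
The HotCalls idea is to replace expensive enclave transitions with requests posted to untrusted shared memory and served by a busy-polling worker thread. The toy Python sketch below shows only this request/serve pattern; the real mechanism uses spinlocks on memory shared between an enclave and an untrusted thread, none of which exists here:

    import threading, queue

    requests = queue.Queue()

    def responder():
        """Untrusted worker standing in for the polling thread."""
        while True:
            func, args, done = requests.get()
            done["result"] = func(*args)
            done["event"].set()

    threading.Thread(target=responder, daemon=True).start()

    def hot_call(func, *args):
        """Post a request and wait, instead of an expensive OCALL transition."""
        done = {"event": threading.Event()}
        requests.put((func, args, done))
        done["event"].wait()
        return done["result"]

    print(hot_call(len, "/tmp/file"))  # -> 9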

Bio:
Ofir is a Ph.D. candidate at the University of Michigan. His current research focuses on the feasibility of secure execution in the cloud: enabling low-cost security in the cloud environment, without compromising performance. His recent publications include HotCalls (ISCA 2017) and WALNUT (EuroS&P 2017). Ofir worked for Intel in Haifa as a security researcher in the SGX group. He received his Master’s in Computer Engineering from Tel-Aviv University and his B.Sc. from the Technion. His previous research focused on differential power analysis of cryptographic devices, and was published in CHES and HASP.

27/12/2017 11:30, Taub 301

Online Detection of Effectively Callback Free Objects with Applications to Smart Contracts

Guy Golan-Gueta

VMware Research

Callbacks are essential in many programming environments, but drastically complicate program understanding and reasoning because they allow external objects to mutate an object’s local state in unexpected ways, thus breaking modularity. The famous DAO bug in the Ethereum cryptocurrency framework was exploited via callbacks to steal $150M. We define the notion of Effectively Callback Free (ECF) objects in order to allow callbacks without preventing modular reasoning.

An object is ECF in a given execution trace if there exists an equivalent execution trace without callbacks to this object. An object is ECF if it is ECF in every possible execution trace. We study the decidability of dynamically checking ECF in a given execution trace and statically checking if an object is ECF. We also show that dynamically checking ECF in Ethereum is efficient and can be done online. By running the history of all execution traces in Ethereum, we were able to verify that virtually all existing contract executions, excluding the DAO or contracts with similar known vulnerabilities, are ECF. Finally, we show that ECF, whether it is verified dynamically or statically, enables modular reasoning about objects with encapsulated state.
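
To see why callbacks defeat modular reasoning, consider a DAO-style reentrancy in miniature (a made-up Python toy, not the paper’s formalism). The trace below has no equivalent callback-free trace, so the object is not ECF:

    class Bank:
        """Toy contract: the external call runs before the state update."""
        def __init__(self):
            self.balances = {"attacker": 100}

        def withdraw(self, who, callback):
            amount = self.balances[who]
            if amount > 0:
                callback(amount)         # external call *before* the update
                self.balances[who] = 0   # too late: the callback re-entered

    bank = Bank()
    stolen = []

    def evil_callback(amount):
        stolen.append(amount)
        if len(stolen) < 3:              # re-enter while the balance is positive
            bank.withdraw("attacker", evil_callback)

    bank.withdraw("attacker", evil_callback)
    print(sum(stolen))  # 300 drained from a 100-unit balance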

This is joint work with Shelly Grossman, Ittai Abraham, Yan Michalevsky, Noam Rinetzky, Mooly Sagiv and Yoni Zohar. The full paper will appear in POPL ’18.

Bio:
Guy Gueta is a Senior Researcher at VMware Research Group. His research interests include distributed systems, synchronization techniques, and programming languages. Recently, he has focused on consensus algorithms and execution models for Blockchain technologies.
Prior to VMware, Guy was a Researcher at Yahoo Labs, and a software architect in several large software projects.
Guy received a Ph.D. degree in Computer Science from Tel Aviv University in 2015.

01/11/2017 11:30, Meyer 861

A 1,000x Improvement in Computer Systems using Current Fabs and Process

Zvi Or-Bach

MonolithIC 3D Inc.

For over four decades, the gap between computer processing speed and memory access time has grown at about 50% per year, to more than 1,000x today. Bridging this gap presents an excellent opportunity to enhance single-core system performance. An innovative 3D integration technology, combined with re-architecting the integrated memory device, is proposed to bridge the gap and enable a 1,000x improvement in computer systems. The proposed technology utilizes processes that are widely available and could be integrated in products within a very short time.

Bio:
Or-Bach is a world-recognized expert in monolithic 3D technologies with over 100 patents in the space, the chairman of the 3D track of the IEEE S3S Conference, and is active as an invited speaker and tutorial instructor in the US, Korea, and Japan. He has extensive management experience, including being CEO and in charge of R&D, sales, marketing, business development and other corporate functions for over 40 years. Or-Bach has been an active board member of technology companies for over 20 years and is currently Chairman of the Board for Zeno Semiconductors and VisuMenu. Or-Bach has a history of innovative development in fast-turn ASICs spanning over 20 years. His vision led to the invention of the first Structured ASIC architecture, the first single-via programmable array, and the first laser-based system for one-day Gate Array customization.
Prior to MonolithIC 3D, Or-Bach founded eASIC in 1999 and served as the company’s CEO for six years. eASIC was funded by leading investors Vinod Khosla and KPCB in three successive rounds. Under Or-Bach’s leadership, eASIC won the prestigious EETimes’ 2005 ACE Award for Ultimate Product of the year in the Logic and Programmable Logic category and the Innovator of the Year Award and was selected by EE Times to be part of the “Disruptors – The people, products and technologies that are changing the way we live, work and play.” Earlier, Or-Bach founded Chip Express in 1989 (recently acquired by Gigoptix) and served as the company’s president and CEO for almost 10 years, bringing the company to $40M revenue and recognition for four consecutive years as a high-tech Fast 50 Company.
Zvi Or-Bach received his BSc degree (1975) cum laude in electrical engineering from the Technion – Israel Institute of Technology, and MSc (1979) with distinction in computer science, from the Weizmann Institute, Israel. He holds over 180 issued patents, primarily in the field of 3D integrated circuits and semi-custom chip architectures.

25/10/2017 11:30, Meyer 1061

Secure Execution on Untrusted Platforms

Ofir Shwartz

Dept. of Electrical Engineering, Technion

Remote computing services (e.g., virtualization and cloud services) offer advantages to organizations and individuals, putting at their disposal enormous computing resources while permitting them to pay only for the resources actually used. Unfortunately, such environments are prone to attacks by hackers, adversarial users of the systems, or even the owner of the service. Such attacks may target the operating system, hypervisor, VMM, or even the hardware itself. It would therefore be extremely beneficial if one could ensure the security of programs in such environments, as this would likely lead to a dramatic expansion of their use for applications ranging from research, through financial, to medical systems. Specifically, the confidentiality of the code and data must be preserved, and tampering with them or with the sequence of execution must be detected.
Although prior art suggested various ideas and architectures, they lack key features needed to become practical, such as support for existing applications, security without harming performance, and scalability to many compute nodes.
In this work we present the Secure Machine (SeM), a CPU architecture extension for secure computing that supports a vast range of systems, from single-core compute nodes to parallel and distributed computing environments. Using static binary instrumentation, SeM supports existing applications (binaries). SeM’s performance overhead for the added security features is negligible. We therefore consider SeM a major step towards practical secure computing.

09/08/2017 11:30, Meyer 861

Enabling Secure Distributed Storage and Private Information Retrieval in the Presence of Adversaries with Side Information

Alex Sprintson

Electrical and Computer Engineering, Texas A&M University

The talk includes two parts. First, we consider the problem of designing repair-efficient distributed storage systems that are secure against a passive eavesdropper that can observe the contents of any subset of nodes of limited size. We present a universal framework for the design of coset-coding based outer codes that enables information-theoretic security properties for a broad class of storage codes. As case studies, we consider minimum storage regenerating codes with small repair degree, maximally recoverable codes, and locally repairable codes.

In the second part of the talk, we consider the problem of Private Information Retrieval (PIR) in the presence of an adversary with prior side information. This problem is motivated by practical settings in which the user can obtain side information opportunistically from other users or has previously downloaded some messages using classical PIR schemes. Our objective is to enable the user to retrieve the required message without revealing its identity while minimizing the amount of data downloaded from the servers. We establish lower bounds on the minimum rate of a data retrieval scheme for the single-server scenario and present optimal retrieval schemes that satisfy different privacy constraints.
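
A common way to exploit side information in such schemes is a partition-and-code style query: hide the demanded index in a group whose remaining members are the user’s side information, and download only the XOR of that group. The sketch below illustrates this idea in simplified form (not necessarily the talk’s exact construction); real schemes must also argue that the query’s distribution reveals nothing about the demand:

    import random

    random.seed(1)
    messages = [random.getrandbits(32) for _ in range(10)]  # server holds K = 10
    demand, side_info = 3, [6, 8]                           # user knows M = 2

    # Query: the demanded index hidden among the side information.
    query = sorted([demand] + side_info)

    # Server answer: XOR of the requested messages; the server cannot tell
    # which member of the group is the real demand.
    answer = 0
    for i in query:
        answer ^= messages[i]

    # Decoding: the user peels off the messages it already holds.
    decoded = answer
    for i in side_info:
        decoded ^= messages[i]
    assert decoded == messages[demand]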

Joint work with Swanand Kadhe, Brenden Garcia, Anoosheh Heidarzadeh, and Salim El Rouayheb

Bio:
Dr. Sprintson is a faculty member in the Department of Electrical and Computer Engineering, Texas A&M University, College Station. He received his B.S. degree (summa cum laude), M.S., and Ph.D. degrees in electrical engineering from the Technion in 1995, 2001, and 2003, respectively. From 2003 to 2005, he was a Postdoctoral Research Fellow with the California Institute of Technology, Pasadena. His research interests lie in the general area of communication networks with a focus on wireless network coding, distributed storage, and software defined networks. Dr. Sprintson received the Wolf Award for Distinguished Ph.D. students, the Viterbi Postdoctoral Fellowship, the TAMU College of Engineering Outstanding Contribution Award, and the NSF CAREER award. He serves as an Associate Editor of the IEEE Transactions on Wireless Communications. He has been a member of the Technical Program Committee for IEEE Infocom 2006–2017.

28/06/2017 11:30, Meyer 861

Securing Internet Routing from the Ground Up

Michael Schapira

CS, Hebrew University of Jerusalem

The Internet’s communication infrastructure (TCP/IP, DNS, BGP, etc.) is alarmingly insecure, as evidenced by many high-profile incidents. I will illustrate the challenges en route to securing the Internet, and how these can be overcome, by focusing on the Internet’s, arguably, biggest security hole: the vulnerability of Internet routing to traffic hijacking attacks.

Bio:
Michael Schapira is an associate professor at the School of Computer Science and Engineering, the Hebrew University of Jerusalem. He is also the scientific co-leader of the Fraunhofer Cybersecurity Center at Hebrew University. His research interests focus on the protocols/mechanisms that make the Internet tick (e.g., for routing, traffic management). He is interested in the design and analysis of practical (Inter)network architectures and protocols with provable guarantees (failure-resilience, optimality, security, incentive compatibility, and beyond). He is also broadly interested in the interface of computer science, economics, and game theory (e.g., dynamic interactions in economic and computational environments, incentive-compatible computation, computational auctions, and more).

20/06/2017 11:30, Meyer 1007

Finding the Next Curves: Towards a Scalable Future for Specialized Architectures

Adi Fuchs

EE, Princeton

The end of CMOS transistor scaling marks a new era for modern computer systems. As the gains from traditional general-purpose processors diminish, researchers are exploring new avenues in domain-specific computing. The premise of domain-specific computing is that by co-optimizing software and specialized hardware accelerators, it is possible to achieve higher performance per watt.
In contrast to technology scaling, specialization gains are not sustainably scalable, as there is a limit to the number of ways to map a computation problem to hardware under a fixed budget. Since hardware accelerators are also implemented using CMOS transistors, the gains from specialization will also diminish in the more distant future, giving rise to a “specialization wall”.
We explore the integration of emerging memory technologies with specialized hardware accelerators to eliminate redundant computations and essentially trade non-scaling CMOS transistors for scaling memory technology. We evaluated our new architecture on different data center workloads. Our results show that, compared to highly optimized accelerators, we achieve an average speedup of 3.5x, save on average 43% in energy, and save on average 57% in energy-delay product.
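
The underlying principle of trading computation for memory can be illustrated in software by memoization: results computed once are thereafter served from a lookup table. This is only an illustration of the principle; the talk realizes it in hardware with emerging memory technologies:

    from functools import lru_cache

    calls = 0

    @lru_cache(maxsize=None)
    def expensive_kernel(x: int) -> int:
        """Stand-in for a heavy accelerator computation."""
        global calls
        calls += 1
        return sum(i * i for i in range(x))

    for query in [1000, 2000, 1000, 1000, 2000]:
        expensive_kernel(query)

    print(calls)  # 2 -- three of the five invocations were redundant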

Bio:
Adi Fuchs is a PhD Candidate in the EE Department at Princeton University working on novel computer architectures. His research explores the integration of new scaling technologies with existing computer systems and domain-specific accelerators. He earned his BSc and MSc degrees, cum laude and summa cum laude, respectively, both from the EE Department at the Technion – Israel Institute of Technology.

15/06/2017 11:30, Meyer 1003

Memcomputing: a brain-inspired topological computing paradigm

Massimiliano Di Ventra

University of California San Diego

Which features make the brain such a powerful and energy-efficient computing machine? Can we reproduce them in the solid state, and if so, what type of computing paradigm would we obtain? I will show that a machine that uses memory to both process and store information, like our brain, and is endowed with intrinsic parallelism and information overhead – namely, takes advantage, via its collective state, of the network topology related to the problem – has a computational power far beyond our standard digital computers. We have named this novel computing paradigm “memcomputing”. As an example, I will show the polynomial-time solution of prime factorization, the NP-hard version of the subset-sum problem, and Max-SAT using polynomial resources and self-organizing logic gates, namely gates that self-organize to satisfy their logical proposition. I will also demonstrate that these machines are described by a topological field theory and that they compute via an instantonic phase, implying that they are robust against noise and disorder. The digital memcomputing machines that we propose can also be efficiently simulated, are scalable, can be easily realized with available nanotechnology components, and may help reveal aspects of computation in the brain.

Bio:
Massimiliano Di Ventra obtained his undergraduate degree in Physics summa cum laude from the University of Trieste (Italy) in 1991 and did his PhD studies at the Ecole Polytechnique Federale de Lausanne (Switzerland) in 1993-1997. He has been Visiting Scientist at IBM T.J. Watson Research Center and Research Assistant Professor at Vanderbilt University before joining the Physics Department of Virginia Tech in 2000 as Assistant Professor. He was promoted to Associate Professor in 2003 and moved to the Physics Department of the University of California, San Diego, in 2004 where he was promoted to Full Professor in 2006.
Di Ventra’s research interests are in the theory of electronic and transport properties of nanoscale systems, non-equilibrium statistical mechanics, DNA sequencing/polymer dynamics in nanopores, and memory effects in nanostructures for applications in unconventional computing and biophysics.
He has been invited to deliver more than 250 talks worldwide on these topics (including 10 plenary/keynote presentations, 9 talks at the March Meeting of the American Physical Society, 5 at the Materials Research Society, 2 at the American Chemical Society, and 2 at the SPIE). He has been Visiting Professor at the Technical University of Dresden (2015), University Paris-Sud (2015), Technical University of Denmark (2014), Ben-Gurion University (2013), Scuola Normale Superiore di Pisa (2012, 2011), and SISSA, Trieste (2012). He serves on the editorial board of several scientific journals and has won numerous awards and honors, including the NSF Early CAREER Award, the Ralph E. Powe Junior Faculty Enhancement Award, fellowship in the Institute of Physics and the American Physical Society.
Di Ventra has published more than 200 papers in refereed journals, co-edited the textbook Introduction to Nanoscale Science and Technology (Springer, 2004) for undergraduate students, and he is single author of the graduate-level textbook Electrical Transport in Nanoscale Systems (Cambridge University Press, 2008).

07/06/2017 11:30, Taub 301

Revisiting Email Search

Ariel Raviv

Yahoo Research

With the rapid growth in machine-generated traffic and in storage capacities offered by Web mail services, our mailboxes get larger and larger. Search therefore becomes a critical mechanism for retrieving the specific messages we need. Unfortunately, during the past decade mail search has not observed the same pace of progress as Web search and is still perceived as a difficult and frustrating task. Only recently has mail search become an emerging topic for research, introducing new features and applications. In this talk, I will present current studies conducted over Yahoo Mail data, highlighting the characteristics of mail searchers and the types of queries they issue. I will then compare our findings to those in other domains, such as Web and Social Media, and present recent works in search results ranking as well as in query suggestion, published in top-tier data-science conferences.

Bio:
Ariel Raviv is a Senior Research Engineer at Yahoo Research, where he engages in mail search activities such as relevance ranking, top results selection, and query suggestion. Before joining Yahoo, Ariel was a Research Engineer at IBM Research in Israel, where he worked on recommendation systems, large-scale crawling, machine learning, search and classification tasks, as a member of the Information Retrieval and Social Technologies group. He received his B.Sc. and M.Sc. from the Dept. of Computer Science, Technion – Israel Institute of Technology, Haifa, where he conducted research under the supervision of Prof. Shaul Markovitch in the field of Artificial Intelligence and Natural Language Processing. Ariel is passionate about data mining and information retrieval, and has several related patents as well as several publications in leading conferences including AAAI, WWW, SIGIR, CIKM and RecSys.

24/05/2017 11:30, Taub 301

cuSTINGER – A Sparse Dynamic Graph and Matrix Data Structure

Oded Green

Georgia Institute of Technology

Sparse data computations are ubiquitous in science and engineering. Two widely used applications requiring sparse data computations are graph algorithms and linear algebra operations such as Sparse Matrix-Vector Multiplication (SpMV). In contrast to their dense data counterparts, sparse-data computations have less locality and more irregularity in their execution – making them significantly more challenging to optimize. While there are several existing formats for sparse data representations, most of these formats are restricted to static data sets. This means that these representations and their implementations do not support changes to the original data. For graphs, this means that topology changes are not possible. For matrices, this means adding a perturbation to the original input is non-trivial.

In today’s talk, I will present cuSTINGER – a dynamic graph data structure for the GPU. This talk will partially focus on the internals of the data structure and some of its design considerations. cuSTINGER will be compared with Compressed Sparse Row (CSR) – a widely used static graph and matrix data representation for sparse inputs. This comparison will include a performance analysis of the data structure itself as well as higher-level data analytics implemented using the data structure.
I will show that the new data structure has relatively low overhead in comparison to CSR. Yet unlike CSR, our new data structure is flexible and can grow indefinitely. I will then show that the data structure can process tens of millions of updates per second – a rate that exceeds the needs of many real-world applications, making it extremely useful.

In the second part of my talk, I will elaborate on the programming model for cuSTINGER. My research team’s overarching goal is to enable simple implementation of high-performance, scalable, dynamic graph analytics on the GPU for all users – including non-GPU experts. To limit the amount of GPU code that users need to write by themselves, cuSTINGER offers a small set of optimized operators for traversing graph vertices and edges. An additional design goal is to ensure that static graph algorithms, such as breadth-first search and connected components, will also perform well in this library. I will show how these algorithms can be implemented in fewer than 15 lines of code through the use of these operators. I will then compare the performance of our cuSTINGER implementations with the static graph implementations in Gunrock – a highly tuned GPU graph library. Our current implementations are usually up to 30% slower or 20% faster than Gunrock, depending on the input and algorithm. Finally, I will show some performance results for a novel triangle counting algorithm for dynamic graphs which sustains millions of updates per second.
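
For intuition, here is a CPU-side Python analogue of the contrast between a static CSR layout and a growable structure: edge insertions append to per-vertex lists (where CSR would require a rebuild), while a static algorithm such as breadth-first search still runs over the same structure. This is only an analogue of the design, not cuSTINGER’s GPU internals:

    class DynamicGraph:
        """Growable adjacency lists -- the property CSR lacks."""
        def __init__(self, num_vertices):
            self.adj = [[] for _ in range(num_vertices)]

        def insert_edge(self, u, v):
            self.adj[u].append(v)  # CSR would need a full rebuild here

        def bfs(self, source):
            """Static algorithms must still perform well on the structure."""
            dist, frontier = {source: 0}, [source]
            while frontier:
                nxt = []
                for u in frontier:
                    for v in self.adj[u]:
                        if v not in dist:
                            dist[v] = dist[u] + 1
                            nxt.append(v)
                frontier = nxt
            return dist

    g = DynamicGraph(5)
    for u, v in [(0, 1), (1, 2), (2, 3), (0, 4)]:
        g.insert_edge(u, v)
    print(g.bfs(0))  # {0: 0, 1: 1, 4: 1, 2: 2, 3: 3}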

Bio:
Oded Green is a research scientist at the Georgia Institute of Technology (Georgia Tech) in Computational Sciences and Engineering, where he also received his PhD. Oded received both his MSc in electrical engineering and his BSc in computer engineering from Technion – Israel Institute of Technology. Prior to coming to Georgia Tech, Oded spent some time working as the Chief Operating Officer and a research scientist for ArrayFire (Atlanta, GA) where he was responsible for the company’s product management and research directions.

Oded’s research primarily focuses on improving the performance and scalability of large-scale graph analytics on dynamic graphs, using a wide range of high-performance computational platforms and by designing efficient algorithms and data structures.

17/05/2017 11:30, Meyer 861

Moore with Less: Specializing Cores for the Cloud

Boris Grot

University of Edinburgh

Big data is revolutionizing the way we live, work, and socialize. This revolution is powered by datacenters built with commodity hardware technologies that are now running out of steam. In this talk, I will outline the disruptive challenges facing future computing systems and will argue for specializing server hardware as the only way forward. The principal challenge for specialization lies in providing common-case efficiency benefits without losing the generality necessary to accommodate both legacy and future software needs in cloud infrastructures. As a step toward meeting this challenge, I will present our latest work on specializing server cores to meet both single-thread performance demands and maximize throughput under QoS constraints for collocated workloads.

Bio:
Boris Grot is an Assistant Professor in the School of Informatics at the University of Edinburgh. His research seeks to address efficiency bottlenecks and capability shortcomings of processing platforms for big data. He is a MICRO Hall of Fame inductee and a recipient of the Google Faculty Research Award. Grot received his PhD in Computer Science from The University of Texas at Austin and spent two years as a post-doctoral fellow at the Parallel Systems Architecture Lab at EPFL.

10/05/2017 11:30, Taub 301

The Challenges of Mining Machine-Generated Web Mail

Liane Lewin-Eytan

Yahoo Research

In the last decade, Web mail traffic has evolved, very much like regular snail mail, into being dominated by machine-generated messages. Some recent studies have verified that more than 90% of non-spam Web email is indeed generated by automated scripts. Although generated by machines, a large part of these messages include highly personal information, e.g. bank statements, travel plans, or shipment notifications. In this presentation, we will provide an overview on how machine generated traffic has changed the nature of Web Mail. Then, we will focus on the critical issue of privacy, which arises in various Web mail debugging and quality assurance activities applied to machine-generated messages.
Specifically, we study the problem of k-anonymization of mail messages in the realistic scenario of auditing mail traffic in a commercial Web mail service. Mail auditing is conducted by trained professionals, often referred to as “auditors”, who are shown messages that could expose personally identifiable information. We address the challenge of k-anonymizing such messages, focusing on machine generated traffic. We describe the model and process we applied over actual Yahoo mail traffic, and demonstrate that our methods are feasible at Web mail scale. Given the constantly growing concern of users over their email being scanned by others, we argue that it is critical to devise algorithms that guarantee k-anonymity, and to implement associated processes in order to restore the trust of mail users.
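
As a toy illustration of the k-anonymization task (a made-up heuristic, not Yahoo’s production method): reduce each message to a template by masking personal-looking tokens, and expose a message to auditors only if at least k distinct messages share its template.

    from collections import Counter

    K = 2  # anonymity threshold

    def template(msg):
        """Mask tokens that look personal (toy rule: digits or e-mail)."""
        return " ".join("*" if any(c.isdigit() for c in t) or "@" in t else t
                        for t in msg.split())

    msgs = [
        "Your order 1234 has shipped to alice@example.com",
        "Your order 9876 has shipped to bob@example.com",
        "Private note about the merger",
    ]

    counts = Counter(template(m) for m in msgs)
    for m in msgs:
        t = template(m)
        print(t if counts[t] >= K else "<suppressed>")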

Bio:
Liane Lewin-Eytan is a Director at Yahoo Research, managing the Mail Mining Research team since 2014, with the mission of analyzing and modeling mail data, to get new insights and devise novel mail features and applications. Prior to this, during the years 2009-2013, Liane was a Research Staff Member at IBM Research in Israel, where she worked on smart grid networks, cloud networking and network virtualization. She received her B.Sc, M.Sc, and Ph.D. from the Dept. of Electrical Engineering, Technion – Israel Institute of Technology, Haifa in 1999, 2001 and 2008, respectively. Liane has served as PC member in several leading conferences during the past years including WSDM and CIKM, and has over 30 publications and US patents. Her publications during the last years relate to various fields in the Web Mail domain, including mail data clustering and classification, mail data anonymization, mail smart actions and mail search.

19/04/2017 11:30, Taub 301

Omid: Low latency, Scalable and Highly-Available Transactions in Distributed Data Storage

Ohad Shacham

Yahoo Research

We present Omid – a low-latency transaction processing service that powers web-scale production systems at Yahoo. Omid provides ACID transaction semantics on top of traditional key-value storage; its implementation over Apache HBase is open-sourced as part of the Apache Incubator. Omid can serve hundreds of thousands of transactions per second on standard mid-range hardware, while incurring minimal impact on the speed of data access in the underlying key-value store. Additionally, as expected from always-on production services, Omid is highly available. Omid is currently being integrated as the Apache Phoenix transaction engine.
This talk combines several incarnations of Omid. The people who contributed to all or part of the presented material are: Aran Bergman, Edward Bortnikov, Yoni Gottesman, Eshcar Hillel, Igor Katkov, Idit Keidar, Ivan Kelly, Matthieu Morel, Sameer Paranjpye and Francisco Perez-Sorrosal.
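
For a feel of the guarantee involved, here is a toy snapshot-isolation commit check built around a central timestamp oracle – a deliberately simplified sketch in the spirit of Omid’s design, not its actual HBase-based implementation:

    import itertools

    class TransactionManager:
        """Toy snapshot isolation: abort a transaction if any key in its
        write set was committed after the transaction began."""
        def __init__(self):
            self.clock = itertools.count(1)
            self.last_commit = {}  # key -> commit timestamp

        def begin(self):
            return next(self.clock)

        def commit(self, begin_ts, write_set):
            if any(self.last_commit.get(k, 0) > begin_ts for k in write_set):
                return None  # write-write conflict: abort
            commit_ts = next(self.clock)
            for k in write_set:
                self.last_commit[k] = commit_ts
            return commit_ts

    tm = TransactionManager()
    t1, t2 = tm.begin(), tm.begin()
    assert tm.commit(t1, {"row1"}) is not None  # t1 commits first
    assert tm.commit(t2, {"row1"}) is None      # t2 conflicts on row1, aborts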

Bio:
Ohad Shacham is Senior Research Scientist at Yahoo. He works on scalable big data and search platforms. Most recently, he focused on extending the Omid transaction processing system with high availability and scalability features. Ohad received his PhD in concurrent software verification from the Computer Science department at Tel-Aviv University, under the supervision of Mooly Sagiv and Eran Yahav. Prior to Yahoo, Ohad was a Research Staff Member at IBM Research and a Software Engineer at Intel.

05/04/2017 11:30, Meyer 1061

On Routing Games and Net Neutrality

Ziv Berkovich

EE, Technion

Net neutrality is getting major attention these days, as it sits at the crossroads of technology, economics and regulation. Discussions and new net neutrality rules and remedies are commonplace, all as part of the attempt to find a balance between the need to follow common-carriage principles and the ISPs’ desire to provide differentiated QoS by selectively blocking, slowing, or providing faster connection tracks to the web traffic of different customers with different applications.

The option of selectively blocking certain users can also be judged under social-welfare considerations. Knowing which restrictions reduce the total system delay (or cost), and which bounds lead to a deterioration of overall performance, can be a significant factor in the considerations of the network administrator (e.g., limiting trucks from using a certain road can benefit the highway system’s users and can be an acceptable action by the authorities). Restrictions can be derived in order to express QoS and can be imposed on different groups of players each time.

In this talk, our goal is to analyze the effect of constraints on the network users in order to explore the potential improvement in the system’s/users’ routing delay (cost) and to understand our limits in trying to benefit overall system welfare. First, we will introduce the conceptual framework under which we formulate our research (nonatomic/atomic splittable routing games on a parallel-links system). Then, we will show the users’ (and the system’s) behavior under predefined restrictions that limit a group of users to a subgraph of allowed links (for several performance functions and different types of restrictions). We will reach conclusions on whether such restrictions can or cannot improve the overall system cost. This analysis provides a convenient, measurable tool for improving the system cost by limiting some system players to specific sets of links.
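
The flavor of such questions is captured by Pigou’s classic two-link example – a standard textbook instance of nonatomic routing on parallel links, used here only for intuition. Selfish users all take the load-dependent link, while restricting half of them to the constant-delay link improves the total delay:

    def total_delay(x):
        """x = fraction of traffic on the link with delay x; the other
        link has constant delay 1."""
        return x * x + (1 - x) * 1

    # Equilibrium: the variable link always has delay <= 1, so all traffic
    # uses it (x = 1). A planner restricting half the users attains x = 1/2.
    print(total_delay(1.0))  # -> 1.0  at the selfish equilibrium
    print(total_delay(0.5))  # -> 0.75 at the optimum (price of anarchy 4/3)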

29/03/2017 11:30, Meyer 861

Revisiting Virtual Caches

Guri Sohi

University of Wisconsin-Madison

Virtual caches have been around for several decades. They have several advantages in performance and energy efficiency, but have not been used in ubiquitous commercial designs because of problems due to synonyms. To revisit the problem and come up with a practical design, we start with a study of the temporal behavior characteristics of synonyms in several benchmark programs. Exploiting these characteristics, we propose a practical virtual cache design with dynamic synonym remapping (VC-DSR) and then evaluate its effectiveness in a CPU context. We also study the application and effectiveness of virtual caches in a GPU context. In this talk we will present the empirical observations that underlie the proposal for VC-DSR, and present the design and its effectiveness in a general CPU context. We also present a design to use virtual caches (both L1 and L2) in an integrated GPU context and show how it can significantly reduce the overheads of address translation.

Bio:
Guri Sohi has been a faculty member at the University of Wisconsin-Madison since 1985, where he is currently a John P. Morgridge Professor and a Vilas Research Professor. Sohi’s research has been in the design of high-performance microprocessors and computer systems. Results from his research can be found in almost every high-end microprocessor in the market today. He has received the 1999 ACM SIGARCH Maurice Wilkes Award, the 2011 ACM/IEEE Eckert-Mauchly Award, and the 2016 IEEE B. Ramakrishna Rau Award. At the University of Wisconsin he was selected as a Vilas Associate in 1997, awarded a WARF Kellett Mid-Career Faculty Researcher award in 2000, a WARF Named Professorship in 2007, and a Vilas Research Professorship in 2015. He is a Fellow of both the ACM and the IEEE and is a member of the National Academy of Engineering.

22/03/2017 11:30, Taub 301

Making Network Functions Software-Defined

Yotam Harchol

VMware Research

OpenBox is a framework that makes network functions (NFs) software-defined by decoupling their control plane from their data plane, similarly to SDN solutions that only address the network’s forwarding plane. The OpenBox framework consists of a logically-centralized controller, data plane instances, and a communication protocol between them. User-defined NF applications are programmed on top of the northbound API of the controller. We present an extensible and highly modular SDN protocol that allows the controller to define packet processing goals in the data plane. Based on the observation that different NFs carry out similar processing steps, OpenBox reuses processing steps to reduce the overall processing, thereby improving performance and reducing resource consumption. OpenBox is designed to support both hardware and software packet processors, as well as smart NF placement, NF scaling, and multi-tenancy.

Bio:
Yotam Harchol is a Post-Doctoral researcher at the VMware Research Group. He completed his PhD studies at the Hebrew University of Jerusalem, Israel. His research is focused on improved performance and security of network functions and middleboxes.

18/01/2017 14:30, Taub 301

Leveraging RDMA for Strongly Consistent Replication at Large Scale

Ken Birman

Cornell University

My work focuses on ways of replicating data in demanding settings, most recently the cloud. The cloud is a setting where copying information and replicating data or computation is common, yet it remains difficult to actually create new applications that leverage replication. Moreover, the existing libraries are very slow.

I’ll present Derecho, a blazingly fast C++ library for creating scalable data replication solutions. Derecho has a remarkably simple API and very low overheads, shifting as much work as possible from runtime to compile time. We separate control plane from data plane, sending data on RDMA over a multicast overlay, then using an RDMA-based distributed shared memory to coordinate delivery so as to ensure total ordering, data persistency or other desired properties such as virtually synchronous membership management. The overall framework is minimalist, yet can support complex applications, including ones that have subgroup structures or that shard data within larger sets of nodes. Derecho consistency subsumes both Paxos and virtual synchrony multicast into a single model.
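
To make “total ordering” concrete, here is a toy sequencer-based total-order multicast in Python – a deliberately naive stand-in, since Derecho achieves the same delivery property over RDMA without any central sequencer:

    import itertools

    class Sequencer:
        """Stamps each message with a global sequence number."""
        def __init__(self):
            self.seq = itertools.count()

        def stamp(self, msg):
            return (next(self.seq), msg)

    class Replica:
        """Delivers messages strictly in stamp order, buffering gaps."""
        def __init__(self):
            self.next_expected, self.pending, self.log = 0, {}, []

        def receive(self, stamped):
            n, msg = stamped
            self.pending[n] = msg
            while self.next_expected in self.pending:
                self.log.append(self.pending.pop(self.next_expected))
                self.next_expected += 1

    seq, a, b = Sequencer(), Replica(), Replica()
    m0, m1 = seq.stamp("x=1"), seq.stamp("x=2")
    a.receive(m0); a.receive(m1)
    b.receive(m1); b.receive(m0)             # arrives out of order
    assert a.log == b.log == ["x=1", "x=2"]  # identical delivery order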

Performance is very satisfying: As an atomic multicast, Derecho is tens of thousands of times faster than commonly used libraries, and as Paxos, it beats prior solutions by large factors while scaling far better than any existing Paxos solution.

If time permits, I’ll also say a few words about proofs. Although the protocols used in Derecho are based on ones familiar from decades of work on state machine replication, albeit with some novel twists, the new monotonic formulation used to decouple the control plane from the data plane within Derecho is of deeper interest. This approach turns out to simplify our protocols, seems to support simpler proofs, and also highlights the ties between our work and older results on knowledge logics.

Bio:
Ken Birman has been a systems researcher and faculty member at Cornell University since getting his PhD from UC Berkeley in 1981. He is best known for work on virtually synchronous process group computing models (an early version of what has become known as the Paxos family of protocols), and his software has been widely used. The Isis Toolkit that Ken built ran the NYSE for more than a decade, and is still used to operate many aspects of the communication system in the French air traffic control system. A follow-on named Vsync is widely used in graduate courses that teach distributed computing. This talk is based on his newest system, called Derecho.

18/01/2017 11:30, Taub 401

Crowd Mining: a framework for mining the knowledge of web users

Yael Amsterdamer

Bar Ilan University

Crowd Mining is concerned with identifying significant patterns in the knowledge of the crowd, capturing, e.g., habits and preferences, by posing targeted questions to internet users. To account for jointly processing the crowd answers and available knowledge bases, and for user interaction and optimization issues, crowd mining frameworks must employ complex reasoning, automatic crowd task generation and crowd member selection. In this talk I will present the unique challenges in the crowd mining setting, and describe our solution in the form of an end-to-end system.

Bio:
Dr. Yael Amsterdamer is a senior lecturer at the Department of Computer Science, Bar Ilan University. Her research interests include the management of data obtained via Web-based platforms, such as crowdsourced data, online knowledge bases and social networks. Previously, Yael has been a visiting scholar of the Computer and Information Science Department at UPenn (Philadelphia, PA), and at the INRIA institute (Paris, France). She received her PhD from Tel Aviv University under the supervision of Prof. Tova Milo. In 2016 Yael received the Alon Fellowship for Outstanding Young Researchers. During her PhD, she won the Wolf Foundation Scholarship and the Dan David Prize for outstanding PhD students.

11/01/2017 11:30, Meyer 861

Distributed and Privacy Preserving Planning

Ronen Brafman

Ben-Gurion University

Classical AI planning is concerned with the following problem: given a deterministic system, an initial system state, and a goal condition, find a sequence of actions that transforms the system from its initial state to a state that satisfies the goal condition. It was originally conceived in order to make robots autonomous, and has numerous applications. A simple and natural extension of classical planning is one where there are multiple agents, each with its own set of actions, that cooperatively try to transform the system into the goal state. In this talk I will discuss this model, two general techniques for solving it in a distributed setting, how these ideas can help us formulate algorithms that allow agents to cooperate while maintaining the privacy of certain information, and how the multi-agent model can help us solve some single-agent problems efficiently.

Parts of this work are joint with Carmel Domshlak, Raz Nissim, Shlomi Maliah, and Guy Shani

28/12/2016 11:30, Meyer 861

Secure Computation in Hostile Environments

Daniel Genkin

University of Pennsylvania and University of Maryland

Computer systems are everywhere, often controlling critical processes and containing sensitive secret information. However, the ubiquitous nature of computer systems also means that such systems often operate in hostile environments where they are subjected to various attacks by adversarial parties. Even if a system’s security is theoretically proven under some set of assumptions, when faced with real-world attack scenarios, many theoretical assumptions become flaky, inaccurate and often completely incorrect.

In this talk I will present two examples of this gap between security theory and security practice:

* Utilizing unintentional and abstraction-defying side-channel leakage from physical computing devices in order to extract secret cryptographic keys and the relation of these attacks to leakage resilient cryptography.

* Constructing and deploying secure computation schemes for arbitrary C programs.

The talk will discuss cryptographic techniques and will include live demonstrations.

27/12/2016 11:30, Taub 301

Temporal planning: towards highly utilized clouds

Ishai Menache

Microsoft Research, Redmond

Existing resource management frameworks for large-scale cloud systems leave unresolved the problematic tension between high resource utilization and jobs’ performance predictability – coveted by operators and users, respectively. In this talk, I will present recent efforts to resolve this tension through temporal planning: unlike popular scheduling and routing schemes, we propose mechanisms that plan the resource allocations into future time steps. Intuitively, such planning allows the operator to pack the cloud more densely, while offering performance SLAs to users. I will describe two recent systems that incorporate forms of temporal planning: (i) Morpheus – a cluster resource management system that offers automated SLAs to customers while reducing the cluster footprint; and (ii) Pretium – a combined traffic engineering and pricing framework for WAN bandwidth.

Bio:
Ishai Menache is a researcher at Microsoft Research, Redmond. He received his PhD from the Technion – Israel Institute of Technology. Subsequently, he was a postdoc at the Laboratory for Information and Decision Systems at MIT. Ishai’s research focuses on developing large-scale resource management and optimization frameworks for datacenters. More broadly, his areas of interest include systems and networking, algorithms and machine learning.

21/12/2016 11:30, Meyer 861

Challenges in Modeling, Optimization and Control of Energy Networks (Transmission and Distribution of Power and Natural Gas)

Dr. Michael Chertkov

Los Alamos National Lab (LANL)

Challenges in the simulation, optimization and control of natural gas transmission systems and their coupling to power transmission systems are reviewed in this presentation, describing research on the subject by the Grid Science Team at LANL. In this presentation I will describe opportunities, but also challenges, emerging in view of new dependencies between regional, national, and international power and natural gas systems. The availability of natural gas and the need for cleaner electric power have prompted widespread installation of gas-fired power plants and caused electric power systems to depend heavily on reliable gas supplies. The use of gas generators for base load and reserve generation causes high intra-day variability in withdrawals from high-pressure gas transmission systems, which leads to gas price fluctuations and supply disruptions that affect electric generator dispatch and threaten the security of power and gas systems. The new situation poses new problems for optimizing dispatch schedules and gas compressor protocols, which need to be compared with the status quo solutions. Some early work on this emerging subject will be discussed. In this talk I will also attempt to give a broader overview of the variety of questions, methods and solutions from physics (of electric and gas flows), statistics, applied mathematics, optimization, control, machine learning and other theoretical engineering disciplines which are expected to play a significant role in this exciting area of applied research.

Bio:
Dr. Chertkov’s areas of interest include statistical and mathematical physics applied to energy and communication networks, machine learning, control theory, information theory, computer science, fluid mechanics and optics. Dr. Chertkov received his Ph.D. in physics from the Weizmann Institute of Science in 1996, and his M.Sc. in physics from Novosibirsk State University in 1990. After his Ph.D., Dr. Chertkov spent three years at Princeton University as a R.H. Dicke Fellow in the Department of Physics. He joined Los Alamos National Lab in 1999, initially as a J.R. Oppenheimer Fellow in the Theoretical Division. He is now a technical staff member in the same division. Dr. Chertkov has published more than 150 papers in these research areas. He is an editor of the Journal of Statistical Mechanics (JSTAT), associate editor of IEEE Transactions on Control of Network Systems, member of the Editorial Board of Scientific Reports (Nature Group), a fellow of the American Physical Society (APS) and a senior member of IEEE.

14/12/2016 11:30, Taub 401

Resampling with Feedback – A New Paradigm of Using Workload Data for Performance Evaluation

Dror Feitelson

Computer Science, Hebrew University

Reliable performance evaluations require representative workloads. This has led to the use of accounting logs from production systems as a source for workload data in simulations. I will survey 20 years of ups and downs in the use of workload logs, culminating with the idea of resampling with feedback. It all started with the realization that using workload logs directly suffers from various deficiencies, such as providing data about only one specific situation, and lack of flexibility, namely the inability to adjust the workload as needed. Creating workload models solves some of these problems but creates others, most notably the danger of missing out on important details that were not recognized in advance, and therefore not included in the model. Resampling solves many of these deficiencies by combining the best of both worlds. It is based on partitioning the workload data into basic components (e.g. the jobs contributed by different users), and then generating new workloads by sampling from this pool of basic components. This allows analysts to create multiple varied (but related) workloads from the same original log, all the time retaining much of the structure that exists in the original workload. However, resampling should not be applied in an oblivious manner. Rather, the generated workloads need to be adjusted dynamically to the conditions of the simulated system using a feedback loop. Resampling with feedback is therefore a new way to use workload logs which benefits from the realism of logs while eliminating many of their drawbacks. In addition, it enables evaluations of throughput effects that are impossible with static workloads.
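
A toy version of the resampling loop (illustrative only; real feedback models couple user behavior to system responsiveness far more carefully): partition the log into per-user job sets, build a new workload by sampling whole users, and let the simulated system load feed back into when jobs are submitted.

    import random

    random.seed(42)
    # Partition the original log into basic components: per-user job lists.
    log = {f"user{u}": [random.expovariate(1.0) for _ in range(5)]
           for u in range(10)}

    # Resample: build a new, related workload by sampling whole users.
    workload = random.sample(sorted(log), k=4)

    # Feedback: a user submits its next job only when the simulated system
    # is not overloaded, so the generated workload adapts to the system.
    threshold, backlog, submitted = 2.0, 0.0, 0
    for user in workload:
        for runtime in log[user]:
            while backlog > threshold:
                backlog = max(0.0, backlog - 1.0)  # the system drains
            backlog += runtime
            submitted += 1

    print(f"replayed {submitted} jobs from {workload}")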

Bio:
Dror Feitelson is a professor of Computer Science at the Hebrew University of Jerusalem, where he has been on the faculty of the Rachel and Selim Benin School of Computer Science and Engineering since 1995. His research emphasizes experimental techniques and real-world data in computer systems performance evaluation, and more recently also in software engineering. Using such data he and his students have demonstrated the importance of using correct workloads in performance evaluations, identified commonly made erroneous assumptions that may call research results into question, and developed methodologies to replace assumptions with real data. Other major contributions include co-founding the JSSPP series of workshops (now in its 20th year), establishing and maintaining the Parallel Workloads Archive (which has been used in about a thousand papers), and a recent book on Workload Modeling published by Cambridge University Press in 2015.

7/12/2016 11:30, Meyer 861

From Theory to Practice: The Actual Outcome of Two ‘Somewhat Disjoint’ Network

Jose Yallouz

Electrical Engineering, Technion and Intel

Network Function Virtualization (NFV) is a novel paradigm that enables flexible and scalable implementation of network services on cloud infrastructure, while network survivability is a traditional, well-studied subject concerned with maintaining network service continuity in the presence of failures. This talk will address these two important subjects through the results of two recent papers, namely “Optimal Link-Disjoint Node-‘Somewhat Disjoint’ Paths” [1] and “The Actual Cost of Software Switching for NFV Chaining” [2].

In [1], we investigate a crucial research problem in this context: the identification of suitable pairs of disjoint paths. Here, “disjointness” can be considered in terms of either nodes or links. Accordingly, several studies have focused on finding pairs of either link- or node-disjoint paths with a minimum sum of link weights. In this study, we investigate the gap between the optimal node-disjoint and link-disjoint solutions. Specifically, we formalize several optimization problems that aim at finding minimum-weight link-disjoint paths while restricting the number of their common nodes. We establish that some of these variants are computationally intractable, while for other variants we establish polynomial-time algorithmic solutions. Finally, through extensive simulations, we show that, by allowing link-disjoint paths to share a few common nodes, a major improvement is obtained in terms of the quality (i.e., total weight) of the solution.

In [2], we conduct an extensive and in-depth evaluation that examines the impact of service chaining deployments on Open vSwitch – the de facto standard software switch for cloud environments. We provide insights on network performance metrics such as latency, throughput, CPU utilization and packet processing, while considering different placement strategies of a service chain. We then use these insights to provide an abstract generalized cost function that accurately captures the CPU switching cost of deployed service chains. This cost is an essential building block for any practical optimized placement management and orchestration strategy for NFV service chaining.

[1] Jose Yallouz, Ori Rottenstreich, Peter Babarczi, and Ariel Orda, “Optimal Link-Disjoint Node-‘Somewhat Disjoint’ Paths”, IEEE ICNP ’16, Singapore, November 2016.
[2] Marcelo Caggiani Luizelli, Danny Raz, Yaniv Saar and Jose Yallouz, “The Actual Cost of Software Switching for NFV Chaining”, accepted to IEEE IM ’17.

Bio:
Jose Yallouz received a Ph.D. and a B.Sc. from the Electrical Engineering department of the Technion in 2016 and 2008, respectively. He recently joined the Core Architecture team at Intel Corporation. He was a recipient of the Israel Ministry of Science Fellowship in Cyber and Advanced Computing. Among his past activities were the organization of the CeClub seminar and the Netmeeting forum. He is mainly interested in computer networks, computer architecture and compilers.

30/11/2016 11:30, Taub 401

Practical Plausibly Deniable Encryption through Low-Level Flash Behaviors

Aviad Zuck

Computer Science, Technion

Users of solid-state disks and mobile devices may benefit from the ability to hide sensitive data in a manner that disallows powerful adversaries from detecting that data has been hidden. In this talk I present a new technique to achieve this goal by manipulating the voltage level of randomly selected flash cells to encode two bits (rather than one), such that one bit is “public” and the other is private. Our technique leverages the inherent differences between individual flash chips and the inherent noisiness of flash cell voltage levels. We demonstrate that our hidden data and underlying voltage manipulations go undetected by support vector machine based supervised learning. Our scheme also experiences low error rates that make the data recoverable months after being stored. Compared to prior work, our technique provides 24x and 50x higher encoding and decoding throughput and doubles the capacity, while being 37x more power efficient. This research project, supported by the US-Israel Binational Science Foundation (BSF) and US National Science Foundation (NSF), is a collaborative effort with researchers from Technion, Caltech, and University of North Carolina.

Bio:
Aviad is a post-doc at the Technion’s Computer Science department, working with Prof. Dan Tsafrir. His research interests are in the intersection of new storage technologies and data security. Aviad received his M.Sc. and Ph.D. from Tel Aviv University, under the supervision of Prof. Sivan Toledo, and has worked over the years in several leading storage groups in industry, including IBM, Microsoft, and PMC-Sierra.

9/11/2016 11:30, CS Taub 401

Breaking the ISA Barrier in Modern Computing

Ashish Venkat

University of California, San Diego

On-chip heterogeneity has been shown to be an effective mechanism for improving execution efficiency in general-purpose and embedded computing. Existing heterogeneous designs either feature a single ISA or multiple ISAs that statically partition work. In this talk, Ashish Venkat, a PhD candidate at UC San Diego, will describe his research that enables programs to cross a heretofore forbidden boundary — the ISA. In particular, Venkat will describe a compiler and runtime strategy for swift and seamless execution migration across diverse ISAs. Early design-space exploration from Venkat’s research shows that harnessing ISA diversity provides substantial performance gains and energy savings by allowing a program to freely migrate across heterogeneous-ISA cores during different phases of its execution, and equips chip architects with finer design choices that enable more efficient ISA-microarchitecture co-design. In addition, Venkat will also briefly discuss his recent work that demonstrates the immense security potential of cross-ISA migration to thwart buffer-overflow-based attacks such as Return-Oriented Programming. Finally, he will briefly touch upon his ongoing collaboration at HRL under the EU OPERA project, which plans to leverage the aforementioned techniques to explore the potential efficiency benefits of ISA heterogeneity in a datacenter environment.

Bio:
Ashish Venkat is a PhD Candidate in the Computer Science and Engineering department at UC San Diego, where he is a member of the High Performance Processor Architecture and Compilation lab, directed by Prof. Dean Tullsen. His research interests are in computer architecture and compilers, especially in instruction set design, processor microarchitecture, binary translation, code generation, and their intersection with computer security and machine learning. His work on Heterogeneous-ISA Chip Multiprocessors has been published at ISCA and ASPLOS.

6/9/2016 15:00, EE 1061

Progress in automatic GPU compilation and why you want to run MPI on your GPU.

Torsten Hoefler

ETH Zurich

Auto-parallelization of programs that have not been developed with parallelism in mind is one of the holy grails in computer science. It requires understanding the source code’s data flow to automatically distribute the data, parallelize the computations, and infer synchronizations where necessary. We will discuss our new LLVM-based research compiler Polly-ACC that enables automatic compilation to accelerator devices such as GPUs. Unfortunately, its applicability is limited to codes for which the iteration space and all accesses can be described as affine functions.

In the second part of the talk, we will discuss dCUDA, a way to express parallel codes in MPI-RMA, a well-known communication library, to map them automatically to GPU clusters. The dCUDA approach enables simple and portable programming across heterogeneous devices due to programmer-specified locality. Furthermore, dCUDA enables hardware-supported overlap of computation and communication and is applicable to next-generation technologies such as NVLINK. We will demonstrate encouraging initial results and show limitations of current devices in order to start a discussion.

Bio:
Torsten is an Assistant Professor of Computer Science at ETH Zurich, Switzerland. Before joining ETH, he led the performance modeling and simulation efforts of parallel petascale applications for the NSF-funded Blue Waters project at NCSA/UIUC. He is also a key member of the Message Passing Interface (MPI) Forum where he chairs the “Collective Operations and Topologies” working group.
Torsten won best paper awards at the ACM/IEEE Supercomputing Conference SC10, SC13, SC14, EuroMPI’13, HPDC’15, HPDC’16, IPDPS’15, and other conferences. He published numerous peer-reviewed scientific conference and journal articles and authored chapters of the MPI-2.2 and MPI-3.0 standards. He received the Latsis prize of ETH Zurich as well as an ERC starting grant in 2015.
His research interests revolve around the central topic of “Performance-centric System Design” and include scalable networks, parallel programming techniques, and performance modeling. Additional information about Torsten can be found on his homepage at htor.inf.ethz.ch.

22/6/2016 11:30, Taub 4

Refinement Reloaded, or – Deriving Divide-and-Conquer Dynamic Programming Algorithms by Transformation

Shachar Itzhaky

Massachusetts Institute of Technology

We introduce a framework allowing domain experts to manipulate computational terms in the interest of deriving better, more efficient implementations. It employs deductive reasoning to generate provably correct efficient implementations from a very high-level specification of an algorithm, and inductive constraint-based synthesis to improve automation. Semantic information is encoded into program terms through the use of refinement types.

In this work, we develop the technique in the context of a system called Bellmania that uses solver-aided tactics to derive parallel divide-and-conquer implementations of dynamic programming algorithms that have better locality and are significantly more efficient than traditional loop-based implementations. Bellmania includes a high-level language for specifying dynamic programming algorithms and a calculus that facilitates gradual transformation of these specifications into efficient implementations. These transformations formalize the divide-and-conquer technique; a visualization interface helps users to interactively guide the process, while an SMT-based back-end certifies each step and takes care of the low-level reasoning required for parallelism.

We have used the system to generate provably correct implementations of several algorithms, including some important algorithms from computational biology, and show that the performance is comparable to that of the best manually optimized code.

Bio:
Shachar is a post-doc at MIT's Computer Science and Artificial Intelligence Laboratory, working in the Computer-Aided Programming group headed by Prof. Armando Solar-Lezama. Shachar received his M.Sc. and Ph.D. from Tel Aviv University, under the supervision of Prof. Mooly Sagiv. Prior to that he was a proud alum of the Open University.

15/6/2016 11:30 Meyer 861

Cooperative Game Theoretic Models for Network Interconnections

Gideon Blocq

Technion

Research on the application of game theory in the context of networking has focused on non-cooperative games, where the selfish agents cannot reach a binding agreement on the way they would share the infrastructure. Many approaches have been proposed for mitigating the typically inefficient operating points. However, in a growing number of networking scenarios selfish agents are able to communicate and reach an agreement. Hence, the degradation of performance should be considered at an operating point of a cooperative game, e.g., the Nash bargaining solution, core or nucleolus. Accordingly, in this talk our goal is to lay foundations for the application of cooperative game theory to fundamental problems in networking, with a focus on routing. Depending on the scenario, we will reach conclusions on how cooperation among agents affects their own performance as well as that of the system. Finally, we will discuss network design guidelines that follow from our findings.
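
As a toy illustration of one of these cooperative solution concepts (not an example from the talk; the capacity and utility functions below are invented), the following sketch computes the Nash bargaining solution for two agents splitting the capacity of a single link:

import numpy as np
from scipy.optimize import minimize_scalar

# Toy example: two agents share a link of capacity C, agent i derives
# utility log(1 + x_i) from bandwidth x_i, and the disagreement point is
# (0, 0). The Nash bargaining solution maximizes the product of utility
# gains over all feasible splits (x, C - x).
C = 10.0

def neg_nash_product(x):
    return -(np.log(1.0 + x) * np.log(1.0 + (C - x)))

res = minimize_scalar(neg_nash_product, bounds=(0.0, C), method="bounded")
print(f"NBS split: x1 = {res.x:.2f}, x2 = {C - res.x:.2f}")  # symmetric: 5, 5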

PhD seminar, under the supervision of Prof. Ariel Orda.

Bio:
Gideon Blocq received his B.Sc. degree (cum laude) in electrical engineering from the Delft University of Technology, Delft, The Netherlands, in 2009. He is currently working towards his Ph.D. degree in electrical engineering at the Technion, Haifa, Israel. He has held internship positions at Microsoft Research, Cambridge, UK, and at Yahoo! Labs, Haifa, Israel. His research interests include the intersection of Computer Networks, Algorithms and Game Theory. Gideon Blocq is a recipient of the Google Europe Doctoral Fellowship in Computer Networking.

1/6/2016 11:30 Taub 337

Game Changing Design.

Shlomit Weiss, VP Data Center Group/GM NPG Silicon Engineering

Intel

Systems on Chip (SoC) like the 6th Generation Core, which Intel's CEO called "the best processor ever," bring huge value to many different segments and, at the same time, pose a huge design challenge. In today's market environment we face more and more conflicting requirements that must be satisfied by the same product. Examples of such conflicting requirements are: high performance but low power, high complexity but a short execution schedule, additional functionality but low cost, and more. The lecture will discuss these large challenges from a technical and managerial perspective, as well as the methods that were developed to deliver such complex SoCs.

Bio:
Shlomit Weiss is vice president in the Data Center Group and general manager of the Network Platforms Group (NPG) Silicon Engineering at Intel Corporation. In this role, Weiss manages a global team of hundreds of engineers across Israel, Europe and the United States and is responsible for all silicon hardware for networking IPs and discrete devices.
In her previous role, Weiss was the general manager of R&D for client products at Intel. Among the other client products within her group, she was responsible for the execution and delivery of the 6th Generation Core processor, which was described by Intel's CEO as the "best processor ever." Additionally, Weiss led the organization that was responsible for the design and delivery of the 2nd Generation Core processor.
A 26-year veteran of Intel, Weiss is based in Israel and has spent one-third of her Intel career in U.S. positions. During her tenure at Intel, Weiss has worked on a variety of projects, including validation, design and architecture. She received an Intel Achievement Award, the company’s highest recognition, for her contributions to the Intel Core 2 Duo architecture.
Weiss holds a master’s degree cum laude in electrical engineering from the Technion in Israel.

25/5/2016 11:30 Taub 337

Imperfection is Beautiful and Efficient: Approximate Computing from Language to Hardware, and Beyond.

Luis Ceze

University of Washington

A significant proportion of computer system resources are devoted to applications that can inherently tolerate inaccuracies in their data, execution and communication. Hence, "approximate computing" is promising for performance and energy efficiency. However, taking advantage of approximate computing requires: language support to specify where and how to apply approximation; analyses and mechanisms that ensure good output quality; and hardware/system support that takes advantage of approximation.
In this talk I will describe our effort on co-designing language, hardware and system support to take advantage of approximate computing across the system stack (compute, storage and communication) in a safe and efficient way. I will end with some thoughts on how effective approximate computing techniques can not only improve computer systems in the short and medium term but can also enable the use of new substrates and applications.

Bio:
Luis Ceze is the Torode Family Associate Professor in the Computer Science and Engineering Department at the University of Washington. His research is on the intersection of computer architecture, programming languages, OS and biology. He is a recipient of an NSF CAREER Award, a Sloan Research Fellowship, a Microsoft Research Faculty Fellowship and the 2013 IEEE TCCA Young Computer Architect Award. He was born and raised in Sao Paulo, Brazil, where it drizzles all the time; he now (mostly) lives in the similarly drizzly Seattle. When he is not working he is found either eating or cooking.

18/5/2016 11:30 Meyer 861

A Brief History of Time in Software Defined Networks

Tal Mizrahi

Technion

The emerging trend of Software Defined Networks (SDN) has been enthusiastically explored by researchers over the last decade, as it provides the flexibility and agility to update networks dynamically, in a way that efficiently utilizes the network resources.
In this talk we explore the use of accurate time and synchronized clocks as a tool for coordinating network updates in SDNs. We discuss key use cases in which using time introduces a significant advantage compared to previously known network update approaches. A practical approach to using accurate time in SDN is presented, and tools providing an efficient accurate scheduling method are proposed. This work also defines extensions to standard network protocols, enabling practical implementations of our concepts. Accurate time is shown to be a powerful feature that can be leveraged by various diverse SDN applications.

Bio:

4/5/2016 11:30 Meyer 861

IoT-Enabled Community Care Provisioning for Sustainable Ageing-in-Place: A Singapore Example

Tan Hwee Pink

School of Information Systems, Singapore Management University

In 2014, 12.4% of the population in Singapore were above 65 years of age, and this is projected to increase to 19% by 2030. Among them, the number living alone is likely to increase to 83,000 by 2030, up from 35,000 today. The ability to "age in place" – living where you have lived for years with a reasonable quality of life – is especially important for the latter group.
While Internet of Things (IoT)-enabled ambient intelligence environments that allow caregivers to remotely monitor a loved one's activities 24/7 are emerging, most of these systems are technology-centric, operate in silos and do not tie in with end-to-end care provisioning. Moreover, the elderly community exhibits huge variations in living patterns and behaviour, and a one-size-fits-all system will probably not work for all.
In this presentation, I will talk about SHINESeniors, an SMU-initiated effort to tackle the above issues through the integration of ambient intelligence with care provisioning, and the personalization of such systems. This research project, supported by the Ministry of National Development and National Research Foundation under the Land and Liveability National Innovation Challenge (L2NIC) funding, is a collaborative effort with A*STAR, Eastern Health Alliance, a voluntary welfare organization, GoodLife!, Tata Consultancy Services, Ministry of Health, Housing and Development Board, and Urban Redevelopment Authority in Singapore.

Bio:
Dr. Hwee-Pink Tan currently leads a team of 10 technology and social science researchers to bring together Internet of Things technologies and social-behavioural research to enable and sustain ageing-in-place, leading, in a broader sense, to intelligent and inclusive societies.
Prior to joining SMU in March 2015, he spent 7 years at the Institute for Infocomm Research (I2R), A*STAR, where he was a Senior Scientist and concurrently the SERC Programme Manager for the A*STAR Sense and Sense-abilities Program. In this programme, he led a team of 30 full-time research scientists and engineers to design, pilot and evaluate architectures to support large scale and heterogeneous sensor systems to enable Smart City applications. In recognition of his contributions, he was awarded the I2R Role Model Award in 2012 and 2013, and the A*STAR Most Inspiring Mentor award, TALENT and Borderless Award in 2014.
He graduated from the Technion, Israel Institute of Technology, Israel in August 2004 with a Ph.D. In December 2004, he was awarded the A*STAR International Postdoctoral Fellowship. From December 2004 to June 2006, he was a post-doctoral researcher at EURANDOM, Eindhoven University of Technology, The Netherlands. He was a research fellow with The Telecommunications Research Centre (CTVR), Trinity College Dublin, Ireland between July 2006 and March 2008.
He is a Senior Member of the IEEE, has published more than 100 papers, has served on executive roles for various conferences on wireless sensor networks, and is an Area Editor of the Elsevier Journal of Computer Networks. He was Deputy Chair for the ITSC IoT Committee between July 2014 and March 2015. Lastly, he also recently co-founded and chairs the technology and innovation committee of the Stroke Support Station, a registered charity with a focused mission to help Stroke Survivors re-learn and enjoy active living for a better quality of life.

20/4/2016 13:30 Meyer 861

HDR – EnHAnts – Routing in Energy Harvesting Networks

Adrian Segall

Technion and Bar Ilan University

EnHAnts – Energy Harvesting Networked Tag Networks are networks with nodes that harvest energy from solar or movement cells. In this work we present HDR – Hysteresis Driven Routing, the first algorithm for routing messages in such networks. In addition to honoring Philip Merlin's memory, the aim of this talk is to get Technion graduate students interested in this novel thesis topic.

Work performed in collaboration with Alex Lavzin, Sapir Erlich and Michal Yomtovian.

Bio:

13/4/2016 11:30 Meyer 861

Pricing Complexity

Noam Nisan

Hebrew University

As economic systems “move” to the Internet, they can become much more complex and this new complexity often becomes their defining characteristic. We will consider a very simple scenario of this form: a single seller that is selling multiple items to a single buyer. We will discuss the question of how *complex* must the pricing scheme be in order for the seller to maximize (approximately, at least) his revenue.
Based on joint works with Sergiu Hart, with Shaddin Dughmi and Li Han, and with Moshe Babaioff and Yannai Gonczarowski.

Bio:
Noam Nisan is a professor of Computer Science and a member of the Center for the Study of Rationality at the Hebrew University of Jerusalem. His research deals with the border of Computer Science, Economic Theory, and Game Theory, focusing on Electronic Markets and Auctions.

6/4/2016 11:30 Taub 401

Near Optimal Placement of Virtual Network Functions

Seffi Naor

Technion

Network Function Virtualization (NFV) is a new networking paradigm where network functions are executed on commodity servers located in small cloud nodes distributed across the network, and where software defined mechanisms are used to control the network flows. This paradigm is a major turning point in the evolution of networking, and it introduces high expectations for enhanced economical network services, as well as major technical challenges. In this talk we address one of the main technical challenges in this domain: the actual placement of the virtual functions within the physical network. This placement has a critical impact on the performance of the network, as well as on its reliability and operation cost. We perform a thorough study of the NFV location problem, show that it introduces a new type of optimization problems, and provide near optimal approximation algorithms guaranteeing a placement with theoretically proven performance. The performance of the solution is evaluated with respect to two measures: the distance cost between the clients and the virtual functions by which they are served, as well as the setup costs of these functions. Our main result is bicriteria solutions reaching constant approximation factors with respect to the overall performance, and violating the capacity constraints of the networking infrastructure by only a constant factor as well.

Joint work with Rami Cohen, Liane Lewin-Eytan, and Danny Raz.

Bio:
Seffi Naor received his Ph.D. in computer science from the Hebrew University of Jerusalem. He was a post-doc at the University of Southern California and Stanford University. He is currently a professor of computer science at the Technion – Israel Institute of Technology, where he has been on the faculty since 1991. Seffi Naor's research interests span a wide spectrum of topics in the design and analysis of efficient algorithms, in particular approximation algorithms for NP-hard problems, on-line algorithms, network algorithmics, algorithmic game theory, and randomization. He is a frequent visiting scientist at industrial research labs, in particular Microsoft Research, Bell Labs, and IBM T. J. Watson. During 1998-2000 he was a member of the technical staff at Bell Labs, and during 2005-2007 he was a visiting researcher at Microsoft Research. Seffi Naor has published over 100 papers in top professional journals and conferences. He is currently on the editorial boards of Algorithmica and TALG, and he has been on the program committees of numerous conferences. He has won several awards, including the Bergman award, the 2007 Pat Goldberg Memorial best paper award given by IBM Research, and the FOCS 2011 best paper award.

30/3/2016 11:30 Taub 401

TinyLFU: A Highly Efficient Cache Admission Policy

Roy Friedman

Technion

In this talk, I introduce a frequency based cache admission policy in order to boost the effectiveness of caches subject to skewed access distributions. Given a newly accessed item and an eviction candidate from the cache, our scheme decides, based on the recent access history, whether it is worth admitting the new item into the cache at the expense of the eviction candidate.
Realizing this concept is enabled through a novel approximate LFU structure called TinyLFU, which maintains an approximate representation of the access frequency of a large sample of recently accessed items. TinyLFU is very compact and light-weight as it builds upon Bloom filter theory.
I will present a study of the properties of TinyLFU through simulations of both synthetic workloads and multiple real traces from several sources. These simulations exhibit the performance boost obtained by enhancing various replacement policies with the TinyLFU admission policy. In addition, a new combined replacement and eviction policy scheme, nicknamed W-TinyLFU, will be presented. W-TinyLFU is demonstrated to obtain equal or better hit ratios than other state-of-the-art replacement policies (including ARC and LIRS) on these traces, and it is the only scheme to obtain such good results on all of them.
An open source implementation of TinyLFU and W-TinyLFU is available as part of the Caffeine Java 8 high performance caching library.
Joint work with Gil Einziger and Ben Manes.
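
To make the admission idea concrete, here is a minimal sketch (illustrative only, not the paper's implementation; the count-min-style counters and the aging threshold are invented stand-ins for TinyLFU's compact Bloom-filter-based structure):

import hashlib

class ApproxFrequency:
    """Count-min-style counters with periodic halving ('aging'); an
    invented stand-in for TinyLFU's compact frequency structure."""
    def __init__(self, width=1024, depth=4):
        self.width, self.depth = width, depth
        self.rows = [[0] * width for _ in range(depth)]
        self.total = 0

    def _slots(self, key):
        for d in range(self.depth):
            h = hashlib.sha256(f"{d}:{key}".encode()).digest()
            yield d, int.from_bytes(h[:8], "big") % self.width

    def add(self, key):
        for d, i in self._slots(key):
            self.rows[d][i] += 1
        self.total += 1
        if self.total >= 16 * self.width:      # age: halve every counter
            self.rows = [[c // 2 for c in row] for row in self.rows]
            self.total //= 2

    def estimate(self, key):
        return min(self.rows[d][i] for d, i in self._slots(key))

def admit(freq, new_item, victim):
    """TinyLFU-style admission: admit the newly accessed item only if its
    recent frequency beats that of the eviction candidate."""
    return freq.estimate(new_item) > freq.estimate(victim)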

Bio:
Roy Friedman is an associate professor in the department of Computer Science at the Technion – Israel Institute of Technology. His research interests include Distributed Systems and Networking with emphasis on Fault-Tolerance, Consistency, Caching and Mobile Computing. Roy Friedman serves as associate editor for IEEE TDSC and has served as PC co-chair for ACM DEBS, ACM SYSTOR and Autonomics, as well as vice chair for IEEE ICDCS and EuroPar. Formerly, Roy Friedman was an academic specialist at INRIA (France) and a researcher at Cornell University (USA). He is a founder of PolyServe Inc. (acquired by HP) and holds a Ph.D. and a B.Sc. from the Technion.

23/3/2016 11:30 Meyer 861

High speed I/O ports for Client Compute devices

Shahaf Kieselstein

Intel

The high-speed I/O market in client platforms has been going through major changes in the last few years, with the latest silicon processes enabling high bit rates on the wire while using low power to drive those bits. Where are those trends leading? What is the impact on the user? In this lecture we will review the changes in the USB and Thunderbolt technologies over the last few years, and look forward to estimate where the latest developments will land and what new use cases they will make possible.

Bio:
Shahaf Kieselstein is a Vice President in the Client Computing Group and General Manager of the Client Connectivity Division at Intel Corporation. He is responsible for Thunderbolt technology product lines and go-to-market activities.

16/3/2016 11:30 CS, Taub 401

Technology Considerations in Computer Architecture

Prof. Jean-Luc Gaudiot

University of California, Irvine

Good engineering practice uses the characteristics of existing technologies to optimize implementation. Often, this will mean that design techniques optimal in a previous generation prove impractical or even unusable when a new technology becomes dominant. This rule is all too often forgotten, which we will demonstrate in two problems of computer design: Field-Programmable Gate Arrays (FPGA) and hardware prefetchers (providing the ability to fetch data early in anticipation of the need). FPGAs are extremely useful in mobile embedded systems where computing power and energy considerations are major concerns. Partial reconfiguration is often used to reduce power consumption when parts of the array are inactive, albeit at the cost of high energy overhead due to the large cost of transferring configuration information. Our study reveals that partial reconfiguration accelerates execution and reduces overall energy consumption by half. Second, we will demonstrate how increased transistor integration allows hardware prefetching to improve both energy-efficiency and performance.

Bio:
Jean-Luc Gaudiot received the engineering degree from ESIEE, Paris, France in 1976 and the M.S. and Ph.D. degrees in Computer Science from UCLA in 1977 and 1982, respectively. He is currently Professor in the Electrical Engineering and Computer Science Department at UC Irvine. Prior to joining UCI in 2002, he had been Professor of Electrical Engineering at the University of Southern California since 1982. His research interests include multithreaded architectures, fault-tolerant multiprocessors, and implementation of reconfigurable architectures. He has published over 250 journal and conference papers. His research has been sponsored by NSF, DoE, and DARPA, as well as a number of industrial companies. He has served the community in various positions and was just elected to the presidency of the IEEE Computer Society for 2017.

9/3/2016 11:30 CS, Taub 337

Enabling Breakthroughs in Parkinson’s Disease with Wearables and Big Data Analytics

Shahar Cohen

Intel@Technion

Parkinson's Disease (PD) is a progressive, degenerative disorder of the central nervous system. It is characterized by significant motor symptoms, such as tremor, slowness of movement, and gait deficiencies, but also entails non-motor symptoms, such as depression, cognitive slowness and sleep difficulties.
Intel Corporation is running a joint project with the Michael J. Fox Foundation for Parkinson's Research to promote research on PD as well as patients' daily care. As part of this project, sensory data from wearable devices is collected through a massive health-IoT platform that was developed by Intel. The sensor data is then analyzed with machine learning and digital signal processing algorithms, and objective Parkinsonian measures are extracted.
In this talk we will introduce the promise, as well as the algorithmic challenges, on the way to enabling breakthroughs in PD through the usage of wearable devices and big data analytics.
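
As a toy illustration of the kind of signal processing involved (not the project's actual algorithms; the sampling rate and threshold are invented), the sketch below estimates how much of an accelerometer signal's power falls in the 4-6 Hz band where Parkinsonian rest tremor typically lies:

import numpy as np

def tremor_band_power(accel, fs=50.0, band=(4.0, 6.0)):
    """Fraction of signal power in `band` (Hz). `accel` holds
    acceleration-magnitude samples taken at `fs` Hz."""
    x = accel - np.mean(accel)                 # drop gravity / DC offset
    spectrum = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    return spectrum[in_band].sum() / (spectrum.sum() + 1e-12)

# e.g., flag a time window w as tremor-like if tremor_band_power(w) > 0.5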

Bio:
Shahar Cohen is a data scientist and product visionary and strategist at Intel. Currently, he helps build the vision for the Intel and Michael J. Fox Foundation joint venture for enabling breakthroughs in research on Parkinson's disease through big data analytics, and develops algorithms to support and materialize that vision.
Before joining Intel, Shahar gathered entrepreneurship experience as a co-founder and CTO of Optimove.com, a customer retention automation platform. He has also served as a machine learning and product consultant to several Israeli startups.
Shahar has over 15 years of experience in machine learning, data mining, and decision support systems, as a hands-on algorithms developer as well as an entrepreneur and strategist.

10/2/2016 11:30 CS, Taub 337

On the way to visual understanding…

Dr. Gershom Kutliroff

Intel@Technion

In the past several years, the field of computer vision has chalked up significant achievements, fueled by new algorithms (such as deep neural networks), new hardware (such as consumer 3D cameras) and newly available processing (such as GPUs). When we consider the problems that tomorrow's household robots and autonomous vehicles will have to solve, however, there is evidently still a way to go. In this talk, I will discuss current work within Intel's Perceptual Computing group on a scene understanding pipeline based on 3D camera data. The aim of this work is to enable a far deeper understanding of an environment than existing techniques currently provide.

Bio:

Dr. Gershom Kutliroff. Over the last 20 years, Gershom has held multiple positions in the computer vision and graphics industries, leading R&D efforts in the areas of human-computer interaction and visual understanding, and he is an inventor on over 20 patents and patents pending. He was the CTO and co-founder of Omek Interactive, which developed hand tracking and gesture control technology and was acquired by Intel in 2013. Dr. Kutliroff continues to pursue his interest in computer vision as a Principal Engineer within Intel's Perceptual Computing group. He earned his Ph.D. and M.Sc. in Applied Mathematics from Brown University, and his B.Sc. in Applied Mathematics from Columbia University. Dr. Kutliroff is married with five children, but he still holds out hope to one day hike the Appalachian Trail.

27/1/2016 11:30 EE, Meyer 1061

Cloud Computing – The Beginning Or The End?

Nava Levy

Intel@Technion

Cloud computing is considered to be one of the most important paradigm shifts of this century. In this presentation we will explain what cloud computing is all about, review the key drivers and inhibitors of cloud adoption, and examine what is fueling its exponential growth. We will examine its effects across the ecosystem and across megatrends such as Mobile and the Internet of Things, including the threats and the opportunities. We will also try to determine where we are in this shift – is the end in sight or is it just the beginning? Finally, we will discuss some of its implications for companies, entrepreneurs and academia.

Bio:
Nava Levy joined the Strategic Technologies Group of Intel Labs, where she is responsible for identifying and incubating long-lead technologies that are disruptive or transformational for the domain of Cloud and Big Data. Nava brings with her over 20 years of experience in hi-tech, including almost a decade in the Cloud and Big Data domains in a variety of roles. Most recently, Nava founded LerGO (www.lergo.org.il), a cloud-based nonprofit venture dedicated to kids' education. Prior to that, Nava led cVidya's efforts in SaaS and Big Data as VP Cloud Solutions, and before that she was the head of Amdocs' SaaS/Cloud program.

20/1/2016 11:30 CS, Taub 401

The Fascinating Structure of Planar Graphs

Oren Weimann

Haifa University

Graph optimization problems are the most studied problems in theoretical computer science. These problems are not only mathematically intriguing. They have a crucial impact on our increasingly-computerized environment. Among the tractable optimization problems, the most fundamental are shortest paths, maximum flow, minimum cut, maximum matching, and minimum spanning tree. On planar graphs, these problems are all inter-related through fascinating connections. By exploiting these connections, and the remarkable wealth of structural properties exhibited by planar graphs, these problems can all be solved on planar graphs faster than on general graphs.

Research on planar graphs today is in an auspicious position where theoretical and practical goals are well aligned. On the theoretical side, research on planar graphs has produced over the last few years a variety of surprising and deep new ideas and techniques. On the practical side, planar graphs are useful for modeling real-world scenarios such as road networks, VLSI layouts, medical imagery, and satellite imagery. Presently there is a growing need for faster algorithms, driven by a dramatic increase in both the number of instances and their sizes. For example, GPS navigation and image processing induce planar graph optimization problems and are ever-more popular in a world with more than a billion smartphones.

In the talk, I will give a high level overview on the connections between shortest paths and various other optimization problems in planar graphs. In particular, I will show how many classical optimization problems can be solved on planar graphs in nearly linear time.

Bio:

Oren is a faculty member in the Computer Science department at the University of Haifa. He received his Ph.D. in 2009 from the Massachusetts Institute of Technology under the guidance of Prof. Erik Demaine. Oren’s research is about the design and analysis of algorithms and data structures for combinatorial optimization problems. Mostly, Oren is interested in optimization problems arising in pattern matching and in planar graphs and involving shortest paths.
The talk will be based on a graduate course on planar graphs that Oren teaches regularly at the University of Haifa.

13/1/2016 11:30 CS, Taub 337

Network Measurement in the World of Crowd-Sourcing

Scott Kirkpatrick

The Hebrew University of Jerusalem

Traditionally, network measurement takes a small-data approach. Data is expensive, must be gathered unobtrusively, validated carefully, and used to address sharply-defined problems, if we are to obtain reliable answers. The undeniable existence of Big Data in and around telephone and data networks (which merged years ago) and the presence of tools for learning from unstructured masses of data are changing this. A recent EU tender has called for a "crowdsourcing" approach to characterizing Internet performance at its edge in all European countries, with protocols for gathering the data needed from many sources, archiving it for wider use, and providing privacy guidelines, while making all of this web-visible to all citizens.

The types of data and measurement systems involved in traditional passive and active network measurements have evolved as new means of observation emerge. The forces driving the evolution are much cheaper hardware for measurements at the edge, making ubiquitous measurement possible, and new customers for this information. Operators and researchers are now joined by end users and regulators, who all want to know whether they are getting the service and performance that they are paying for and, if not, why.

Really big data sets are now available from telephone call records (CDRs and MCDRs) from every conversation, from cellphone sensors that monitor location and signal quality, and from smart phones, which additionally log all the applications that users choose to run and the associated data traffic volumes. I will show examples of the information that can be gleaned from a million or more smartphones. Besides monitoring all sorts of carrier and customer uses of new technologies, it is possible to make the internet's interior visible from its edge. Thus we can assess network neutrality and pinpoint sources of performance issues.

Bio:

Scott Kirkpatrick is Professor of Engineering and Computer Science at the Hebrew University, Jerusalem, Israel. His research in recent years has focused on understanding the growth of living engineering organisms such as the Internet, and especially on the technical and human aspects of the huge space at the edges of the Internet. He was a researcher and manager in physics and computer science at the IBM T.J. Watson Research Center, leading projects in new technologies for personal computing. In 2000 he moved to his present position. At IBM he led a team which prototyped and then shipped IBM's first ThinkPad, a pen-driven tablet computer. As a physicist, his work on percolation and spin glasses led to simulated annealing, an optimization framework widely employed in automated computer design.

11/1/2016 11:30 CS, Taub 337

Cartoon Physics

Rob Cook (TCE Guest Talk)

Pixar

Pixar's animated films are created using computer graphics, so the characters are constructed and animated in a virtual 3D world. Manipulating that world involves using physics and math for everything from sculpting the shapes of objects to animating the characters to lighting the scenes to texturing the surfaces to simulating the motion of clothes and hair. And although cartoon physics is based on classical physics, it has to adapt to the wacky things animators do. It has to be responsive to artistic control and do things that aren't physically possible while still looking physically plausible.

Bio:

Rob Cook was the co-architect and primary author of Pixar's RenderMan software, which creates photo-realistic computer images. In 2001, he and two colleagues received Oscars for their contributions, the first ever given for software. RenderMan was used for 23 of the last 25 films to win a Visual Effects Academy Award. He has a BS in physics from Duke University and an MS in computer science from Cornell University. At Cornell, he worked on simulating realistic surfaces, taking computer-generated images beyond the limited plastic look they had at the time. In 1981, he joined Lucasfilm/Pixar, where he developed the first programmable shader; programmable shading is now an essential part of GPUs and game engines as well as high-end renderers. He was the first to use Monte Carlo techniques in computer graphics, which was essential for simulation of complex, realistic lights and camera effects. The latter proved particularly important in the special effects industry, because it allowed computer-generated imagery to match the motion blur and depth of field of the live-action footage with which it was combined. In 1987, he received the ACM SIGGRAPH Computer Graphics Achievement Award in recognition of these contributions, and in 2009, he received the ACM SIGGRAPH Stephen A. Coons Award for his lifetime contributions to the field. In 1999, he was inducted as a Fellow of the Association for Computing Machinery. He is one of only two people ever named to both the Academy of Motion Picture Arts and Sciences and the National Academy of Engineering.

6/1/2016 11:30 CS, Taub 337

AVX-512: new opportunities and challenges for compilers and programmers

Ayal Zaks

Intel@Technion

The next generation of Intel's multicore and many-core product lines will use AVX-512, the biggest extension to the Intel Instruction Set Architecture (ISA) to date. This extension will provide new HW capabilities, yet poses a critical challenge for SW: how can modern compilers and programmers make efficient use of these new capabilities? This talk will discuss recent research and development advancements in coping with this challenge, in terms of innovative compilation technology, programming models, and the interaction between the two.

Bio:

Ayal Zaks joined Intel’s Software and Services Group in Haifa late in 2011 where he manages a compiler development team. Prior to that Ayal spent 15 years at the IBM Haifa Research Laboratory where he worked on compiler optimizations and managed its compiler technologies group. In parallel, Ayal has been active academically, working with research students, serving on program committees, publishing nearly fifty papers and co-organizing international workshops. He received B.Sc., M.Sc., and Ph.D. degrees in Mathematics and Operations Research from Tel-Aviv University, and is an adjunct lecturer on Compilation at the Technion. In recent years, he is a parental fan of FIRST robotics competitions.

30/12/2015 11:30 EE, Meyer 861

Schemes for Network Survivability in QoS-Supporting Architectures

Jose Yallouz

Israel Institute of Technology

Coping with network failures has been recognized as an issue of major importance in terms of security, stability and prosperity. It has become clear that current networking standards fall short of coping with the complex challenge of surviving failures. The need to address this challenge has become a focal point of networking research. Accordingly, the goal of this research is to establish a comprehensive methodology for efficiently providing network survivability in conjunction with other major QoS requirements. This novel methodology should reflect a multi-perspective consideration, including theoretical aspects as well as the development of new QoS metrics.

In this talk, we will mainly focus on the concept of tunable survivability, which offers major performance improvements over traditional approaches. Indeed, while the traditional approach is to provide full (100%) protection against network failures through disjoint paths, it was realized that this requirement is too restrictive in practice. Tunable survivability provides a quantitative measure for specifying the desired level (0%-100%) of survivability and offers flexibility in the choice of the routing paths in unicast and broadcast transmission methods. For unicast, we establish efficient algorithmic schemes for optimizing the level of survivability under additive end-to-end QoS bounds. Moreover, we establish some (in part, counter-intuitive) properties of the optimal solution. For broadcast, we investigate the application of tunable survivability for the maintenance of spanning trees under the presence of failures and establish efficient algorithmic schemes for optimizing the level of survivability under various QoS requirements. In addition, we derive theoretical bounds on the number of required trees for maximum survivability. Finally, through extensive simulations, we demonstrate the effectiveness of the tunable survivability concept for both methods.
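
To make the survivability trade-off concrete, the toy calculation below (a sketch with invented numbers and independent link failures, not the talk's formal model) shows how letting the two paths share a few links trades reliability for shorter paths:

# Two s-t paths share k links and have a and b private links each.
# At least one path survives iff all shared links survive AND at least
# one of the two private segments survives.
def pair_survivability(p, k, a, b):
    q = 1.0 - p                                # single-link survival prob.
    return (q ** k) * (1.0 - (1.0 - q ** a) * (1.0 - q ** b))

# Fully disjoint pair vs. a pair that shares 2 links but is shorter:
print(pair_survivability(0.05, 0, 5, 5))       # ~0.949 (100% disjoint)
print(pair_survivability(0.05, 2, 3, 3))       # ~0.884 (partially disjoint)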

If time permits, we will also cover the following topics studied during this thesis:

  • A study which investigates the gap between the optimal node-disjoint path pair and the link-disjoint path pair, obtained by relaxing the classical “full disjointness” requirement.
  • Design aspects of Stackable Routers, i.e. a class of independent routing units operating together as a single router, and the analysis of schemes for improving their performance.
  • A study focusing on the number of shortest paths that can be covered by a single spanning tree.

Bio:

Jose Yallouz is a Ph.D. candidate in the Electrical Engineering department of the Technion, Haifa, Israel. He received his B.Sc. from the same department in 2008. He is a recipient of the Israel Ministry of Science Fellowship award in Cyber and Advanced Computing. Among his activities are the organization of the CeClub seminar and the Netmeeting forum. He is mainly interested in computer networks, algorithm design, survivability, QoS, and SDN and NFV architectures.

28/12/2015 11:30 EE, Meyer 1007

Game theory applied to Protection against SIS Epidemics in Networks

Eitan Altman

French Institute for Research in Computer Science and Automation

Defining an optimal protection strategy against viruses, spam propagation or any other kind of contamination process is an important feature for designing new networks and architectures. In this work, we consider decentralized optimal protection strategies when a virus is propagating over a network through a SIS epidemic process. We assume that each node in the network can fully protect itself from infection at a constant cost, or the node can use recovery software once it is infected. We model our system using a congestion game theoretic framework (due to Rosenthal) and find pure and mixed equilibria, and the Price of Anarchy (PoA), in several network topologies. We finally propose extensions of both the SIS propagation model and the protection strategies, and evaluate their performance. This work was done within the CONGAS European project on games in complex systems, with the authors Stojan Trajanovski, Yezekael Hayel, Huijuan Wang and Piet Van Mieghem.

Bio:

Eitan Altman received the B.Sc. degree in electrical engineering (1984), the B.A. degree in physics (1984) and the Ph.D. degree in electrical engineering (1990), all from the Technion – Israel Institute of Technology, Haifa. In 1990 he also received his B.Mus. degree in music composition from Tel Aviv University. Since 1990, Dr. Altman has been a researcher at INRIA (the French national research institute in computer science and control) in Sophia-Antipolis, France. He has been on the editorial boards of several scientific journals: Wireless Networks (WINET), Computer Networks (COMNET), Computer Communications (Comcom), J. Discrete Event Dynamic Systems (JDEDS), SIAM J. of Control and Optimisation (SICON), Stochastic Models, and Journal of Economy Dynamic and Control (JEDC). He received the best paper award at the Networking 2006, Globecom 2007, IFIP Wireless Days 2009 and CNSM 2011 (Paris) conferences. His areas of interest include network engineering games, social networks and their control, and the analysis through game theoretical models of network neutrality issues. He received in 2012 the Grand Prix de France Telecom from the French Academy of Sciences. In July 2014 he received the Isaacs Award from the Society of Dynamic Games for his contribution to game theory. More information can be found at www-sop.inria.fr/members/Eitan.Altman/

23/12/2015 11:30 EE, Meyer 861

Challenges in Blockchain Protocols

Ittay Eyal

Cornell University

The study of security in decentralized large-scale distributed systems has made a large leap forward with the introduction of cyber-currencies, starting with Bitcoin. This talk will discuss three results tackling central challenges that were exposed by this progress.

Cyber-currencies, as the name implies, implement secure and reliable digital currencies, but their promise has grown beyond currency, for example to so-called smart contracts and to digital asset infrastructure. Their success is mainly due to the introduction of Bitcoin’s blockchain protocol, which achieves a variant of consensus, secured by a proof-of-work incentive mechanism.

I will show that the bound on the resilience of blockchain protocols is significantly worse than previously thought. This is due to the selfish mining attack, which is effective against all blockchain protocols known to date. Next, I’ll show that security does not stand at odds with performance in the adversarial environments where blockchain algorithms live. I’ll present a novel blockchain protocol that achieves both security and optimal performance, demonstrated in the largest blockchain lab-experiment to date. Finally, I’ll address an issue that is general to the proof-of-work incentive mechanism, namely that participants tend to form coalitions, leading to centralization. I’ll present a game-theoretic analysis of a novel attack among such coalitions, which leads to surprising consequences.

Bio:

Ittay Eyal is a post-doctoral associate at Cornell University. His research focuses on the security and performance of large-scale distributed systems, particularly cyber-currencies. He completed his Ph.D. on distributed systems in 2013 in the Technion's Electrical Engineering Department.

16/12/2015 11:30 EE, Meyer 861

Improving parallel programs with architectural insights

Adam Morrison

Technion

Developers and researchers design parallel systems and concurrent algorithms using models, both formal and mental, that abstract away details of the underlying hardware. In practice, however, these hidden hardware/software interactions often have a dominating impact on performance. In this talk, I will show how understanding the interplay between hardware and software can lead to more efficient program execution and better algorithms, which are achieved through more precise models and architectural changes:

(1) I will show how understanding the implementation of the x86 processors' total store ordering (TSO) memory model enables designing a fence-free work stealing algorithm, which was thought to be impossible.

(2) I will describe how insights on the way software uses atomic memory operations lead to a novel hardware architecture design, which parallelizes the execution of certain atomic synchronization instructions instead of serializing them.

Bio:

Adam Morrison is a Post-Doctoral Fellow at the Technion – Israel Institute of Technology. His research focuses on building efficient infrastructure for multicore architectures. He completed a PhD in CS at Tel Aviv University, during which he was awarded an IBM PhD Fellowship as well as Intel and Deutsch prizes.

25/11/2015 11:30 EE, Meyer 861

Infrastructure sharing in cellular networks

Francesco Malandrino

Hebrew University

The talk summarizes the main challenges and opportunities of infrastructure sharing in present and future cellular networks. We begin by assessing, through real-world demand and deployment traces, how sharing can improve the efficiency of present-day networks, especially in rural areas. We then move to next-generation networks, exploring the relationship (and the potential conflict) between sharing and competition regulations. Finally, we study the relationship between traditional and virtual operators sharing the same SDN-based, virtualized core network.

Bio:
Francesco Malandrino is currently a Fibonacci Fellow at the Hebrew University of Jerusalem. Before his current appointment, he was a post-doc at CTVR/Trinity College, Dublin and at Politecnico di Torino, Italy, where he obtained his Ph.D. His interests focus on mobile networks with infrastructure.

18/11/2015 11:30 EE, Meyer 861

Forwarding in Computer Cluster Networks

Eitan Zahavi

Technion – Israel Institute of Technology

Large computer clusters are the infrastructure behind the Cloud and Web 2.0 services. These large clusters host highly parallel programs like distributed databases and MapReduce. The parallelism is required for handling the Big Data stored on these clusters. But applications with a high degree of parallelism stress the cluster network, as they tend to generate correlated traffic bursts of high bandwidth. This stress causes a long network latency tail, as well as incast and throughput collapse; these are known issues of cluster networks. The problems are intensified in the Cloud environment, where multiple such parallel applications run concurrently on the same cluster: not only do the applications suffer from their own bursts of high-throughput traffic, they might also be attacked by the other applications. These attacks may create variation in the obtained application performance and reduce runtime predictability.
In this talk, I will present my PhD research on improving computer cluster network performance for the correlated, bursty, high-capacity traffic of parallel applications. My work has focused on optimizing packet forwarding for fat-tree topologies. I will present the multiple approaches I studied and my contributions, and then focus on improving application runtime predictability in the Cloud.

Bio:
Eitan Zahavi is a PhD student at the Faculty of Electrical Engineering at the Technion, under the supervision of Professors Avinoam Kolodny, Israel Cidon and Isaac Keslassy. He is also a Senior Principal Engineer at Mellanox Technologies. His research interests include theoretical and practical aspects of data center and high-performance networks, mostly in scheduling, traffic engineering and network management.

4/11/2015 11:30 EE, Meyer 861

Towards an Inexpensive, Reliable and Intelligent Archival Solid-State Drive

Yue Li

California Institute of Technology

Archival data, once written, are rarely accessed by users, and need to be reliably retained for long periods of time. The challenge of using inexpensive NAND flash to archive cold data was posed recently for saving data center costs. Solid-state drives are faster, more power-efficient and more mechanically reliable than hard drives (HDs). However, high-density flash is vulnerable to charge leakage over time, and can only be cost-competitive with HDs in archival systems if longer retention periods (RPs) are achieved. Moreover, the size of archival data grows exponentially each year, which makes finding the data we need more difficult.

This talk describes two examples of our ongoing research to address the issues above. We first present the implementation of a coding technique named rank modulation (RM). RM reads data using the relative order of cell voltages, and is more resilient to retention errors. We show that combining RM and memory scrubbing provides more than 100 years of retention period for 1x-nm triple-level cell NAND flash. We then demonstrate an associative memory framework. The framework utilizes the random-access capability of flash memory, and solves word association puzzles with good precision and at fast speed using crowd-sourced data. We show the similarities between puzzle solving and data retrieval, and discuss our plans for expanding the current framework to more realistic data retrieval applications.
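
A minimal sketch of the rank modulation idea (illustrative only; the voltages and parameters are invented): data is carried by the relative order of cell voltages, so uniform charge leakage that preserves the ordering does not corrupt it.

def rank_mod_decode(voltages):
    """Read the stored symbol from the *relative order* of cell voltages:
    the induced permutation, not the absolute levels, carries the data."""
    return tuple(sorted(range(len(voltages)),
                        key=lambda i: voltages[i], reverse=True))

def rank_mod_encode(perm, v_max=3.0, step=0.5):
    """Program strictly decreasing target voltages along the permutation
    (v_max and step are invented illustrative values)."""
    v = [0.0] * len(perm)
    for rank, cell in enumerate(perm):
        v[cell] = v_max - rank * step
    return v

# Three cells store one of 3! = 6 symbols.
sym = (2, 0, 1)
volts = rank_mod_encode(sym)
leaked = [x * 0.8 for x in volts]      # uniform charge leakage over time
assert rank_mod_decode(leaked) == sym  # ordering, hence data, survives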

Bio:
Yue is a postdoctoral fellow at the California Institute of Technology. His research focuses on algorithms and data representations for emerging non-volatile memories. Yue worked as a research intern at LSI in 2013. He received his Ph.D. in computer science from Texas A&M University in 2014.

28/10/2015 11:30 EE, Meyer 1007

Pushing the Time Space Trade-off in Standard Compression

Danny Harnik

IBM

In this talk I'll present work on data compression that touches both on the theory of fast compression design and on practical aspects as dictated by market and product requirements.
I will describe the background, the challenges and our solutions for pushing the time limits of standard compression.
No prior knowledge of compression will be assumed.
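
As a quick illustration of the time/space trade-off in standard compression (not IBM's solution; the input data is synthetic), the snippet below times zlib's DEFLATE at different effort levels:

import time
import zlib

data = b"the quick brown fox jumps over the lazy dog " * 20000

for level in (1, 6, 9):                # fastest ... default ... best ratio
    t0 = time.perf_counter()
    out = zlib.compress(data, level)
    ms = (time.perf_counter() - t0) * 1000
    print(f"level {level}: {len(out):7d} bytes in {ms:6.1f} ms")

# Higher levels spend more CPU time searching for longer matches to shave
# bytes; "pushing the time limits" means approaching level-9 ratios at
# level-1 speeds.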

Bio:
Danny Harnik is a researcher in the cloud storage group at IBM Research – Haifa. He holds a PhD from The Weizmann Institute of Science, and his main fields of research are storage systems, compression, security and cryptography. Danny also spent 1.5 years as a post-doc at the Technion.

21/10/2015 13:30 EE, Meyer 861

PowerSpy: Location Tracking using Mobile Device Power Analysis

Yan Michalevsky

Stanford University

Modern mobile platforms like Android enable applications to read aggregate power usage on the phone. This information is considered harmless and reading it requires no user permission or notification. We show that by simply reading the phone's aggregate power consumption over a period of a few minutes, an application can learn information about the user's location. Aggregate phone power consumption data is extremely noisy due to the multitude of components and applications that simultaneously consume power. Nevertheless, by using machine learning algorithms we are able to successfully infer the phone's location. We discuss several ways in which this privacy leak can be remedied.
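
A minimal sketch of the inference idea (illustrative only; the feature extraction and classifier here are invented stand-ins for the paper's machine learning pipeline): treat location inference as supervised classification of power traces collected over known routes.

import numpy as np
from sklearn.ensemble import RandomForestClassifier

def trace_features(trace, n_points=64):
    """Resample a variable-length power trace onto a fixed grid and
    normalize it, so traces of the same route roughly align."""
    t = np.linspace(0.0, 1.0, len(trace))
    grid = np.linspace(0.0, 1.0, n_points)
    resampled = np.interp(grid, t, trace)
    return (resampled - resampled.mean()) / (resampled.std() + 1e-9)

def train_route_classifier(traces, labels):
    """traces: list of 1-D power-sample arrays; labels: a route id each."""
    X = np.stack([trace_features(tr) for tr in traces])
    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    clf.fit(X, labels)
    return clf

# clf.predict([trace_features(new_trace)]) -> the most likely route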

Bio:
Yan is a PhD student at Stanford University, advised by Prof. Dan Boneh. Working in the general areas of applied cryptography and security, he has recently focused on mobile security and privacy. His works on side-channel attacks on mobile devices were presented at the Usenix Security and BlackHat security conferences. Previously, he worked in industry as a team manager, independent consultant, and software architect and developer, mostly in the fields of networks, embedded software and security. Yan holds a BSc in Electrical Engineering from the Technion, and an MS in Electrical Engineering from Stanford University.

8/7/2015 11:30 EE, Meyer 861

Scalable and Efficient Multipath Routing: Complexity and Algorithms

Janos Tapolcai and Gabor Retvari

Budapest University of Technology and Economics

A fundamental unsolved challenge in multipath routing is to provide a sufficient diversity of disjoint end-to-end paths, each one satisfying certain operational goals (e.g., shortest possible), without overwhelming the data plane with prohibitive amounts of forwarding state. In the presentation, we study the problem of finding a pair of shortest disjoint paths that can be represented by only two forwarding table entries per destination. Building on prior work on minimum length redundant trees, we show that the underlying mathematical problem is NP-complete, and we present heuristic algorithms that improve the known complexity bounds from cubic to the order of a single shortest path search. Finally, by extensive simulations we find that it is possible to very closely attain the absolute optimal path length with our algorithms (the gap is just 1-5%), eventually opening the door for wide-scale multipath routing deployments.
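
For contrast, the snippet below shows the naive two-step baseline (not the authors' algorithm; graph and weights are whatever the caller supplies) for finding a disjoint path pair; its failure modes are part of what makes the problem interesting:

import networkx as nx

def naive_disjoint_pair(G, s, t, weight="weight"):
    """Baseline heuristic: take a shortest path, delete its edges, then
    take a shortest path in what remains. It can fail or return a
    needlessly long pair even when a short disjoint pair exists."""
    p1 = nx.shortest_path(G, s, t, weight=weight)
    H = G.copy()
    H.remove_edges_from(zip(p1, p1[1:]))
    try:
        p2 = nx.shortest_path(H, s, t, weight=weight)
    except nx.NetworkXNoPath:
        return p1, None                # no disjoint pair found this way
    return p1, p2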

Bio:
Janos Tapolcai received his M.Sc. ('00 in Technical Informatics) and Ph.D. ('05 in Computer Science) degrees from the Budapest University of Technology and Economics (BME), Budapest, and his D.Sc. ('13 in Engineering Science) from the Hungarian Academy of Sciences (MTA). Currently he is an Associate Professor in the High-Speed Networks Laboratory at the Department of Telecommunications and Media Informatics at BME. His research interests include applied mathematics, combinatorial optimization, optical networks and IP routing, addressing and survivability. He is an author of over 110 scientific publications, and he is the recipient of the Google Faculty Award and of Best Paper Awards at ICC'06 and DRCN'11. He is a TPC member of leading conferences such as IEEE INFOCOM (2012-2014), and is a winner of the MTA Momentum (Lendulet) Program.

Gabor Retvari received the M.Sc. and Ph.D. degrees in electrical engineering from the Budapest University of Technology and Economics (BME). He is now a Senior Research Fellow at the High Speed Networks Laboratory, BME. He has been passionate about the questions of routing scalability for a long time and he has done plenty of research in this field, using a diverse set of mathematical tools ranging from abstract algebra to computational geometry and, more recently, information-theory. He maintains numerous open source scientific tools written in Perl, C and Haskell.

1/7/2015 11:30 EE, Meyer 1061

Adopting Software-Defined Networking: Challenges and Recent Developments

Kirill Kogan

IMDEA Networks Institute

Initially, SDN has been promoted as a simpler and more flexible way of managing networks than traditional approaches. However, it is unlikely that a single SDN controller vendor will have a best-in-class implementation of all network services, so the coexistence of several managing controllers will be a mandatory feature of SDN network management in the near future. Similarly, there is a price for generality and expressiveness to be paid in the efficiency of the data plane representations. We consider challenges from the above domains and emphasize the importance of new abstractions for both the control and data planes to address the heterogeneity of manageable network state.

Bio:
Kirill Kogan is currently a Research Assistant Professor at IMDEA Networks Institute. His research interests are related to admission control and buffer management, packet classification, software-defined networking, and network functions virtualization. Prior to joining IMDEA Networks Institute he worked as a Technical Leader at Cisco Systems and later as a postdoctoral researcher at the University of Waterloo and Purdue University. He holds a PhD in CSE from Ben-Gurion University.

http://people.networks.imdea.org/~kirill_kogan/index.html

24/6/2015 11:30 CS, Taub 9

Finding Yourself is the Key – Biometric Key Derivation Which Keeps Your Privacy

Orr Dunkelman

Haifa University

Biometric authentication is more secure than using regular passwords, as biometrics cannot be “forgotten” and allegedly contain high entropy. Thus, many constructions rely on biometric features for authentication, and use them as a source for “good” cryptographic keys. At the same time, biometric systems carry with them many privacy concerns. Unlike regular passwords, which can be easily changed if compromised, changing biometric traits is far from easy. Hence, we need to protect the privacy of the system’s users in case of a leakage of the system’s internal “password file”.

In this talk we describe a proof-of-concept (PoC) system which transforms facial attributes from a single image into keys in a consistent, discriminative, and privacy-aware manner. The outcome is a user-specific string that cannot be guessed, and it reveals no information concerning the users of the system, even when the system’s secrets are revealed.

This is a joint work with Margarita Osadchy and Mahmood Sharif.

Bio:
Orr Dunkelman is an associate professor in the Computer Science department at the University of Haifa. His research focuses on cryptanalysis, cryptography, security, and privacy. Orr's work in symmetric-key cryptanalysis includes analyzing many ciphers and the introduction of several new cryptanalytic techniques. Orr has worked on many of the most widely deployed ciphers, such as the AES, KASUMI (used in 3G mobile networks), A5/1 (used in GSM networks), and IDEA. He has served on more than 60 conference committees, including CCS, Crypto and Eurocrypt, three times as a program chair (FSE 2009, CT-RSA 2012, SAC 2015), and has won several distinctions and awards (e.g., the Krill Prize (2014) and best paper awards at Crypto 2012 and FSE 2012).

Orr obtained his Ph.D. in computer science in 2006 from the Technion and a B.A. in computer science in 2000 from the Technion.

17/6/2015 11:30 EE, Meyer 861

Unveiling the Secrets of High-Performance Datacenters

Michael Schapira

Hebrew University

Many architectures for high-performance datacenters have been proposed to meet the ever-growing traffic demands. Surprisingly, recent studies show that even random datacenter topologies outperform much more sophisticated designs, achieving near-optimal throughput and bisection bandwidth, high resiliency to failures, incremental expandability, high cost efficiency, and more. While this highlights the suboptimality of existing datacenters, the inherent unstructuredness and unpredictability of random datacenter topologies pose obstacles to their adoption in practice. Can these guarantees be achieved by well-structured, deterministic datacenter architectures? We provide a surprising affirmative answer to this question. We show, through a combination of theoretical analyses, extensive simulations, and experiments with a network emulator, that any network topology that belongs to the extensively studied class of “expander graphs” (as indeed do random graphs) comes with these benefits, and more, thus opening a new avenue for high-performance datacenter design: turning ideas from the rich literature on deterministically constructing expander graphs into an operational reality. We leverage these insights to propose Xpander, a novel deterministic datacenter architecture that achieves all of the above desiderata, and significantly outperforms traditional datacenter designs. We discuss challenges en route to deploying Xpander (e.g., cabling, routing) and explain how these can be resolved. Our results suggest that the key to future progress on designing high-performance datacenters lies in exploring design tradeoffs within the space of expander datacenters.

Joint work with Michael Dinitz (Johns Hopkins University) and Asaf Valadarsky (Hebrew University of Jerusalem)

Bio:
Michael Schapira (http://www.cs.huji.ac.il/~schapiram/) is a senior lecturer at the School of Computer Science and Engineering, the Hebrew University of Jerusalem. His research interests lie in the design and analysis of protocols and architectures for (Inter)networked environments (e.g., routing, congestion control, traffic engineering, and more). Prior to joining the Hebrew University he was a visiting scientist at Google NYC and a postdoctoral researcher at UC Berkeley, Yale University, and Princeton University. He is a recipient of the Allon Fellowship (2011), a Microsoft Research Faculty Fellowship (2013), the IETF/IRTF Applied Networking Research Prize (2014), the Hebrew University President’s Prize (2014), and the Krill Prize (2015). He holds a BSc in Math and Computer Science, a BA in Humanities, and a PhD in Computer Science from the Hebrew University of Jerusalem.

10/6/2015 11:30 EE, Meyer 1007

TeamBoost: Enhanced Cloud Connectivity Through Collaborative MPTCP

Osnat (Ossi) Mokryn

Haifa University and Academic TLV

We present TeamBoost, a collaborative Multi-Path TCP (MPTCP) solution enabling a client device to leverage its nearby devices over a local network, i.e., WiFi. The client harnesses its nearby devices’ autonomous connectivity to deliver and relay TCP sub-streams of traffic to and from the cloud.
The motivation for TeamBoost comes from the need to supply or enhance cloud connectivity to a vehicle infotainment system (the client device). This solution boosts end-to-end performance in terms of bandwidth, latency, and robustness by leveraging the cellular devices of the vehicle occupants.

TeamBoost’s approach is to build upon the MPTCP protocol. It splits a TCP stream into multiple TCP sub-streams that traverse different communication links between two end-points. The described scenario, however, does not lend itself to a straightforward use of MPTCP, and presents considerable architectural challenges. TeamBoost therefore presents a novel architecture that leverages MPTCP’s multi-stream management capabilities over multiple devices.
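
MPTCP itself lives in the kernel, but the core idea, striping one logical stream across several links and reassembling it in order at the far end, can be illustrated at user level. A toy sketch (all names are ours, purely illustrative):

    from itertools import cycle

    def stripe(data: bytes, links: int, chunk: int = 1400):
        """Split one logical stream into per-link sub-streams,
        tagging each chunk with a sequence number for reassembly."""
        subs = [[] for _ in range(links)]
        rr = cycle(range(links))
        for seq, off in enumerate(range(0, len(data), chunk)):
            subs[next(rr)].append((seq, data[off:off + chunk]))
        return subs

    def reassemble(subs):
        tagged = [t for sub in subs for t in sub]
        return b"".join(c for _, c in sorted(tagged))

    payload = bytes(range(256)) * 100
    assert reassemble(stripe(payload, links=3)) == payload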

TeamBoost is fully implemented and is deployed as a prototype. We further present experimental results that show the benefits of harnessing multiple devices, as well as the robustness achieved.

Bio:
Osnat (Ossi) Mokryn is a senior lecturer at the Academic College Tel Aviv Yaffo and a Researcher in the Information System Dept., University of Haifa. Ossi’s research interests are in accelerating content delivery to end users, as well as social media mining and social networks. Currently she serves as a member of the Technical Program Committee for ACM SigComm CoNext 2015. Previously she was a member of the Technical Program Committee of ACM SigComm IMC 2014.

3/6/2015 11:30 CS 337

Processing Real-time Data Streams on GPU-Based Systems

Uri Verner

Technion

Real-time stream processing of Big Data is in increasing demand in modern data centers. There, a continuous torrent of data, created by streaming data sources such as social networks, video streams, and financial markets, is processed and analyzed to produce valuable insights about its content, and, in some cases, the value of these insights has an expiration date. For example, in a silicon-wafer production inspection system, dozens of high-resolution cameras scan the wafer and generate high-rate streams of images, in which defects as small as a pixel and even smaller are detected using image processing algorithms. The inspection is a computationally intensive task that must adhere to strict processing time limitations in order to keep up with the production line. High-end computing platforms, packed with CPUs and compute accelerators, are used to deal with the large processing demand. Optimizing such systems is a very challenging task because they are heterogeneous in both computation and communication.

In this talk, I will present my PhD research work on the problem of processing multiple data streams with hard processing-latency bounds on multi-GPU compute nodes. The talk will describe the challenges in work distribution and communication scheduling in such a system, and present new methods that achieve higher utilization of the resources, while guaranteeing hard real-time compliance. The generalization of the previously discussed problems leads to an important new class of scheduling problems where processors affect each other’s speed. To demonstrate its usefulness, a problem from this class is used to develop an efficient new scheduler that minimizes the makespan of compute jobs on Intel CPUs.

Bio:
Uri Verner is a PhD student at the Department of Computer Science in the Technion, under the supervision of Professors Assaf Schuster and Avi Mendelson. His research interests include theoretical and practical aspects of data processing in GPU-based systems, computation and communication scheduling, system modeling and optimization, machine learning, and parallel computing.

This talk doubles as a PhD seminar lecture.

27/5/2015 11:30 EE, Meyer 861

Coded Retransmission in Wireless Networks Via Abstract MDPs

Mark Shifrin

Ben Gurion University

We consider a transmission scheme with a single transmitter and multiple receivers over a faulty broadcast channel. For each receiver, the transmitter has a unique infinite stream of packets, and its goal is to deliver them at the highest throughput possible. While such multiple-unicast models are unsolved in general, several network-coding-based schemes have been suggested. In such schemes, the transmitter can either send an uncoded packet, or a coded packet which is a function of a few packets. Sent packets can be received by the designated receiver (with some probability) or heard and stored by other receivers. Two functional modes are considered: the first presumes that the storage time is unlimited, while in the second it is limited by a given Time To Live (TTL) parameter. We model the transmission process as an infinite-horizon Markov Decision Process (MDP). Since the large state space renders exact solutions computationally impractical, we introduce policy-restricted and induced MDPs with significantly reduced state space, which, with a properly chosen reward, have the same optimal value function. We then derive a reinforcement learning algorithm which approximates the optimal strategy and significantly improves over uncoded schemes. The algorithm adapts to the packet loss rates, unknown in advance, attains high gain over the uncoded setup, and is comparable with the upper bound by Wang, derived for a much stronger coding scheme.
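
A minimal worked example of the coding opportunity such schemes exploit: if receiver R1 is missing packet p1 but overheard p2, while R2 is missing p2 but overheard p1, a single coded broadcast of p1 XOR p2 lets both decode, halving the retransmission cost. A sketch:

    def xor(a: bytes, b: bytes) -> bytes:
        return bytes(x ^ y for x, y in zip(a, b))

    p1, p2 = b"packet-one", b"packet-two"
    coded = xor(p1, p2)        # one broadcast instead of two retransmissions

    # R1 overheard p2, so it recovers p1; R2 overheard p1, so it recovers p2.
    assert xor(coded, p2) == p1
    assert xor(coded, p1) == p2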

Bio:
Mark Shifrin received his Ph.D. from the EE department at the Technion in 2014. He is a recipient of the Kreitman postdoctoral fellowship at Ben-Gurion University, where he currently pursues topics in wireless communication using stochastic control methods, in the Department of Communication Systems Engineering.

25/5/2015 11:30 CS 337

Thunderstrike: EFI firmware bootkits for Apple MacBooks

Trammell Hudson

Two Sigma

In this presentation we demonstrate the installation of persistent firmware modifications into the EFI boot ROM of Apple’s popular MacBooks. The bootkit can be easily installed by an evil-maid via the externally accessible Thunderbolt ports and can survive reinstallation of OSX as well as hard drive replacements. Once installed, it can prevent software attempts to remove it and could spread virally across air-gaps by infecting additional Thunderbolt devices.

Bio:
Trammell Hudson works at Two Sigma Investments on security, networking and distributed computation projects. Prior to coming to New York, he worked for many years at Sandia National Labs on message passing and operating systems for Top500 parallel supercomputers. More info:
http://twosigma.com and https://trmm.net/

20/5/2015 11:30 EE, Meyer 1007

Ownership-Aware Software-Defined Backhauls in Next-Generation Cellular Networks

Francesco Malandrino

Hebrew University

Future cellular networks will be owned by multiple parties, e.g., two mobile operators, each of which controls some elements of the access and backhaul infrastructure. In this context, it is important that as much traffic as possible is processed by the same party that generates it, i.e., that the coupling between traffic and network ownership is maximized.

Software-defined backhaul networks can attain this goal; however, existing management schemes ignore ownership altogether. We fill this gap by presenting an ownership-aware network management scheme. Our simulations show that its performance is very close to the optimum, and substantially higher than that of state-of-the-art alternatives.

Bio:
Francesco Malandrino earned his Ph.D. from Politecnico di Torino, Italy and is currently a Fibonacci Fellow at the Hebrew University of Jerusalem. His research interests mainly focus on wireless networks with infrastructure, especially next-generation cellular networks.

13/5/2015 11:30 EE, Meyer 861

Decentralizing Authorities into Scalable Strongest-Link Cothorities

Bryan Ford

Yale University

Online infrastructure often depends on security-critical authorities such as logging, time, and certificate services. Authorities, however, are vulnerable to the compromise of one or a few centralized hosts yielding “weakest-link” security. We propose collective authorities or cothorities, an architecture enabling thousands of participants to witness, validate, and co-sign an authority’s public actions, with moderate delays and costs. Hosts comprising a cothority form an efficient communication tree, in which each host validates log entries proposed by the root, and contributes to collective log-entry signatures. These collective signatures are small and efficient to verify, while embodying “strongest-link” trust aggregated over the collective. We present and evaluate a prototype cothority implementation supporting logging, time stamping, and public randomness (lottery) functions. We find that cothorities can scale to support over 4000 widely-distributed participants while keeping collective signing latencies to within a few seconds.

Bio:
Bryan Ford currently leads the Decentralized/Distributed Systems (DeDiS) research group at Yale University, but will be moving to EPFL in Lausanne, Switzerland in July 2015. Ford’s work focuses broadly on building secure systems, touching on many particular topics including secure and certified OS kernels, parallel and distributed computing, privacy-preserving technologies, and Internet architecture. He has received the Jay Lepreau Best Paper Award at OSDI, and multiple grants from NSF, DARPA, and ONR, including the NSF CAREER award. His pedagogical achievements include PIOS, the first OS course framework leading students through development of a working, native multiprocessor OS kernel. Prof. Ford earned his B.S. at the University of Utah and his Ph.D. at MIT, while researching topics including mobile device naming and routing, virtualization, microkernel architectures, and touching on programming languages and formal methods.

06/5/2015 11:30 EE, Meyer 861

Secure Network Coding Schemes for Wireless and Data Storage Networks

Alex Sprintson

TAMU

Weakly secure codes aim to hide information about individual packets as well as small groups of packets from an eavesdropper that can intercept wireless transmissions or has access to a small number of storage nodes. Such codes are a practical alternative to traditional information-theoretic schemes that hide information about the entire set of files from the eavesdropper. The weakly secure codes do not use random keys, and as a result have better performance and lower overhead than the traditional schemes.

The talk will include two parts. First, we will present an algorithm for constructing weakly secure codes that enable clients to exchange data over a shared broadcast channel in the presence of an eavesdropper. We show that this problem has many interesting reformulations, such as designing constrained generator matrices of MDS codes, and that it leads to interesting conjectures in algebraic geometry and abstract algebra. Second, we present an explicit construction of a coset-coding-based outer code to enhance the weak security properties of a regeneration code, i.e., a code which optimizes the repair bandwidth in a distributed storage system.

Bio:
Dr. Sprintson is an Associate Professor in the Department of Electrical and Computer Engineering, Texas A&M University, College Station. From 2003 to 2005, he was a Postdoctoral Research Fellow at the California Institute of Technology, Pasadena. His research interests lie in the general area of communication networks with a focus on network coding and software defined networks. Dr. Sprintson received the Wolf Award for Distinguished Ph.D. students, the Viterbi Postdoctoral Fellowship, and the NSF CAREER award. Currently, he serves as an associate editor of the IEEE Transactions on Wireless Communications. He has been a member of the Technical Program Committee for IEEE Infocom 2006-2016.

29/4/2015 11:30 CS, Taub 9 classroom

Scaling Concurrent Log-Structured Data Stores

Dr. Eshcar Hillel

Yahoo Labs

Log-structured data stores (LSM-DSs) are widely accepted as the state-of-the-art implementation of key-value stores. They replace random disk writes with sequential I/O, by accumulating large batches of updates in an in-memory data structure and merging it with the on-disk store in the background. While LSM-DS implementations proved to be highly successful at masking the I/O bottleneck, scaling them up on multicore CPUs remains a challenge. This is nontrivial due to their often rich APIs, as well as the need to coordinate the RAM access with the background I/O.
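
To make the mechanism concrete, here is a deliberately tiny, single-threaded model of the LSM write and read paths described above (a schematic illustration, not LevelDB's or cLSM's actual design): updates go to an in-memory memtable; when it fills, it is flushed as a sorted, immutable segment; reads consult the memtable first, then segments from newest to oldest.

    import bisect

    class TinyLSM:
        def __init__(self, memtable_limit=4):
            self.memtable, self.limit, self.segments = {}, memtable_limit, []

        def put(self, key, value):
            self.memtable[key] = value
            if len(self.memtable) >= self.limit:      # flush: sequential I/O
                self.segments.append(sorted(self.memtable.items()))
                self.memtable = {}

        def get(self, key):
            if key in self.memtable:
                return self.memtable[key]
            for seg in reversed(self.segments):       # newest segment first
                i = bisect.bisect_left(seg, (key,))
                if i < len(seg) and seg[i][0] == key:
                    return seg[i][1]
            return None

    db = TinyLSM()
    for k in "abcdefgh":
        db.put(k, k.upper())
    assert db.get("c") == "C"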

We present cLSM, an algorithm for scalable concurrency in LSM-DSs, which exploits multiprocessor-friendly data structures and non-blocking synchronization. cLSM supports a rich API, including consistent snapshot scans and general non-blocking read-modify-write operations. We implement cLSM based on the popular LevelDB key-value store, and evaluate it using intensive synthetic workloads as well as ones from production web-serving applications. Our algorithm outperforms state-of-the-art LSM-DS implementations, improving throughput by 1.5x to 2.5x. Moreover, cLSM demonstrates superior scalability with the number of cores (successfully exploiting twice as many cores as the competition).

This is a joint work with Guy Golan, Eddie Bortnikov (Yahoo Labs) and Idit Keidar (Technion, Yahoo Labs), presented at EuroSys 2015.

Bio:
Eshcar Hillel is a Research Scientist at Yahoo Labs, currently working in the Scalable Search Systems Research team. Eshcar received the B.Sc. degree in Software Engineering and the Ph.D. degree in Computer Science from the Technion, in 2002 and 2011, respectively. Prior to her graduate studies she worked at Hewlett-Packard as part of the Non-Stop relational database system team.

27/4/2015 11:30 EE, 861

Catapulting beyond Moore’s Law: Using FPGAs to Accelerate Data Centers

Derek Chiou

Microsoft

In this talk, I will describe a joint Microsoft Research and Bing project to study the possibility of using field programmable gate arrays to accelerate cloud applications. We developed an FPGA card that plugs into a Microsoft designed cloud server and used it to accelerate a significant portion of Bing’s search engine. Because the application does not fit into a single FPGA, we spread the application across multiple FPGAs spread across multiple servers and connected together with a low latency network also implemented by the FPGA. Our 1,632-server pilot demonstrated a factor-of-two performance improvement over pure software with significant power savings. FPGAs will accelerate Bing searches in one data center starting in 2015.

Bio:
Derek Chiou is a Principal Architect at Microsoft where he co-leads a team working on FPGAs for data center applications and an associate professor at The University of Texas at Austin. His research areas are FPGA acceleration, high performance computer simulation, rapid system design, computer architecture, parallel computing, Internet router architecture, and network processors. Before going to UT, Dr. Chiou was a system architect at Avici Systems, a manufacturer of terabit core routers. Dr. Chiou received his Ph.D., S.M. and S.B. degrees in Electrical Engineering and Computer Science from MIT.

15/4/2015 11:30 CS, 337

How code regularity affects code complexity

Prof. Dror Feitelson

Hebrew University

It is naturally easier to comprehend simple code relative to complicated code. We suggest that code regularity – where the same structures are repeated time after time – may significantly reduce complexity, because once one figures out the basic repeated element it is easier to understand additional instances. We demonstrate this by controlled experiments where subjects perform cognitive tasks on different versions of the same basic function. The results also suggest a context-sensitive model of complexity, where syntactic constructs are weighted by their place in the sequence.

Joint work with Ahmad Jbara.

Bio:
Dror Feitelson is a professor of computer science at the Hebrew University of Jerusalem, Israel. His research interests include software evolution and program comprehension, especially what makes software hard to understand. In addition he has worked on parallel job scheduling and experimental performance evaluation, and maintains the Parallel Workloads Archive. His book on workload modeling has recently been published by Cambridge University Press.

1/4/2015 11:30 CS, 337

Introducing Intel Software Guard Extensions (Intel SGX)

Ittai Anati

Intel

Intel Software Guard Extensions (Intel SGX) is an extension to Intel Architecture designed to increase the security of software. In this approach, rather than attempting to identify and isolate all the malware on the platform, legitimate software can be sealed inside an enclave and protected from attack by malware, irrespective of its privilege level. In the talk I will touch on the building blocks of security, describe the basics of Intel SGX, and show how the components, combined, provide a holistic secure solution.

Bio:
Ittai Anati is a Senior Principal Engineer at Intel Corporation, focusing mainly on topics related to CPU and system security at Intel’s Israel Design Center (IDC) in Haifa. Ittai has a B.Sc. in Electrical Engineering from the Technion, Israel Institute of Technology.

26/3/2015 13:30 CS, Taub 6

Deep learning with NVIDIA GPUs

Jonathan Cohen

NVIDIA Corporation

NVIDIA GPUs are powering a revolution in machine learning. With the rise of deep learning algorithms, in particular deep convolutional neural networks, computers are learning to see, hear, and understand the world around us in ways never before possible. Image recognition and detection systems are getting close to and in some cases surpassing human-level performance. I will talk about deep learning in the context of several new NVIDIA initiatives ranging from hardware platforms, software tools and libraries, and our recently announced DRIVE PX module for autonomous driving.

Bio:
Jonathan Cohen is Director of Engineering for NVIDIA’s GPU-accelerated deep learning software platform. Before moving to the product side of NVIDIA, Mr. Cohen spent three years as a senior research scientist with NVIDIA Research developing scientific computing and real-time physical simulation applications on NVIDIA’s massively parallel GPUs.

25/3/2015 11:30 CS, Taub 6

Variability SmartBalance: A Sensing-Driven Linux Load Balancer for Energy Efficiency of Heterogeneous MPSoCs

Prof. Alex Nicolau

University of California, Irvine

A short overview of the NSF Variability Expedition will be given, followed by an overview of a particular result: SmartBalance.

Due to increased demand for higher performance and better energy efficiency, MPSoCs are deploying heterogeneous architectures with architecturally differentiated core types. However, the traditional Linux-based operating system is unable to exploit this heterogeneity since existing kernel load balancing and scheduling approaches lack support for aggressively heterogeneous architectural configurations (e.g. beyond two core types). In this paper we present SmartBalance: a sensing-driven closed-loop load balancer for aggressively heterogeneous MPSoCs that performs load balancing using a sense-predict-balance paradigm. SmartBalance can efficiently manage the chip resources while opportunistically exploiting the workload variations and performance-power trade-offs of different core types.
When compared to the standard vanilla Linux kernel load balancer, our per-thread and per-core performance-power-aware scheme shows an improvement in energy efficiency (throughput/Watt) of over 50% for benchmarks from the PARSEC benchmark suite executing on a heterogeneous MPSoC with 4 different core types, and of over 20% w.r.t. ARM’s state-of-the-art global task scheduling (GTS) scheme for an octa-core big.LITTLE architecture.

Bio:
Alex Nicolau received his Ph.D. in Computer Science from Yale University in 1984, and served on the faculty of the computer science department at Cornell University until 1988. That year he joined the University of California, Irvine as an associate professor; he has served there as a full professor since 1992 and as department chair since 2013.

The author of over 300 conference and journal articles and many books, Alex chaired numerous international conferences (e.g., ACM International Supercomputing Conference (ICS) and Principles and Practice of Parallel Programming) and is editor in chief of the International Journal of Parallel Programming, the oldest journal in that field. He is also an IEEE Fellow.

18/3/2015 11:30 EE, 1007

Random Graph Models for Random Key Predistribution in WSNs

Prof. Armand M. Makowski

University of Maryland

I will start with a quick background on wireless sensor networks (WSNs), and some of the security challenges created by their unique features. Random key pre-distribution schemes address some of the difficulties by randomly assigning secure keys to sensor nodes prior to network deployment.

The discussion will focus on two very different schemes, namely the scheme of Eschenauer and Gligor, and the random pairwise scheme of Chan et al. Each of these schemes induces a non-standard graph model which differs from the classical binomial random graphs of Erdos and Renyi. In each case the quest for “secure connectivity” (under so-called full visibility) will be explored through zero-one laws for graph connectivity in the corresponding random graph model. Comparison with similar results for Erdos-Renyi graphs will be made. If time permits we will also discuss the partial visibility case.

This is joint work with former student Osman Yagan (now at CMU).

Bio:
Armand M. Makowski received the Licence en Sciences Mathematiques from the Universite Libre de Bruxelles in 1975, the M.S. degree in Engineering-Systems Science from U.C.L.A. in 1976 and the Ph.D. degree in Applied Mathematics from the University of Kentucky in 1981. In August 1981, he joined the faculty of the Electrical Engineering Department at the University of Maryland at College Park, where he is presently a Professor of Electrical and Computer Engineering.

His research interests broadly lie in applying advanced methods from the theory of stochastic processes to the modeling, design and performance evaluation of a variety of engineering systems, with particular emphasis on communication systems and networks. He is an IEEE Fellow.

18/2/2015 11:30 EE, 861

Advance Reservation Games

Prof. David Starobinski

Boston University

Advance reservation (AR) services form a pillar of many branches of the economy, including transportation, lodging, dining, and, more recently, cloud computing. In this work, we use game theory to analyze an AR system in which customers differ in their lead (arrival) times and the number of customers requesting service is random. Based on statistical information, the customers decide whether or not to make an advance reservation of server resources for a fee. We prove that only two types of Nash equilibria are possible: either none of the customers makes AR, or only customers with lead time greater than some threshold make AR. Our analysis further shows that the fee that maximizes the provider’s profit may lead to other equilibria, one of which yields zero profit. In order to avoid ending up with no profit, the provider can elect to advertise a lower fee yielding a guaranteed, but smaller, profit. We refer to the ratio of the maximum potential profit to the maximum guaranteed profit as the price of conservatism. We prove that in the single-server case the price of conservatism equals one, but it can be arbitrarily high in a many-server system. Finally, we analyze the dynamics of AR games based on two learning models: action-learning and strategy-learning. We show that if the provider is risk-averse then action-learning yields higher profit, while if the provider is risk-taking then action-learning yields zero profit, in contrast to strategy-learning.

Bio:
David Starobinski is a Professor of Electrical and Computer Engineering at Boston University, with a joint appointment in the Division of Systems Engineering. He is also a Faculty Fellow in the US DoT Volpe National Transportation Systems Center. He received his Ph.D. in Electrical Engineering from the Technion-Israel Institute of Technology, in 1999. In 1999-2000, he was a visiting post-doctoral researcher in the EECS department at UC Berkeley. In 2007-2008, he was an invited Professor at EPFL (Switzerland). Dr. Starobinski received a CAREER award from the U.S. National Science Foundation (2002), an Early Career Principal Investigator (ECPI) award from the U.S. Department of Energy (2004), the best paper award at the WiOpt 2010 conference, and the 2010 BU ECE Faculty Teaching Award. His research interests are in the modeling, performance evaluation, and security of communication networks.

28/1/2015 11:30 EE, 1061

Designing Extremely Efficient Computers

Shahar Kvatinsky

Stanford

For decades technology scaling has decreased the cost of computing to the point where it can be included in almost anything. We have become accustomed to computing becoming faster, cheaper, and lower power, so we simply assume it will continue. While scaling has never been easy, a number of factors have made scaling increasingly difficult this past decade, and have caused power to become the principal constraint on performance. To continue scaling, new technologies (e.g., memristors, carbon nanotubes, resistive memories, magnetic memories) are considered these days as replacements for CMOS technology.

In this talk, I show that the way to achieve high energy efficiency is by designing hardware that is tightly matched with the application. This hardware can exploit novel characteristics of new technologies to complement CMOS technology, rather than completely replace CMOS.
I show how image processing algorithms that are used in many mobile devices share similar behavior that can be exploited to increase hardware efficiency, and how memory technologies can also perform logic operations, enabling non-von Neumann computers and tight integration with CMOS technology.
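
One well-known instance of memory cells that also compute, from the stateful-logic literature, is memristive IMPLY logic: a voltage pair applied to memristors p and s overwrites s with p IMPLY s = (NOT p) OR s. Starting from s initialized to 0, two IMPLY steps realize NAND, a universal gate. A truth-table check in Python (modeling the Boolean behavior only, not device physics):

    def imply(p: int, s: int) -> int:
        """Material implication p -> s, the primitive of memristive stateful logic."""
        return int((not p) or s)

    for p in (0, 1):
        for q in (0, 1):
            s = 0                 # working memristor initialised to logic 0
            s = imply(p, s)       # s = NOT p
            s = imply(q, s)       # s = (NOT q) OR (NOT p) = NAND(p, q)
            assert s == int(not (p and q))
    print("two IMPLY steps realise NAND")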

Bio:
Shahar Kvatinsky received the B.Sc. degree in computer engineering and applied physics and an MBA degree in 2009 and 2010, respectively, both from the Hebrew University of Jerusalem, and the Ph.D. degree in electrical engineering from the Technion – Israel Institute of Technology in 2014. From 2006 to 2009 he was with Intel as a circuit designer. He is currently a post-doctoral research fellow at Stanford University. His current research is focused on circuits and architectures with emerging memory technologies and design of energy efficient architectures.

26/1/2015 13:30 CS, 337

Towards Automated Concurrent Memory Reclamation

Alex Matveev

MIT

The concurrent memory reclamation problem is that of devising techniques to allow a deallocating thread to verify that no other concurrent threads, in particular ones executing read-only traversals, have a reference to the block being deallocated. To date, there is no satisfactory solution to this problem: existing tracking methods like hazard pointers, reference counters, or epoch-based RCU are either prohibitively expensive or require significant programming expertise, to the extent that using them is worthy of a conference paper. None of the existing techniques are automatic or even semi-automated.

Our research project takes a new approach to concurrent memory reclamation: instead of manually tracking access to memory locations as done in techniques like hazard pointers, or restricting reference accesses to specific code boundaries as in RCU, we plan to use the operating system and modern hardware’s transactional memory tracking mechanisms to devise ways to automatically detect which memory locations are being accessed. This will allow us to design and implement a new class of automated concurrent memory reclamation frameworks, making them relatively easy to use and allowing them to scale, so that the design of such structures can become more accessible to the non-expert programmer.
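
For contrast with the automated approach proposed here, the sketch below shows the bookkeeping that a manual scheme in the epoch-based (RCU-style) family imposes on the programmer. It is a schematic, sequential model of the idea, omitting the per-thread synchronization and memory fences a real implementation needs:

    class EpochReclaimer:
        """Schematic epoch-based reclamation: a block retired in epoch e
        may be freed only once every thread has announced a newer epoch."""
        def __init__(self, n_threads):
            self.global_epoch = 0
            self.thread_epoch = [0] * n_threads    # epoch each thread last entered
            self.retired = []                      # (epoch, block) pairs

        def enter(self, tid):                      # called at the start of each operation
            self.thread_epoch[tid] = self.global_epoch

        def retire(self, block):                   # block already unlinked from the structure
            self.retired.append((self.global_epoch, block))

        def try_reclaim(self):
            self.global_epoch += 1
            horizon = min(self.thread_epoch)       # oldest epoch still in use
            freed = [b for e, b in self.retired if e < horizon]
            self.retired = [(e, b) for e, b in self.retired if e >= horizon]
            return freed

    r = EpochReclaimer(n_threads=2)
    r.enter(0); r.enter(1)
    r.retire("node-A")                 # unlinked, but maybe still referenced
    assert r.try_reclaim() == []       # neither thread has moved past the epoch
    r.enter(0); r.enter(1)
    assert r.try_reclaim() == ["node-A"]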

Bio:
Alex Matveev is a Postdoctoral Associate at MIT Computer Science and Artificial Intelligence Laboratory, working as part of the Multicore Algorithmics group. His main interests are practical and efficient synchronization techniques for multi-core systems. In particular, he works on hardware and software transactional memory, concurrent memory reclamation, and operating system support that can provide new multi-core programming paradigms.

21/1/2015 11:30 CS, 337

Accurate Summation of Floating-Point Numbers on GPUs

Uri Verner

Computer Science, Technion

Two problems with parallel summation of floating-point numbers on GPUs are loss of precision and non-reproducible results. The precision loss is due to round-off error propagation, and the lack of bit-accurate consistency across platforms and setups can be attributed to the non-associative nature of floating-point addition.

To address these problems, we implemented a new method for efficient bit-accurate parallel summation of double-precision numbers on GPUs. This method provides the summation result with full precision, i.e., without round-off error. Thus, it provides the same result on all architectures and execution setups. We see two main uses for this method: (1) algorithms that benefit from extended precision, such as iterative linear solvers (QUDA, AMGX), and (2) settings where reproducible results are required, such as cross-platform libraries, tuning of execution parameters with result validation, etc.
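
The talk's GPU method computes the exact sum; as a CPU-side illustration of both the problem and two standard mitigations, compare naive summation, Kahan–Babuška (Neumaier) compensated summation, and Python's math.fsum, which returns the correctly rounded exact sum via Shewchuk's algorithm:

    import math

    def neumaier_sum(xs):
        """Kahan–Babuška compensated summation; unlike plain Kahan,
        it keeps the correction even when an addend dwarfs the running sum."""
        s, c = 0.0, 0.0
        for x in xs:
            t = s + x
            if abs(s) >= abs(x):
                c += (s - t) + x      # low-order bits of x lost in t
            else:
                c += (x - t) + s      # low-order bits of s lost in t
            s = t
        return s + c

    xs = [1e16, 1.0, -1e16, 1.0] * 1000   # exact sum is 2000
    print(sum(xs))            # naive summation loses almost all the 1.0s
    print(neumaier_sum(xs))   # recovers 2000.0 here, though not guaranteed exact
    print(math.fsum(xs))      # correctly rounded exact sum: 2000.0, in any order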

Bio:
Uri Verner is a PhD student at the Department of Computer Science in the Technion, where he is advised by Professors Assaf Schuster and Avi Mendelson. His research interests include theoretical and practical aspects of real-time data-stream processing in GPU-based systems, and his publications on the subject address computation and communication scheduling. After completing his BSc studies, Uri continued towards a PhD in the direct track in the same institution.

Uri interned at NVIDIA during Summer 2014, where he initiated the work presented in this talk.

14/1/2015 13:30 CS, 337

Designing Technology for Blind People

Dr. Shiri Azenkot

Technion-Cornell

Blind people face challenges when using mainstream technology that relies heavily on visual feedback. Consider your mobile phone as an example. How would you interact with it without looking at the screen? In my talk, I’ll present the state-of-the-art in access technology and discuss social issues involved in its design. I’ll then discuss several of my research projects, including new methods and studies that explore how blind people can enter text efficiently on mobile devices. This work covers approaches for text entry using on-screen gestures (taps and swipes) and speech for input.

Bio:
Shiri Azenkot is an Assistant Professor at the Jacobs Technion-Cornell Institute at Cornell Tech. Her research is in human-computer interaction and accessibility. In summer 2014, she received a PhD in computer science from the University of Washington, where she was advised by Dr. Richard Ladner and Dr. Jacob Wobbrock. Shiri received the University of Washington Graduate School Medal, an NSF Graduate Research Fellowship, and an AT&T Graduate Research Fellowship. She holds a BA from Pomona College and an MS from the University of Washington, both in computer science.

7/1/2015 13:00 CS, Taub 4

Write Once, Get 50% Free: Saving SSD Erase Costs Using WOM Codes

Dr. Gala Yadgar

CS Technion

NAND flash, used in modern SSDs, is a write-once medium, where each memory cell must be erased prior to writing. The lifetime of an SSD is limited by the number of erasures allowed on each cell. Thus, minimizing erasures is a key objective in SSD design. A promising approach to eliminate erasures and extend SSD lifetime is to use Write Once Memory (WOM) codes, designed to accommodate additional writes on write-once media. However, these codes inflate the physically stored data by at least 29%, and require an extra read operation before each additional write. This reduces the available capacity and I/O performance of the storage device, so far preventing the adoption of these codes in SSD design. We present Reusable SSD, in which invalid pages are reused for additional writes, without modifying the drive’s exported storage capacity or page size. Only data written as a second write is inflated, and the required additional storage is provided by the SSD’s inherent overprovisioning space. By prefetching invalid data and parallelizing second writes between planes, our design achieves latency equivalent to a regular write. This is joint work with Eitan Yaakobi and Assaf Schuster. The full paper will appear in FAST ’15.
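
For intuition about WOM codes, the classic Rivest–Shamir construction writes 2 bits into 3 cells twice before any erase, using only 0-to-1 cell transitions; this is also where the capacity inflation comes from (3 cells now hold 2 logical bits per write). A sketch that encodes both generations and verifies that second writes never reset a cell:

    FIRST  = {0b00: (0, 0, 0), 0b01: (0, 0, 1), 0b10: (0, 1, 0), 0b11: (1, 0, 0)}
    SECOND = {v: tuple(1 - b for b in cw) for v, cw in FIRST.items()}

    def decode(cells):
        # First-generation codewords have weight <= 1, second-generation >= 2.
        table = FIRST if sum(cells) <= 1 else SECOND
        return next(v for v, cw in table.items() if cw == cells)

    # If the new value equals the current one, the cells are simply left unchanged;
    # otherwise the second write only flips cells 0 -> 1 (no erase needed).
    for v1, cw1 in FIRST.items():
        assert decode(cw1) == v1
        for v2, cw2 in SECOND.items():
            if v1 != v2:
                assert all(b2 >= b1 for b1, b2 in zip(cw1, cw2))
                assert decode(cw2) == v2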

Bio:
Gala Yadgar received her Ph.D. from the Computer Science Department at the Technion in 2012, under the supervision of Prof. Assaf Schuster and Dr. Michael Factor (IBM Research, Haifa). She is now a research associate at the Computer Science Department at the Technion.

4/1/2015 11:30 EE, Meyer 1007

Pretty Bad Privacy: Pitfalls of DNS Encryption

Dr. Haya Shulman

Technische Universitat Darmstadt

As awareness of Domain Name System (DNS) privacy is increasing, a number of mechanisms for encryption of DNS packets have been proposed. We study the prominent defences, focusing on their privacy guarantees, interoperability with the DNS infrastructure, and efficiency overhead. In particular:

– We explore dependencies in DNS and show techniques that utilise DNS-specific side-channel leaks to infer information about the target domain in an encrypted DNS packet.

– We examine common DNS servers configurations and show that the proposals are expected to encounter deployment obstacles with (at least) 38% of 50K-top Alexa domains and (at least) 12% of the top-level domains (TLDs), and will disrupt the DNS functionality and availability for clients. We also show the implication of these configurations on adoption of DNSSEC.

– We show that due to the non-interoperability with the caches, the proposals for end-to-end encryption may have a prohibitive traffic overhead on the name servers.

Our work indicates that further study may be required to adjust the proposals to stand up to their security guarantees, and to make them suitable for the common servers’ configurations in the DNS infrastructure.

Bio:
Haya Shulman is a Claude Shannon network and systems security research group leader at Technische Universitat Darmstadt. Before that she was a postdoctoral researcher also at Technische Universitat Darmstadt. Haya conducted her Ph.D. at the Department of Computer Science, Bar Ilan University, Israel, in the network and cyber security group headed by Prof. Dr. Amir Herzberg. In 2011 and 2013 she received the ‘Checkpoint Institute for Information Security (CPIIS)’ awards, in 2013 she received the Feder prize for her research in communication technologies and an ICANN research fellowship. In 2014 Haya was awarded the Bar-Ilan university Rector prize for her achievements in research, and in 2015 she was awarded an IETF/IRTF Applied Networking Research Prize.

31/12/2014 11:30 EE, Meyer 861

NAND Flash Architectures Reducing Write Amplification through Multi-Write Codes

Saher Odeh

Technion

Multi-write codes hold great promise to reduce write-amplification in flash-based storage devices. In this talk we propose two novel mapping architectures that show clear advantage over known schemes using multi-write codes, and over schemes not using such codes. To evaluate the performance gain we use industry-accepted benchmark traces, as well as synthetically-generated workloads with time locality. The results show write-amplification savings of double-digit percentages, for as low as 10% over-provisioning. In addition, we discuss an analytic model that accurately predicts the write-amplification given parameters describing both the device and the workload.

Bio:
Saher Odeh finished his BSc degree at the Computer Science faculty of the Technion, and is now completing his MSc degree at the Electrical Engineering faculty of the Technion (advised by Prof. Yuval Cassuto). In parallel, he is working at the Intel Corporation as a back-end integration systems engineer. His areas of interest are storage and distributed systems architecture.

22/12/2014 13:30 CS, Taub 337

Crowdsourcing a Meeting of Minds

Dr. Michael Bernstein

Stanford University

Crowdsourcing is an increasingly powerful method for combining amateurs’ efforts to recreate an expert’s abilities. However, across domains from design to engineering to art, few goals are truly the effort of just one person – even one expert. If we can now crowdsource simple tasks such as image labeling, how might we coordinate many peoples’ abilities toward far more complex and interdependent goals? In this talk, I present computational systems for gathering and guiding crowds of experts — including professional programmers, designers, singers and artists. The resulting collectives tackle problems modularly and at scale, dynamically grow and shrink depending on task demands, and combine into larger organizations. I’ll demonstrate how these expert crowds, which we call Flash Teams, can pursue goals such as designing new user experiences overnight and producing animated shorts in two days.

Bio:
Michael Bernstein is an Assistant Professor of Computer Science at Stanford University, where he co-directs the Human-Computer Interaction group and is a Robert N. Noyce Family Faculty Scholar. His research in human-computer interaction focuses on the design of crowdsourcing and social computing systems. This work has received Best Paper awards and nominations at premier venues in human-computer interaction and social computing (ACM UIST, ACM CHI, ACM CSCW, AAAI ISWSM). Michael has been recognized with the NSF CAREER award, as well as the George M. Sprowls Award for best doctoral thesis in Computer Science at MIT. He holds Ph.D. and M.S. degrees in Computer Science from MIT, and a B.S. in Symbolic Systems from Stanford University.

03/12/2014 CS, Taub 337

Making machine learning accessible with GraphLab

Dr. Danny Bickson

GraphLab

GraphLab started as a research project at Carnegie Mellon University, where our main goal was implementing machine learning methods at large scale (and writing papers about it!). Despite thousands of users of our open source software (and many papers written), the experience of setting up and using a distributed system was quite frustrating, and our target audience was mainly double PhDs with a background in machine learning as well as graph theory and distributed systems.

When forming a company around GraphLab, we decided to switch our focus and make machine learning much more accessible to the masses. In this talk I will summarize our experience in this direction. I will describe data structures we found useful for many commonly repeating computation patterns in machine learning, including support for dense data, graph (sparse) data, text records and even images. I will describe how we tackle the problem of wrapping up machine learning algorithms in a way which is user-friendly to non-experts, while algorithmic details can still be tweaked later. During the talk I will show many demos of cool applications that run on GraphLab.

About GraphLab: GraphLab is a popular open source big data analytics project initiated in 2009 at Carnegie Mellon University. GraphLab implements numerous machine learning algorithms efficiently on multicore machines and over clusters. A year and a half ago we formed a company in Seattle to back the open source project, and have grown to a machine learning research group of around 12 PhDs. GraphLab is free for academic use, and has a major open source component.

Bio:
Danny holds a PhD in distributed systems from the Hebrew University (advised by Prof. Danny Dolev and Prof. Dahlia Malkhi), and was a postdoc at the machine learning department at Carnegie Mellon University (advised by Prof. Carlos Guestrin and Prof. Joe Hellerstein from Berkeley). Danny was involved in the GraphLab project from its early stages and is a co-founder of the GraphLab company. His research interests are distributed algorithms and large-scale machine learning.

26/11/2014 EE, Meyer 1007

On the Scalability of Hop-by-hop Packet Routing

Gabor Retvari

Budapest University

Many of our computer networks, not least the Internet, are built upon hop-by-hop destination-based routing. Here, network devices are equipped with a unique address, and routers use giant lookup tables to forward packets towards the intended destination based on the address encoded in the header. At the moment, it is not clear whether we will be able to scale the hop-by-hop routing paradigm into the future economically, as the memory requirement for storing these giant lookup tables keeps increasing at a fast pace. The main goal of this talk is to highlight some recent results in the research field of routing scalability.

First, we present the fundamental impossibility result of the field, asserting that no routing scheme can achieve better than linearly increasing routing tables in general. We extend this result from shortest-path routing to general routing policies using an algebraic approach and, for the special case of BGP, we state a Separation Theorem that divides scalable BGP routing policies from unscalable ones. We then do a reality check: we show that IP forwarding tables used in Internet routers can be compressed way beyond what the worst-case analysis would suggest. To do so, we systematically squeeze out every single bit of redundancy from the most common forwarding table data structure, the prefix tree. We find that compression in this case not only does not ruin lookup performance but even improves it, so this turns out to be one of those rare cases in computer science when there is no space-time trade-off. Finally, we speculate about the reasons behind this seemingly magical compressibility of IP forwarding tables. We present preliminary analysis suggesting that some form of “heterogeneity” might be a main factor behind compressibility, we attempt to characterize this notion using the concept of certain routing table symmetries, and then we sketch the scalability map of the Internet which we then use to make some bold predictions.
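
To give a flavor of the redundancy being squeezed out of prefix trees: sibling leaves that resolve to the same next hop can be folded into their parent, shrinking the table without changing any lookup result. A toy binary-trie sketch (nowhere near the bit-level techniques in the talk):

    class Node:
        __slots__ = ("child", "nexthop")
        def __init__(self):
            self.child, self.nexthop = [None, None], None

    def insert(root, prefix, plen, nexthop):
        n = root
        for i in range(plen):                        # walk/extend the trie bit by bit
            bit = (prefix >> (31 - i)) & 1
            n.child[bit] = n.child[bit] or Node()
            n = n.child[bit]
        n.nexthop = nexthop

    def compress(n):
        """Bottom-up: fold two sibling leaves carrying the same next hop
        into their parent (only when the parent has no route of its own)."""
        if n is None:
            return None
        n.child = [compress(c) for c in n.child]
        l, r = n.child
        def leaf(x): return x and x.child == [None, None]
        if leaf(l) and leaf(r) and l.nexthop == r.nexthop and n.nexthop is None:
            n.nexthop, n.child = l.nexthop, [None, None]
        return n

    root = Node()
    insert(root, 0x0A000000, 9, "A")    # 10.0.0.0/9   -> A
    insert(root, 0x0A800000, 9, "A")    # 10.128.0.0/9 -> A
    compress(root)                      # folds into one 10.0.0.0/8 -> A entry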

Bio:
Gabor Retvari received the M.Sc. and Ph.D. degrees in electrical engineering from the Budapest University of Technology and Economics (BME). He is now a Senior Research Fellow at the High Speed Networks Laboratory, BME. He has long been passionate about questions of routing scalability and has done plenty of research in this field, using a diverse set of mathematical tools ranging from abstract algebra to computational geometry and, more recently, information theory. He maintains numerous open source scientific tools written in Perl, C and Haskell.

19/11/2014 CS, Taub 337

NFV: Virtualization Meets Telecommunications

TCE Guest Dr. Gabriel Silberman

Dell Research

This talk will introduce the Network Functions Virtualization (NFV) concept. NFV uses IT-based virtualization technologies to create classes of virtualized network functions, or VNFs, to serve as building blocks for complex communication services. NFV, with its ability to leverage virtualization techniques to run on standard servers and even in the Cloud, makes it possible to quickly develop and deploy new products and services on a reliable, flexible and elastic platform.

The origins of NFV can be traced to a white paper published in October of 2012 by a workgroup from the European Telecommunications Standards Institute (ETSI). Since then, the NFV vision has captured the attention of both telecommunication service providers and their suppliers, with its potential for disrupting traditional development cycles and business models for providers and suppliers, as well as enabling a wide range of new applications within easy reach of developers.

NFV is seen by traditional telecommunications providers as a way to face the increasing challenge from newer and nimbler competitors such as Skype, Google Talk, and others. In particular, the use of NFV can accelerate product development and reduce costs by moving away from the long and costly development cycles typical of specialized proprietary systems, as well as creating an ecosystem of standard components for replacing old networks or building new functionality.

Dell Research, under the leadership of Wenjing Chu, has been working on several NFV initiatives and proofs of concept, together with a variety of technology partners and telcos. The most recent example is the High Velocity Cloud (HVC) proof of concept, shown to cut service deployment from months to minutes and capable of supporting a record 214Gb/s bandwidth while still leaving 90% of CPU available for network-intensive workloads, when running on a single Dell R920 server. When considering mobile device traffic, a quarter rack with standard Dell servers, storage and networking equipment is capable of supporting the demands of a medium-size city.

Bio:
Gabriel (Gabby) Silberman is Executive Director of Technology Strategy in Dell’s Research Division, responsible for University Alliances and Technology Outlook. He also holds an adjunct professor position with the Department of Computer Science and Engineering at Texas A&M University. Before joining Dell in 2013, Gabby created CA Labs, the research arm of CA Technologies, and later served as Chief Scientist at Owl Computing Technologies. Previous to this, he led the worldwide expansion of IBM’s Centers for Advanced Studies, and held various research and management positions at IBM Research and IBM Software Group. Before moving to industry, Gabby held a tenured faculty position at the Technion, Israel Institute of Technology, and was visiting faculty at Carnegie Mellon University.

Gabby earned Bachelor of Science and Master of Science degrees in computer science from the Technion, and a Ph.D. in computer science from the State University of New York at Buffalo. Gabby has more than 25 patents in process or awarded, 20 refereed journal publications, and has presented at 50+ international conferences and workshops.

12/11/2014 EE, Meyer 861

NFV – Uniform Handling and Abstraction of Hardware Accelerators

Zvika Bronstein

Toga Networks

Cloud technology is a dynamic and innovative field, which brings with it many changes in the way we perform large-scale computing, supply and consume data, and manage our data online. Recently, Cloud technology has begun to show its impact on the manner in which Service Providers (SPs) manage their networks. Today’s leading SPs have recently announced their goal to move towards Network Function Virtualization (NFV), a major shift which means moving their services onto a large, self-managed distributed Cloud. This move will require major changes in today’s networks, both in terms of how the network functions are developed and deployed, as well as the overall management of the network.

The NFV Infrastructure (NFVI) is composed mainly of standard IT servers and network elements, such as routers and switches. However, some Virtual Network Functions (VNFs) may require some form of hardware acceleration (HWA) to be provided by the NFVI to meet their performance goals. While the standard IT-based NFV architecture meets the performance goals of many virtualization use cases, some use cases are more challenging and require Network-Intensive (NI) or Compute-Intensive (CI) processing.

A unified approach and standard interfaces are proposed to allow multiple HWA solutions to co-exist within the same NFVI and to enable VNF usage of HWA resources. Moreover, this approach will allow the orchestration of VNFs onto HWA technologies for acceleration based on various metrics, such as resource availability and/or utilization. This is essential in order for the VNFs to meet portability requirements. As the VNFs being accelerated depend on and use application programmable interfaces (APIs), in the absence of a common HWA Abstraction Layer (HAL), the VNFs must be hard-coded to the specific API of the associated HWA.

Unified handling and abstraction of hardware accelerators will facilitate total separation in the selection of NFVI and VNF providers. A service provider will be free to choose optimal NFVI components and architecture based on its commercial and functionality requirements, without having to identify a-priori the VNFs and services that will be deployed over the infrastructure.
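
To make the HAL idea concrete, an abstraction layer might expose an interface along the following lines. This is a hypothetical sketch; every name here is invented for illustration and corresponds to no ETSI-standardized API:

    from abc import ABC, abstractmethod

    class HardwareAccelerator(ABC):
        """Hypothetical HAL: VNFs code against this interface, so any
        accelerator backend (FPGA, SmartNIC, GPU...) can be swapped in."""

        @abstractmethod
        def capabilities(self) -> dict:
            """e.g. {'crypto': True, 'throughput_gbps': 40}"""

        @abstractmethod
        def allocate(self, function: str, share: float):
            """Reserve a share of the accelerator for one virtual function."""

        @abstractmethod
        def release(self, handle):
            ...

    def place_vnf(function: str, needs: dict, accelerators: list):
        """Toy orchestration: pick the first backend whose advertised
        capabilities cover the VNF's declared needs."""
        for acc in accelerators:
            caps = acc.capabilities()
            if all(caps.get(k, 0) >= v for k, v in needs.items()):
                return acc.allocate(function, share=1.0)
        raise RuntimeError("no suitable accelerator for " + function)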

Bio:
Zvika Bronstein is the CTO of Future Networks at Toga Networks, responsible for NFV strategy. Since the beginning of the NFV standardization effort, as part of the European Research Center, Zvika has represented Huawei in the ETSI NFV ISG (Industry Standardization Group). Zvika’s focus in NFV has included use cases, performance proofs-of-concept, and hardware acceleration and abstraction.
Before his current position, Zvika held various positions leading development projects in CPUs and data networking. His roles included leading the R&D organization of 3Com in Israel and founding Atrica, a pioneer of Metro Ethernet. Prior to that, Zvika held development positions in Silicon Valley at HP and Daisy Systems. Zvika’s development work included the first microprocessor research project at the Technion, early RISC development, and early switches and routers. Zvika earned a Master of Science degree from Stanford University.

10/11/2014 EE, Meyer 861

Network Functions Virtualization (and other major shifts in networking) – a survey

Dr. Elisha Rosensweig

Alcatel-Lucent

Cloud technology is a dynamic and innovative field, which brings with it many changes in the way we perform large-scale computing, supply and consume data, and manage our data online. Recently, Cloud technology has begun to show its impact on the manner in which Service Providers (SPs) manage their networks. Today’s leading SPs have recently announced their goal to move towards Network Function Virtualization (NFV), a major shift which means moving their services onto a large, self-managed distributed Cloud. This move will require major changes in today’s networks, both in terms of how the network functions are developed and deployed, as well as the overall management of the network.

In this talk we will present a broad survey of the above changes. We will discuss the technological and business motivation for SPs to move to the Cloud, present a few concrete examples for this shift, and survey other related activities in the field of Future Internet Architectures (FIA).

Bio:
Elisha Rosensweig received his PhD from UMass Amherst in 2012, with a thesis focused on Content Oriented Networks. Since then, he has been working on Network Function Virtualization as a developer in CloudBand, Alcatel-Lucent. He has recently also been developing course material for NFV and teaching at Tel-Aviv University.

05/11/2014 CS, Taub 337

[2 topics] Censorship in the Wild, Analyzing Internet Filtering in Syria — AND — Paying for Likes? Understanding Facebook Like Fraud Using Honeypots

Dr. Arik Friedman

NICTA Australia

(1) Censorship in the Wild, Analyzing Internet Filtering in Syria.
Several government authorities worldwide enforce Internet censorship, however, due to the lack of publicly available information and the inherent risks of performing active measurements, it is often hard for the research community to investigate censorship practices in the wild. In this talk I will present the methodology and the results of a measurement analysis of 600GB worth of leaked logs from 7 Blue Coat SG-9000 proxies – deployed in Syria to filter Internet traffic at a country scale. To the best of our knowledge, our work provides the first analytical look into Internet filtering in Syria.
This is joint work with Abdelberi Chaabane, Terence Chen, Mathieu Cunche, Emiliano De Cristofaro and Mohamed Ali Kaafar. (http://arxiv.org/abs/1402.3401)

(2) Paying for Likes? Understanding Facebook Like Fraud Using Honeypots.
Facebook pages offer an easy way to reach out to a very large audience as they can easily be promoted using Facebook’s advertising platform. Recently, the number of likes of a Facebook page has become a measure of its popularity and profitability, and an underground market of services boosting page likes, aka like farms, has emerged. In this talk I will present a comparative measurement study of page likes garnered via Facebook ads and by a few like farms, based on a set of deployed honeypot pages. I will highlight a few interesting findings, including that some farms seem to be operated by bots and do not really try to hide the nature of their operations, while others follow a stealthier approach, mimicking regular users’ behavior.
This is joint work with Emiliano De Cristofaro, Guillaume Jourjoin, Mohamed Ali Kaafar and M. Zubair Shafiq. (http://arxiv.org/abs/1409.2097)

Bio:
Dr. Arik Friedman is a senior researcher in NICTA’s Networks Research Group. He received a PhD from the Technion, Israel Institute of Technology, and an MBA with specialization in Technology and Information Systems from Tel-Aviv University. His research interests include privacy-preserving data mining and computer security. He previously held the position of Program Manager at Microsoft R&D between 2007 and 2011, where he worked on privacy and security products.

29/10/2014 EE, Meyer 1061

Enabling Peer-to-Peer Swarming for Multi-commodity

Prof. Daniel Sadoc Menasche

Federal University of Rio de Janeiro

Peer-to-peer swarming, as used by BitTorrent, is one of the de facto solutions for content dissemination in today’s Internet. By leveraging resources provided by users, peer-to-peer swarming is a simple, scalable and efficient mechanism for content distribution. Although peer-to-peer swarming has been widely studied for a decade, prior work has focused on the dissemination of one commodity (a single file). This work focuses on the multi-commodity case.

We have discovered through measurements that a vast number of publishers currently disseminate multiple files in a single swarm (bundle). The first contribution of this work is a model for content availability. We use the model to show that, when publishers are intermittent, bundling K files increases content availability exponentially as a function of K. When there is a stable publisher, we consider content availability among peers (excluding the publisher). Our second contribution is an estimate of the dependency of peers on the stable publisher, which is useful for provisioning purposes as well as for deciding how to bundle. To this end, we propose a new metric, swarm self-sustainability, and present a model that yields swarm self-sustainability as a function of the file size, popularity and service capacity of peers. Then, we investigate reciprocity and the use of barter that occurs among peers. As our third contribution, we prove that the loss of efficiency due to the download of unrequested content to enforce direct reciprocity, as opposed to indirect reciprocity, is at most two in a class of networks without relays.
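
A stylized back-of-the-envelope version of the exponential-availability effect (our simplification, not the authors' model): if a bundle of K files attracts the request streams of all K files, and each peer lingers roughly K times longer to fetch the whole bundle, then under an M/G/infinity approximation the probability that the swarm is empty when the publisher is offline decays exponentially (here even with a quadratic exponent) in K:

    import math

    lam, s = 0.2, 1.0          # assumed per-file request rate and download time
    for K in range(1, 6):
        arrival = lam * K      # the bundle attracts the requests of all K files
        sojourn = s * K        # each peer stays to fetch the whole bundle
        p_empty = math.exp(-arrival * sojourn)   # M/G/inf: P(no peer online)
        print(f"K={K}: P(swarm empty) = {p_empty:.4f}")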

Bio:
Daniel S. Menasche received his Ph.D. in computer science from the University of Massachusetts at Amherst in 2011. Currently, he is an Assistant Professor in the Computer Science Department at the Federal University of Rio de Janeiro, Brazil. His interests are in modeling, analysis and performance evaluation of computer systems. He was awarded the Yahoo! outstanding synthesis project award at UMass in 2010. He was also co-author of papers that received best paper awards at Globecom’07, CoNEXT’09 and INFOCOM’13.

23/10/2014, CS, Taub 9

System Approach to Distributed Balanced Graph Partitioning

Dr. Gabi Kliot

Microsoft Research

Balanced Graph Partitioning is a hard problem. Doing it at large scale, on graphs of millions of nodes and edges, is even harder. Doing it in a distributed way makes the problem even more challenging. And finally, doing it with linear or even sublinear time and space complexity may sound like pushing the limits too far. In this talk I will present our practical approach to this hard problem, motivated by the systems we build, and describe two algorithms that solve it in two different settings. The first operates on a static graph whose nodes arrive one by one in a streaming fashion. The second operates on a dynamic, constantly changing graph with random access to nodes and edges. I will also describe how those algorithms were used in two different large distributed systems that we built: Horton, a distributed graph database, and Orleans, a distributed actor-based middleware.
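
For a flavor of the streaming setting, the sketch below implements a greedy one-pass partitioner in the spirit of the linear deterministic greedy heuristics from this line of work (a minimal illustration, not the talk's exact algorithm): each arriving node goes to the partition that holds most of its already-placed neighbors, discounted by how full that partition is.

```python
def stream_partition(stream, k, capacity):
    """Greedy streaming partitioner sketch: one pass, O(1) state per node."""
    assignment = {}
    sizes = [0] * k
    for node, neighbors in stream:            # nodes arrive one by one
        def score(p):
            if sizes[p] >= capacity:
                return float("-inf")          # partition is full
            placed = sum(1 for v in neighbors if assignment.get(v) == p)
            return placed * (1.0 - sizes[p] / capacity)  # balance penalty
        best = max(range(k), key=score)
        assignment[node] = best
        sizes[best] += 1
    return assignment

# tiny example: a 6-node graph streamed in node order
stream = [(0, []), (1, [0]), (2, [0, 1]), (3, []), (4, [3]), (5, [3, 4])]
print(stream_partition(stream, k=2, capacity=3))
# -> {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}: the two triangles end up separated
```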

Bio:
Gabriel Kliot obtained his PhD in Computer Science from the Technion in 2009, working on Distributed Systems and Networking. Since then he has been with Microsoft Research realizing his dream of bringing distributed computing to the masses.

YouTube Link

6/10/2014, EE, Meyer 861

A Centralized “Zero-Queue” Network Architecture

Jonathan Perry

MIT, CSAIL

Current datacenter networks inherit the principles that went into the design of the Internet, where packet transmission and path selection decisions are distributed among the endpoints and routers.
Instead, we propose that each sender delegate control to a centralized arbiter that decides when each packet should be transmitted and what path it should follow.

Fastpass is a datacenter network architecture built on this principle. Fastpass incorporates two fast algorithms: the first determines the time at which each packet should be transmitted, while the second determines the path to use for that packet. In addition, Fastpass uses an efficient protocol between the endpoints and the arbiter, and an arbiter replication strategy for fault-tolerant failover. We deployed and evaluated Fastpass in a portion of Facebook’s datacenter network. Our results show that Fastpass achieves throughput comparable to current networks with a 240x reduction in queue lengths, achieves fairer and more consistent flow throughputs than the baseline TCP, is able to schedule 2.21 Terabits/s of traffic in software on eight cores, and reduces the number of TCP retransmissions in a latency-sensitive service at Facebook by 2.5x.
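
A minimal sketch of the arbiter's first task, timeslot allocation, under the simplifying assumption that each machine can send at most one packet and receive at most one packet per timeslot (path selection is omitted). The class and method names are mine, not Fastpass's API.

```python
from collections import defaultdict

class Arbiter:
    """Centralized timeslot allocation sketch: assign each request the
    earliest slots in which both endpoints are still free."""
    def __init__(self):
        self.busy_src = defaultdict(set)   # timeslot -> sources in use
        self.busy_dst = defaultdict(set)   # timeslot -> destinations in use

    def allocate(self, src, dst, n_packets, start_slot):
        slots, t = [], start_slot
        while len(slots) < n_packets:
            if src not in self.busy_src[t] and dst not in self.busy_dst[t]:
                self.busy_src[t].add(src)
                self.busy_dst[t].add(dst)
                slots.append(t)
            t += 1
        return slots

arb = Arbiter()
print(arb.allocate("A", "B", 3, start_slot=0))  # [0, 1, 2]
print(arb.allocate("A", "C", 2, start_slot=0))  # A is busy until slot 3 -> [3, 4]
```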

Bio:
Jonathan received a B.Sc. in Computer Science from Tel-Aviv University in 2003, after which he worked for 7 years in communication systems R&D and HPC algorithm development in an army technological unit.
He is currently a Ph.D. student in MIT CSAIL’s Networks and Mobile Systems group, advised by Hari Balakrishnan and Devavrat Shah.

28/5/2014, EE, Meyer 861

Games, Numeration Systems and Data Structures

Prof. Aviezri S. Fraenkel

Weizmann Institute of Science

A primary aim of combinatorial game theory (CGT) is to formulate tractable winning strategies for games. I wish to sell you the idea that sometimes the only known method to do so is via a judiciously chosen numeration system, analogous to the choice of an appropriate data structure for optimization problems. We will also see that numeration systems can lead to elegant solutions of other problems in mathematics.

Bio:
Prof. Aviezri S. Fraenkel is a graduate of EE at the Technion and received his PhD in mathematics from UCLA. He is affiliated with the Department of Computer Science and Applied Mathematics at the Weizmann Institute of Science, and is the recipient of the Euler Medal (2005), the WEIZAC Medal (2006) and the Israel Prize (2007).

**The Lecture will be followed by a colloquium of Prof. Aviezri S. Fraenkel titled “From Weizac to Responsa (Shut)”**

19/5/2014, CS, Taub 337

A Look at Multiprocessor Real-Time Scheduling in Linux: Generality, Feasibility, and Scalability

Bjorn Brandenburg

Max Planck Institute for Software Systems in Kaiserslautern, Germany

Given the popularity of Linux and the increasing adoption of embedded multicore platforms, Linux and Linux-like RTOSs such as LynxOS and QNX are increasingly used to host real-time workloads on multiprocessors. This naturally exposes new limitations, both in the actual implementation and in the supporting analytical foundations, and has thus been a driving force for a number of recent advances in multiprocessor real-time scheduling.

In this talk, I’m going to present two such results: first, I’m going to discuss how Linux’s notion of processor affinities turned out to be strictly more general than many of the models typically studied in the real-time literature. And second, I’ll discuss the (lack of) scalability of the Linux scheduler on large multicore platforms in terms of average- and worst-case overheads, and present an alternative, much more scalable design based on message passing.

Bio:
Bjorn Brandenburg is a tenure-track research group leader at the Max Planck Institute for Software Systems (MPI-SWS) in Kaiserslautern, Germany. He joined MPI-SWS in 2011 after obtaining his PhD from the University of North Carolina at Chapel Hill, where he worked with Jim Anderson. His research is focused on operating systems for predictable multiprocessor real-time systems and spans from the analytical foundations on the one hand to practical systems and implementation issues on the other hand, with a special focus on real-time locking and scheduling. He is the lead developer and maintainer of LITMUS^RT (http://www.litmus-rt.org), a real-time extension of the Linux kernel that has been in development since 2006.

14/5/2014, 12:40, EE, Meyer 861

Redundancy Elimination in Networked Systems (CE Department Seminar)

Eyal Zohar

Technion, Electrical Engineering

The surge in big files and rich media usage increases the importance of Redundancy Elimination (RE) in network traffic.
Our work explores the design, implementation, and deployment of two different and complementary lossless RE technologies – data
deduplication and compression. We devise and construct scalable deduplication and compression systems for various environments,
and analyze their performance using real data.

In the first part of this talk, we present PACK (Predictive ACKs), a novel end-to-end network-level deduplication (a.k.a. Traffic Redundancy Elimination) system designed for cloud computing cost reduction.

In the second part we present a novel elastic compression framework that adapts quickly to changing load conditions in web-servers.
The new framework responds to changing conditions within seconds, and also mixes compression levels for fine-grained operation.

Ph.D. Seminar.
Advisors: Prof. Yuval Cassuto and Prof. Israel Cidon.

Bio:
Eyal is a PhD student of Electrical Engineering at the Technion under the supervision of Prof. Yuval Cassuto and Prof. Israel Cidon.
He is interested in computer networks, specifically P2P networks, content caching, cloud computing, cellular networks, and redundancy elimination.

14/5/2014, CS, Taub 337

Robust Replication (or how I learned to stop worrying and love failures)

Allen Clement

Max Planck Institute for Software Systems

The choice between Byzantine and crash fault tolerance is viewed as a fundamental design decision when building fault tolerant systems. We show that this dichotomy is not fundamental, and present a unified model of fault tolerance in which the number of tolerated faults of each type is a configuration choice. Additionally, we observe that a single fault is capable of devastating the performance of existing Byzantine fault tolerant replication systems. We argue that fault tolerant systems should, and can, be designed to perform well even when failures occur. In this talk I will expand on these two insights and describe our experience leveraging them to build a generic fault tolerant replication library that provides flexible fault tolerance and robust performance. We use the library to build a (Byzantine) fault tolerant version of the Hadoop Distributed File System.

Bio:
Allen Clement is a Tenure-track faculty member at the Max Planck Institute for Software Systems. He received a Ph.D. from the University of Texas at Austin in 2010 and an A.B. in Computer Science from Princeton University. His research focuses on the challenges of building robust and reliable distributed systems. In particular, he has investigated practical Byzantine fault tolerant replication, systems robust to both Byzantine and selfish behaviors, consistency in geo-replicated environments, and how to leverage the structure of social networks to build Sybil-tolerant systems. When not in the lab he plays way too much ultimate and is coach for the German Mixed National Team.

7/5/2014, CS, Taub 337

The Complexity of Correlated Instances

Irit Dinur

Weizmann Institute of Science

We study the complexity of computational problems when multiple instances are given, and the instances are *correlated*. Such instances arise for example when they are derived from one “underlying source”.

Does this make the problem easier?

We study this question in the domain of constraint satisfaction problems (CSPs). For example: given several CSP instances with solutions that are 99% the same, and for which the constraints are 95% the same, is it easier to find the solutions?

We investigate both the case where we are told which bits differ between the different solutions, and the possibly more interesting case where the differences are unknown.

For several variants of CSPs, we show that significant computational advantage can be gained when solving correlated instances.

We believe that correlation is a natural phenomenon in practice, and exploiting correlation has potentially widespread algorithmic benefits beyond CSPs. Our work suggests exploring correlation as a new dimension for achieving algorithmic goals.

Joint work with Shafi Goldwasser and Huijia Rachel Lin.

Bio:
Irit Dinur is a theoretical computer scientist at the Weizmann Institute. She received her PhD in 2001 from Tel-Aviv University. She recently returned from a two-year sabbatical in Boston, where the above work was done.

30/4/2014, EE, Meyer TBA

Sampling and Inference Problems for Big Data in the Internet and Beyond

Nick Duffield

Rutgers University

Massive graph datasets are used operationally by providers of internet, social network and search services.
Sampling can reduce storage requirements as well as query execution times, while prolonging the useful life of the data for baselining and
retrospective analysis. Here, sampling must mediate between the characteristics of the data, the available resources, and the accuracy needs of queries.
Inference methods can be used to fuse datasets which individually provide only an incomplete view of the system under study.
In this talk we describe some successes in applying these ideas to massive Internet measurements
and some potential new applications to inverse problems in urban informatics, and to bioinformatics.
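
One representative technique from this line of work is priority sampling (Duffield, Lund and Thorup), which keeps a fixed-size weighted sample while still supporting unbiased weight estimates for queries. A minimal sketch:

```python
import random

def priority_sample(weighted_items, k):
    """Give each item priority w/u with u ~ Uniform(0,1]; keep the k
    highest priorities and estimate each kept item's weight as
    max(w, tau), where tau is the (k+1)-st highest priority. The
    resulting weight estimates are unbiased."""
    prioritized = [(w / (1.0 - random.random()), item, w)
                   for item, w in weighted_items]
    prioritized.sort(reverse=True)
    tau = prioritized[k][0] if len(prioritized) > k else 0.0
    return [(item, max(w, tau)) for _, item, w in prioritized[:k]]

flows = [(f"flow{i}", w) for i, w in enumerate([100, 50, 5, 3, 2, 1, 1])]
print(priority_sample(flows, k=3))   # heavy flows are kept almost surely
```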

Bio:
Nick Duffield joined Rutgers University as a Research Professor in 2013. Previously, he worked at AT&T Labs Research,
where he was a Distinguished Member of Technical Staff and an AT&T Fellow, and held faculty positions in Europe.
He works on network and data science, particularly the acquisition, analysis and applications of operational network data.
He was formerly chair of the IETF Working Group on Packet Sampling, and an associate editor of the IEEE/ACM Transactions on Networking.
He is an IEEE Fellow and was a co-recipient of the ACM Sigmetrics Test of Time Award in both 2012 and 2013 for work in network tomography.
He was recently TPC Co-Chair of IFIP Performance 2013, a keynote speaker at the 25th International Teletraffic Congress in Shanghai, China,
and an invited speaker at the workshop on Big Data in the Mathematical Sciences in Warwick, UK.

23/4/2014, EE, Meyer 861

Look-Ahead Clock Gating

Shmuel Wimer

Bar Ilan University

Clock gating is very useful for reducing the power consumed by digital systems. Three gating methods are known: synthesis-based, data-driven and auto-gated FFs (AGFF). We present a novel method called Look-Ahead Clock Gating (LACG), which combines all three. LACG computes the clock enabling signal of each FF one cycle ahead of time, based on the present-cycle data of those FFs on which it depends. It avoids the tight timing constraints of AGFF and data-driven gating by allotting a full clock cycle for the computation of the enabling signals and their propagation. A closed-form model characterizing the power saving per FF is presented. The model implies a breakeven curve, dividing the FF space into two regions of positive and negative gating return on investment. While the majority of the FFs fall in the positive region and hence should be gated, those falling in the negative region should not. Experimentation on industry-scale data showed a 23% reduction in clock power, on top of other gating methods.
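
The following toy simulator (my own illustration, nowhere near a gate-level model) captures the core idea: in each cycle, the next value of every FF is computed from present-cycle data, and any FF whose value would not change has its next clock pulse gated off.

```python
def simulate_lacg(logic, init, n_cycles):
    """Toy flip-flop array: the enable for each FF is decided one cycle
    ahead by checking, from the present cycle's data, whether the FF's
    next value differs from its current one; if not, its clock pulse is
    suppressed and a toggle is saved."""
    state = dict(init)
    gated = 0
    for _ in range(n_cycles):
        nxt = {ff: fn(state) for ff, fn in logic.items()}
        for ff in state:
            if nxt[ff] == state[ff]:
                gated += 1                # clock pulse gated for this FF
        state = nxt
    return state, gated

# 2-bit counter: bit0 toggles every cycle, bit1 toggles when bit0 is 1
logic = {"b0": lambda s: 1 - s["b0"],
         "b1": lambda s: s["b1"] ^ s["b0"]}
print(simulate_lacg(logic, {"b0": 0, "b1": 0}, 8))  # half of b1's pulses gated
```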

Bio:
Shmuel Wimer is an Associate Professor with the Engineering Faculty of Bar-Ilan University and a visiting Associate Professor with the EE Dept. of the Technion. He received an M.Sc. in mathematics from Tel-Aviv University, and a D.Sc. in EE from the Technion. Prior to joining academia in 2009 he worked in industry for 32 years, for Intel, IBM, National Semiconductor and the Israel Aerospace Industries. He is interested in optimization of VLSI circuits and systems and in combinatorial optimization.

8/4/2014, CS, Taub 337

NetFPGA: The Flexible Open-Source Networking Platform

Noa Zilberman

University of Cambridge

The NetFPGA is an open platform enabling researchers and instructors to build high-speed, hardware-accelerated networking systems.
The NetFPGA is the de-facto experimental platform for line-rate implementations of network research and it has a family of boards,
supporting from 1GE to 100GE.

The target audience is not restricted to hardware researchers: the NetFPGA provides the ideal platform for research across a wide range of networking
topics from architecture to algorithms and from energy-efficient design to routing and forwarding.
The most prominent NetFPGA success is OpenFlow, which in turn has reignited the Software Defined Networking movement.
NetFPGA enabled OpenFlow by providing a widely available open-source development platform capable of line-rate operation and was, until its commercial uptake, the reference platform for OpenFlow.
NetFPGA enables high-impact network research.

Bio:
Noa Zilberman is a Research Associate in the Systems Research Group at the University of Cambridge’s Computer Laboratory. Her research interests include open-source network and computing research using the NetFPGA platform, network performance, routing and switching architectures, memory architecture and performance, Internet measurements and topology. In her last roles before joining the Systems Research Group, she was a researcher in the DIMES project and a chip architect in Broadcom’s Network Switching group.

2/4/2014, EE, Meyer TBA

Low-Congestion Distributed Algorithms

Boaz Patt-Shamir

Tel Aviv University

The traditional model for computing over a communication network (called LOCAL) allows sending a message of arbitrary size in a single time step. This way, the time complexity is a measure of the locality of algorithms: saying that an algorithm runs in time T is equivalent, under the LOCAL model, to saying that the problem can be solved if each node learns all information held by the nodes reachable within T hops. Therefore, in this model any problem can be solved in time linear in the network diameter.

While work on the LOCAL model has produced many interesting results,
it is widely accepted that this model does not capture the true
complexity of distributed computing: it is mainly useful in
understanding what cannot be done distributively, i.e., lower
bounds. A better approximation of reality is the CONGEST model, where
a link can carry only a bounded number of bits in a time
step. Usually, it is assumed that message size is O(log n) bits, so
that each message can carry a constant number of reasonably-sized
pieces of information, such as node identifiers, or values of
polynomial magnitude. It turns out that in this model, many problems
cannot be solved in o(sqrt(n)) time, even in networks of diameter,
say, O(log n) hops. On the other hand, letting D denote the network diameter in hops, there are some problems for which the O(D+sqrt(n)) time upper bound is nearly achievable in the CONGEST model (to within a polylogarithmic factor).

In this talk we review some known results in the CONGEST model, as
well as some new progress directions. In particular, we will consider
the tasks of constructing routing tables, the task of finding a
Steiner forest, and the extreme case of a completely connected network
(a clique).

Bio:
Boaz Patt-Shamir has been a Professor of Computer Science in the School of Electrical Engineering at Tel Aviv University since 1997, where he directs the laboratory for distributed algorithms. Prior to that, he was a professor at Northeastern University in Boston. He received his BSc in Mathematics and Computer Science from Tel Aviv University, his MSc from the Weizmann Institute, and his PhD from MIT.

His main research interest is network algorithms, including
self-stabilization, clock synchronization, routing, scheduling, and
recommender systems. While on sabbatical at HP Labs in Cambridge (MA),
he helped to develop an experimental corporate-intranet search engine
based on his ideas for recommender systems.

He has served as the program committee chair for ACM PODC 2008 and
SIROCCO 2010, and gave numerous invited talks in scientific conferences.

19/3/2014, CS, Taub 7

The Immune System, and How it Can Teach Us Cyber Security

Jacob Rimer

Weizmann Institute of Science

The immune system has several roles: protection against parasitism by viruses, bacteria and foreign or aberrant cells; repair of organ and tissue
damage; and maintenance of integrity. Hence, besides ongoing routine tasks, it needs to be prepared for unforeseen – even unforeseeable – troubles.
However, effective immunity has to be economical; investment in immunity must be balanced with other fitness traits.
An organism needs to eat, grow, reproduce and so on. Our immune system is flexible, robust, affordable and highly distributed.
Moreover, it has the ability to learn from its own experience and adapt to new challenges.

Since we have come to depend on cyberspace in so many aspects of our lives, we need to maintain network integrity and protect it from “cyber parasites”.
Evidently, our current cyber security tools are not enough to face the mounting challenges of the modern cyber era.
What lessons can we learn from the immune system to enhance our protections?

Bio:
Jacob Rimer received his M.Sc. degree in Computer Science from the Weizmann Institute, as the first Open University graduate accepted to its graduate school. After a career in the hi-tech industry, he is about to complete his PhD studies in immunology at the Weizmann Institute. His research focuses on decision making in the immune system, in particular during CD4+ T cell differentiation. He also specializes in cyber security, big data, machine learning, and more.

12/3/2014, CS, Taub 601

Fence-Free Work Stealing on Bounded TSO Processors

Adam Morrison

Technion, Computer Science

Work stealing is the method of choice for load balancing in task parallel programming languages and frameworks. Yet despite considerable effort invested in optimizing work stealing task queues, existing algorithms issue a costly memory
fence when removing a task, and these fences are believed to be necessary for correctness.

We will refute this belief, demonstrating fence-free work stealing algorithms for microarchitectures with a bounded total store ordering (TSO) memory model. Bounded TSO is a novel restriction of TSO—which nevertheless captures mainstream x86 and SPARC TSO processors—in which a load can only be reordered with a bounded number of prior stores.

Bio:
Adam Morrison is an Aly Kaufman Post Doctoral Fellow at the Technion, hosted by Hagit Attiya. His research focuses on bridging between the theory and practice of distributed programming—particularly concurrency and synchronization. He did his PhD work at Tel Aviv University, during which he was awarded an IBM PhD Fellowship as well as Intel and Deutch prizes.

5/3/2014, EE, Meyer 861

Accumulating Automata and Cascaded Equations Automata for Communicationless Secure Multi-Party Computation

Prof. Shlomi Dolev

Dean of the Faculty of Natural Sciences, Ben-Gurion University of the Negev

Information-theoretically secure multi-party computation implies severe communication overhead among the computing participants, as there is a need to reduce the polynomial degree after each multiplication. In particular, when the input is (practically) unbounded, the number of multiplications, and therefore the communication bandwidth among the participants, may be practically unbounded. In some scenarios the communication among the participants is best avoided altogether, avoiding linkage among the secret share holders. For example, when processes in clouds operate over streaming secret shares without communicating with each other, they can actually hide their linkage and activity in the crowd. An adversary that is able to compromise processes in the cloud may need to capture and analyze a very large number of possible shares.

Consider a dealer that wants to repeatedly compute functions on a long file with the assistance of m servers. The dealer does not wish to leak either the input file or the result of the computation to any of the servers. We investigate this setting given two constraints. The dealer is allowed to share each symbol of the input file among the servers and is allowed to halt the computation at any point. However, the dealer is otherwise stateless. Furthermore, each server is not allowed any communication beyond the shares of the inputs that it receives and the information it provides to the dealer during reconstruction.

We present a protocol in this setting for generalized string matching, including wildcards. We also present solutions for identifying other regular languages, as well as particular context-free and context-sensitive languages. The results are described by newly defined accumulating automata and cascaded equations automata, which may be of independent interest. As an application of accumulating automata and cascaded equations automata, secure and private repeated computations on a secret-shared file among communicationless clouds are also presented.

Joint work with Niv Gilboa and Ximing Li.
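
To illustrate why no server-to-server communication is needed, here is a minimal sketch of additive secret sharing applied to string matching, assuming for simplicity that the pattern is public (the talk's accumulating automata also hide the pattern). With one-hot encoding, the match score at each position is linear in the shared bits, so each server can compute its share of the score entirely locally.

```python
import random

P = 2**61 - 1                      # prime modulus (illustrative choice)
ALPHABET = "ab"

def additive_shares(x, m):
    """Split x into m random shares that sum to x mod P."""
    parts = [random.randrange(P) for _ in range(m - 1)]
    return parts + [(x - sum(parts)) % P]

def deal_text(text, m):
    """The dealer one-hot encodes every symbol and additively shares each
    indicator bit, so no single server learns anything about the text."""
    shares = [dict() for _ in range(m)]
    for i, c in enumerate(text):
        for a in ALPHABET:
            for srv, s in enumerate(additive_shares(int(c == a), m)):
                shares[srv][(i, a)] = s
    return shares

def local_match_share(server_share, pattern, pos):
    """A server's share of the match score at pos: linear in the shared
    bits, hence computable with no server-to-server communication."""
    return sum(server_share[(pos + j, a)] for j, a in enumerate(pattern)) % P

m, text, pattern = 3, "abab", "ba"
shares = deal_text(text, m)
for pos in range(len(text) - len(pattern) + 1):
    score = sum(local_match_share(s, pattern, pos) for s in shares) % P
    print(pos, score == len(pattern))    # True exactly where "ba" occurs
```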

Bio:
Shlomi Dolev received his B.Sc. in Engineering and B.A. in Computer Science in 1984 and 1985, and his M.Sc. and D.Sc. in Computer Science in 1990 and 1992, from the Technion, Israel Institute of Technology. From 1992 to 1995 he was a postdoc of Jennifer Welch at Texas A&M University. In 1995 he joined the Department of Mathematics and Computer Science at Ben-Gurion University, where he is now a full professor. He was a visiting researcher/professor at MIT, DIMACS, and LRI for several periods during summers. Shlomi is the author of the book “Self-Stabilization,” published by the MIT Press. He has published more than two hundred journal and conference articles and patents. Shlomi has served on the program committees of more than 90 conferences, chairing the ACM Symposium on Principles of Distributed Computing and the International Symposium on DIStributed Computing.

He is an associate editor of the IEEE Transactions on Computers. His research grants include IBM faculty awards, Intel academic grants, and Verisign, ISF, US Air Force, EU and NSF grants. Shlomi is the founding chair of the computer science department at Ben-Gurion University, where he now holds the Rita Altura trust chair in computer science.

His current research interests include distributed computing, distributed systems, security and cryptography, and communication networks; in particular the self-stabilization property of such systems. Recently, he has become involved in optical computing and brain science research. Today, Prof. Dolev is the dean of the Faculty of Natural Sciences and the head of the Inter-University Computation Center (IUCC) of Israel.

12/2/2014, EE, TBA

Internet: a Critical Infrastructure for Society.

Janos Tapolcai

Budapest University of Technology and Economics

The presentation provides an overview of some short- and long-term solutions in metro and backbone networks that can facilitate reaching the next level of reliability, enabling the Internet to become an always-on, fast communication system for society. In particular, I focus on fast failure localization and restoration approaches in optical backbone networks, and on failure recovery techniques at the IP layer. Furthermore, I introduce a future direction in IP router design that enables compression of routing tables.

Bio:
Janos Tapolcai received his M.Sc. (’00 in Technical Informatics), Ph.D. (’05 in Computer Science) degrees from Budapest University of Technology and Economics (BME),
Budapest, and D.Sc. (’13 in Engineering Science) from Hungarian Academy of Sciences (MTA).
Currently he is an Associate Professor at the High-Speed Networks Laboratory at the Department of Telecommunications and Media Informatics at BME.
His research interests include applied mathematics, combinatorial optimization, optical networks and IP routing, addressing and survivability.
He is an author of over 100 scientific publications, and the recipient of a Google Faculty Award and of Best Paper Awards at ICC’06 and DRCN’11.
He is a TPC member of leading conferences such as IEEE INFOCOM (2012 – 2014), and is a winner of Hungarian Academy of Sciences Momentum (Lendulet) Program.

22/1/2014, EE, 861

Almost Optimal Virtual Machine Placement for Traffic Intense Data Centers.

Liane Lewin-Eytan

Yahoo! Labs

The recent growing popularity of cloud-based solutions and the variety of new applications present new challenges for cloud management and resource utilization.
In this paper we concentrate on the networking aspect and consider the placement problem of virtual machines (VMs) of applications with intense
bandwidth requirements. Optimizing the available network bandwidth is far more complex than optimizing resources like memory or CPU,
since every network link may be used by many physical hosts and thus by the VMs residing in these hosts.
We focus on maximizing the benefit from the overall communication sent by the VMs to a single designated point in the data center (called the root).
This is the typical case when considering a storage area network of applications with intense storage requirements.

We formulate a bandwidth-constrained VM placement optimization problem that models this setting.
This problem is NP hard, and we present a polynomial-time constant approximation algorithm for its most general version,
in which hosts are connected to the root by a general network graph.
For more practical cases, in which the network topology is a tree and the benefit is a simple function of the allocated bandwidth,
we present improved approximation algorithms that are more efficient in terms of running time.
We evaluate the expected performance of our proposed algorithms through a simulation study over traces from a real production data center, providing strong indications of the superiority of our proposed solutions.
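
For intuition about the bandwidth constraint on a tree topology, here is a naive greedy placement sketch (my own illustration, not the paper's approximation algorithm): VMs are considered in decreasing benefit-per-bandwidth order and placed on any host whose entire path to the root still has spare capacity.

```python
def greedy_place(vms, hosts, parent, capacity):
    """Naive greedy VM placement on a tree rooted at 'root'."""
    residual = dict(capacity)               # link -> remaining bandwidth
    placement = {}
    def path(h):                            # links from host up to the root
        links = []
        while h in parent:
            links.append((h, parent[h]))
            h = parent[h]
        return links
    for vm, bw, benefit in sorted(vms, key=lambda v: -v[2] / v[1]):
        for h in hosts:
            links = path(h)
            if all(residual[l] >= bw for l in links):
                for l in links:
                    residual[l] -= bw       # reserve bandwidth on the path
                placement[vm] = h
                break
    return placement

# toy: two hosts under one switch; the switch uplink is the bottleneck
parent = {"h1": "sw", "h2": "sw", "sw": "root"}
capacity = {("h1", "sw"): 10, ("h2", "sw"): 10, ("sw", "root"): 10}
vms = [("vm1", 6, 12.0), ("vm2", 6, 6.0), ("vm3", 4, 7.0)]
print(greedy_place(vms, ["h1", "h2"], parent, capacity))
# -> vm1 and vm3 are placed; vm2 cannot fit through the shared uplink
```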

Bio:
Liane Lewin-Eytan received her B.Sc, M.Sc, and Ph.D. from the Dept. of Electrical Engineering, Technion – Israel Institute of Technology,
Haifa in 1999, 2001 and 2008, respectively. She is currently a Research Scientist at Yahoo Labs.
During the years 2009-2013, she was a Research Staff Member at IBM Research – Haifa, and worked, among other things, on smart grid networks,
cloud networking and network virtualization. Among her research interests are non-cooperative networking games,
dynamic and online network analysis, network virtualization, data science and machine learning.

15/1/2014, CS, 337

OS Services for Computational Accelerators

Mark Silberstein

Technion, Electrical Engineering

Future applications will need to use computational accelerators like GPUs
to achieve their performance and power goals. However, building efficient
systems that use accelerators today is incredibly difficult.
The main problem lies in the lack of appropriate OS support —
while OSes provide optimized resource management and Input/Output (I/O)
services to CPU applications, they make no such services available to
accelerator programs.

In this talk I will discuss my ongoing work on building an Operating System
layer for GPUs that enables access to files and network services directly from
programs running on GPUs. I will describe the existing infrastructure, my ongoing work, open questions and future directions.

The talk is self-contained, no background in GPU computing is necessary.

Bio:
Mark Silberstein is an Assistant Professor at EE, Technion.
He truly believes that building practical, programmable and efficient computer systems with computational accelerators requires cross-cutting changes in system interfaces, OS design, hardware security mechanisms, storage and networking services, as well as programming models and parallel algorithms.
He is eagerly looking for excellent graduate students to join his efforts on emancipation of accelerators in computer systems.
Web page: https://sites.google.com/site/silbersteinmark

8/1/2014, CS, 337

New Interfaces to Storage-Class Memory

Michael Swift

University of Wisconsin, Madison

Storage-class memory (SCM) technologies such as phase-change memory, spin-transfer torque MRAM, and memristors promise the performance and flexibility of DRAM with the persistence of flash and disk. In this talk, I will discuss two interfaces to persistent data stored in SCM.

First, I will talk about Mnemosyne, which is a system that exposes storage-class memory directly to applications in the form of persistent regions. With only minor hardware changes, Mnemosyne supports consistent in-place updates to persistent data structures and performance up to 10x faster than current storage systems.

Second, I will talk about how to build file systems for storage-class memory. While standard storage devices rely on the operating system kernel for protected shared access, SCM can use virtual-memory hardware to protect access from user-mode programs. This enables application-specific customization of file system policies and interfaces. I will describe the design of the Aerie file system for SCM, which provides flexible high-performance access to files.
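
For a rough feel of the persistent-region interface, the sketch below emulates one with an mmap'd file. This is my own simplification: real SCM would be mapped directly and flushed at cache-line granularity, and Mnemosyne additionally provides primitives for consistent in-place updates.

```python
import mmap, os, struct

class PersistentRegion:
    """Emulated persistent region: in-place updates to mapped memory,
    with an explicit flush as the persist point."""
    def __init__(self, path, size=4096):
        fd = os.open(path, os.O_CREAT | os.O_RDWR)
        os.ftruncate(fd, size)
        self.mem = mmap.mmap(fd, size)

    def put_counter(self, value):
        self.mem[0:8] = struct.pack("<Q", value)
        self.mem.flush()                 # order the update to durable media

    def get_counter(self):
        return struct.unpack("<Q", self.mem[0:8])[0]

r = PersistentRegion("/tmp/region.bin")
r.put_counter(r.get_counter() + 1)       # the count survives process restarts
print(r.get_counter())
```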

Bio:
Mike Swift is an associate professor at the University of Wisconsin, Madison. His research focuses on the hardware/operating system boundary, including device drivers, new processor/memory technologies, and transactional memory. He grew up in Amherst, Massachusetts and received a B.A. from Cornell University in 1992. After college, he worked at Microsoft in the Windows group, where he implemented authentication and access control functionality in Windows Cairo, Windows NT, and Windows 2000. He received a Ph.D. on operating system reliability from the University of Washington in 2005.

1/1/2014, EE, 861

Elections 2013: Sublinear and Universal Leader Elections

Amitabh Trehan

Queen’s University Belfast

In this talk, I will discuss results from our recent work of last year (i.e. 2013!) on the classic problem of Leader Election.
It is perhaps surprising that this classic problem still yields such rich results. We look at Leader Election in the synchronous message passing setting.
Most of this work deals with randomization and we have discovered definitive lower bounds and efficient algorithms.

The majority of this talk will focus on [1], in which we introduce the first randomized leader election algorithm (succeeding w.h.p.) that uses a sublinear number of messages, running in O(1) time on complete networks and in mixing time on any connected graph.
We also show a lower bound of sqrt(n) messages for any leader election algorithm that succeeds with probability at least 1/e + ε, for any small constant ε > 0.
We will also, time permitting, briefly touch upon the slew of results we obtained in [2] for Universal Leader Election (algorithms that work for all graphs), which includes formal proofs of fundamental lower bounds on time and messages for randomized algorithms, and randomized and deterministic algorithms that try to simultaneously achieve both.

[1] Sublinear Bounds for Randomized Leader Election. ICDCN 2013, Best Paper Award

[2] On the complexity of universal leader election. PODC 2013

Joint work with Shay Kutten, Gopal Pandurangan, David Peleg, and Peter Robinson.

Bio:

Amitabh Trehan is a Lecturer (Assistant Professor) in Computer Science at Queen’s University Belfast (High Performance and Distributed Computing cluster).
Before joining QUB, he held an I-CORE postdoc fellowship with Danny Dolev at Hebrew University. Recently, he was also awarded the prestigious Newton Incoming Fellowship of the Royal Society, UK, which he had to decline in favour of his present position.
He has also held a postdoc with Shay Kutten and Ron Lavi at the information systems group at the faculty of Industrial Engineering at Technion and a postdoc with Valerie King (at UVIC, Canada). His Ph.D. Advisor was Jared Saia (University of New Mexico, USA) with whom he worked on algorithms for self-healing networks.

His broad research interests are in theory and algorithms, with specific interests in distributed algorithms, networks, and game theory. These include designing efficient distributed algorithms for robustness/self-healing/self-* properties in systems under adversarial attack, classic distributed problems such as Leader Election, and studying game-theoretic and other mechanisms for evolving networks, such as social networks or distributed systems (P2P networks etc.).

25/12/2013, EE: Meyer 1061

Resource Placement and Assignment in Distributed Network Topologies.

Yuval Rochman

Tel Aviv University/Marvell

We consider the problem of how to place and efficiently utilize resources in network environments. The setting consists of a regionally organized system which must satisfy regionally varying demands for various resources. The operator aims at placing resources in the regions so as to minimize the cost of providing the demands. Examples of systems falling under this paradigm are 1) a peer-supported Video on Demand service, where the problem is how to place the various video movies, and 2) a cloud-based system consisting of regional server farms, where the problem is where to place various contents or end-user services. The main challenge posed by this paradigm is the need to deal with an arbitrary multi-dimensional (high-dimensionality) stochastic demand. We show that, despite this complexity, one can optimize the system operation while accounting for the full demand distribution.

We provide algorithms for conducting this optimization and show that their complexity is low, implying they can handle very large systems. The algorithms can be used for: 1) exact system optimization, 2) deriving lower bounds for heuristic-based analysis, and 3) sensitivity analysis. The importance of the model is demonstrated by showing that an alternative analysis based on demand means only may, in certain cases, achieve performance that is drastically worse than optimal.

Bio:
Yuval Rochman is currently a Ph.D. student in Computer Science at Tel-Aviv University and working as a researcher at Marvell Research. His research interests are in modeling, design and in performance analysis of Cloud Computing and VoD.

18/12/2013, EE: Meyer 1007

Shortest Path Queries: Static, Dynamic and Fault-tolerant.

Dr. Shiri Chechik

Microsoft Research, Silicon Valley

In recent years, the practical need for fast retrieval of shortest path queries has significantly increased, in part due to developing GPS navigation systems and other route-planning software. Classical shortest path algorithms such as Dijkstra’s algorithm return shortest path distances in time almost linear in the number of nodes. However, continent-sized road networks require something much faster. It is possible to achieve sublinear-time queries if preprocessing is allowed.

A distance oracle is a data structure that allows fast retrieval of a distance estimate for any pair of nodes.
The focus in designing distance oracles is often on the tradeoff between several parameters: the construction time (the time it takes to construct
the distance oracle), the size of the data structure,
the query time and the quality of the answer (how close is the estimated distance to the real distance).
Not only are distance oracles important structures in their own right, with both practical and theoretical interest, but they also have strong ties to numerous other problems in theory and distributed computing, such as spanners, compact routing schemes and low-stretch spanning trees. A major complicating aspect is that real-world networks may change over time. The changes may be either permanent (e.g., some road is added to the network) or temporary (e.g., some roads may be closed for short periods or a major junction may be temporarily blocked). A dynamic setting is used to capture permanent changes to the network, whereas a fault-tolerance setting captures temporary changes.

In the first part of the talk I will highlight some of my work on shortest path queries in static, dynamic and fault-tolerant settings.
In the second part of the talk I will discuss my recent work on constant-time distance oracles, which essentially obtains optimal bounds for general graphs and improves over the Thorup-Zwick distance oracle [STOC ’01, J. ACM ’05].
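
For intuition on the preprocessing/query-time tradeoff, here is a toy landmark-based oracle, far simpler than the Thorup-Zwick construction: BFS distances are precomputed from a few landmarks, and a query returns the best landmark detour, an upper bound on the true distance.

```python
from collections import deque

def bfs(graph, src):
    """Unweighted single-source shortest paths."""
    dist = {src: 0}
    q = deque([src])
    while q:
        u = q.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

class LandmarkOracle:
    """Preprocess: one BFS per landmark. Query: O(#landmarks) lookups,
    returning min over landmarks of d(u, l) + d(l, v)."""
    def __init__(self, graph, landmarks):
        self.tables = {l: bfs(graph, l) for l in landmarks}

    def query(self, u, v):
        return min(t[u] + t[v] for t in self.tables.values())

graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
oracle = LandmarkOracle(graph, landmarks=[2])
print(oracle.query(0, 4))   # 4: exact here, since node 2 lies on the path
```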

Bio:
Shiri is a postdoctoral researcher at Microsoft Research Silicon Valley. She completed her Ph.D. under the supervision of Prof. David Peleg at the
Weizmann Institute of Science. Her main research interests are in the design and analysis of combinatorial algorithms, with a special focus on algorithms at the interface of distributed computing, networking and data structures.
She has published papers on a variety of algorithmic topics including graph spanners, distance oracles/labeling, compact routing, dynamic algorithms, approximation algorithms, and social networks.

27/11/2013, EE: Meyer 861

Cognitive Users with Useful Vacations.

Dr. Boris Oklander

Imperial College London

Admission control is a classical topic in communications, but cognition as a means to enhance the performance of users and the network is relatively new. Cognitive Users (CUs) can achieve better performance based on smart access to resources such as power and spectrum. However, adaptation to the dynamically changing conditions of communication channels is a challenge for CUs. Acting as a secondary user (SU), a CU cedes priority to ongoing transmissions of primary users (PUs) and accesses a channel only after sensing and estimating that it is not used by PUs. Thus efficient CUs will trade off the competing requirements of sensing, transmitting and interference avoidance, subject to power limitations. This of course requires understanding the interdependence of the various processes involved and their impact on performance.

This talk will present a queueing model of a CU with a walking-type server (i.e., with vacations). The analysis yields explicit expressions for the first two moments of the CU packet service time, and provides an optimization of throughput, delay, interference, energy efficiency and QoS. As opposed to conventional server vacations, which prolong the effective service time, here they reduce the service and response time. Additionally, modifications to the CU’s service process are introduced, showing the ability of the CU to improve its performance by adapting to the communications environment.

Bio:
Dr. Boris Oklander received B.Sc., M.Sc. and Ph.D. degrees in Electrical Engineering from the Technion in 2001, 2008 and 2013, respectively.
Currently he is a research associate at Intelligent Systems and Networks Group, Imperial College London.
His research interests are in modeling and performance evaluation of communications networks.

20/11/2013, EE: Meyer 861

Scaling Data Center Routers.

Alex Shpiner

Electrical Engineering, Technion

Data center networks form one of the most challenging network environments for routers, due to their fast link rates, short propagation times,
and high performance demands. In this talk, we analyze two router functions related to packet processing: address resolution and order preservation.

The first part of the talk deals with VM (virtual machine) address resolution in data centers. Data centers can run multiple VMs,
and potentially place them on any of the servers. Therefore, a VM address resolution method that determines the server location of any
VM needs to be provided for inter-VM communication. Unfortunately, existing methods suffer from a scalability bottleneck in the
network load of the address resolution messages and/or in the size of the address resolution tables.
We propose Smart Address Learning (SAL), a novel approach that selectively learns VM addresses for specific tenants,
and solves this scalability bottleneck.

In the second part of the talk, we introduce novel scalable scheduling algorithms for preserving flow order in parallel multi-core network processors.
The development of new processing features in advanced network processors has resulted in increasingly parallel architectures and
increasingly heterogeneous packet processing times. This may lead to large packet reordering delays. Our suggested algorithms can significantly
reduce reordering delay, while adapting to any load-balancing algorithm and keeping a low implementation complexity overhead.
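
A minimal sketch of the selective-learning idea as I read it (names and details are mine, not the talk's): a server's resolution table learns mappings only for tenants it actually hosts, so both table size and learning traffic stop scaling with the whole data center.

```python
class SalTable:
    """Selective address learning: cache VM-to-host mappings only for
    locally hosted tenants; anything else falls back to an explicit query."""
    def __init__(self, local_tenants):
        self.local_tenants = set(local_tenants)
        self.table = {}                       # vm -> hosting server

    def learn(self, tenant, vm, host):
        if tenant in self.local_tenants:      # learn selectively
            self.table[vm] = host

    def resolve(self, vm):
        return self.table.get(vm)             # None -> query the directory

t = SalTable(local_tenants={"tenantA"})
t.learn("tenantA", "vm1", "server7")
t.learn("tenantB", "vm9", "server3")          # ignored: tenantB not hosted here
print(t.resolve("vm1"), t.resolve("vm9"))     # server7 None
```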

Ph.D. Seminar. Advisor: Prof. Isaac Keslassy.

Bio:
Alex Shpiner is a Ph.D. candidate in the Electrical Engineering department of the Technion, under the supervision of Prof. Isaac Keslassy.
Alex received his B.Sc. degree in computer engineering from the Electrical Engineering department of the Technion in 2004.
His main research interests are congestion control in computer networks, switch scheduling, network processors, and data center networks.

13/11/2013, CS: Taub 401

From L3 to seL4: What Have We Learnt in 20 Years of L4 Microkernels?

Prof. Gernot Heiser (Henry Taub Distinguished Visitor)

University of New South Wales

The L4 microkernel has undergone 20 years of use and evolution. It has an active user and
developer community, and there are commercial versions which are deployed on a large scale and in safety-critical systems.
In this paper we examine the lessons learnt in those 20 years about microkernel design and implementation.
We revisit the L4 design papers, and examine the evolution of design and implementation from the original L4 to the latest generation of L4 kernels,
especially seL4, which has pushed the L4 model furthest and was the first OS kernel to undergo a complete formal verification of its implementation
as well as a sound analysis of worst-case execution times. We demonstrate that while much has changed,
the fundamental principles of minimality and high IPC performance remain the main drivers of design and implementation decisions.

Bio:
Gernot Heiser is Scientia Professor and John Lions Chair of Operating
Systems at the University of New South Wales (UNSW), and leads the Software
Systems Research Group at NICTA, Australia’s National Centre of Excellence
for ICT Research. He joined NICTA at its creation in 2002, and before that
was a full-time member of academic staff at UNSW from 1991. His past work
included the Mungi single-address-space operating system (OS), several
un-broken records in IPC performance, and the best-ever reported performance
for user-level device drivers. More recently he led the team at NICTA which
produced the world’s first formal proof of functional correctness of a
protected general-purpose operating-system kernel, the first published sound
worst-case execution time analysis of a protected OS kernel, and the
synthesis of high-performance device drivers.

30/10/2013, CS: Taub 701

(In-)Security of Smartphones

Ari Trachtenberg (TCE Guest Talk)

Boston University

This talk will cover some of the prominent attack surfaces at all abstraction layers of modern smartphones,
including those based on bypassing application signatures, USB takeover, GPS updates, commandeering the GSM subsystem,
and filtering data from SSD memory.
In the second part of the talk, we will present some of our own research into possible defenses against these or other attacks.

Bio:
Ari Trachtenberg is a Professor of Electrical and Computer Engineering at Boston University.
He received his PhD and MS in Computer Science from the University of Illinois at Urbana-Champaign, and his SB from MIT in Math/CS. His research interests include cyber security (smartphones, offensive and defensive), networking (security, sensors, localization), algorithms (data synchronization, file edits, file sharing), and error-correcting codes (rateless coding, feedback).

23/10/2013, CS: Taub 337

Transactional Memory Specification: Theory and Practice

Victor Luchangco

Oracle Labs

Transactional memory is a promising mechanism for synchronizing concurrent programs. There has been a lot of research in the past decade on implementing transactional memory. However, relatively little attention has been paid to precisely specifying what it means for such implementations to be correct, which is necessary to reason rigorously about transactional memory. I will discuss some of the issues with such specifications, and why there is not a single “correct” specification.

I will present several specifications, including formal specifications that we have used to formally verify transactional memory algorithms.
I will also describe work on a proposal for adding transactional memory support to C++,
and discuss some of the pragmatic issues that arise in developing such a proposal.

Bio:
Victor Luchangco is a Principal Member of Technical Staff in the Scalable Synchronization and Programming Languages Research Groups of Oracle Labs.
His research focuses on developing algorithms and mechanisms to support concurrent programming on large-scale distributed systems.
A theoretician by disposition and training, Victor is also interested in practical aspects of computing. In particular, he would like to design mechanisms that are practical for real computers.
In addition, he is interested in exploring how to make proofs for concurrent systems easier, both by changing how people design these systems
and by using tools to aid in formal verification.
Victor received an Sc.D. in Computer Science from MIT in 2001, with a dissertation on models for weakly consistent memories.

2/10/2013, CS: Taub 337

Fast Decisions in High-Speed Networking Devices

Yossi Kanizo

Computer Science, Technion

Tables that match a given key to its value (e.g., hash tables) have become crucial algorithmic building blocks for contemporary
networking devices that need to take decisions on a large amount of data at high speed. Unlike traditional table-based data structures,
the networking environment provides only very limited memory and necessitates a strict worst-case operation time.

Furthermore, since such tables typically lie on the critical path of networking devices,
the total number of memory accesses performed by these tables dramatically affects overall performance:
A large number of memory accesses may increase the power consumption, and in some cases, decrease the throughput.

In this talk, we consider tables based on multiple-choice hashing schemes. Such schemes are particularly suited to networking devices because they guarantee worst-case operation time, albeit with some percentage of overflowed elements. We introduce optimal online multiple-choice hashing schemes, both when the data structure must support element deletions and when it need not. In particular, given a budget of a memory accesses per packet, the fraction of overflowed elements under the optimal scheme that does not support deletions is Omega(e^(-a)). This overflow fraction increases to Omega(1/a) when deletion support is required. We have also studied how to cope with the stringent memory requirements of networking devices. In particular, we introduce a scheme that allows splitting a large table across the entire network, while maintaining its overall semantics.
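
A Monte-Carlo toy (my own illustration, not the optimal scheme from the talk) showing how the overflow fraction drops as the per-element memory-access budget grows, in the spirit of the Omega(e^(-a)) bound:

```python
import random

def multiple_choice_overflow(n_elems, n_buckets, d):
    """Multiple-choice hashing with one slot per bucket: each element
    probes d random buckets and takes any empty one; otherwise it
    overflows. Returns the fraction of overflowed elements."""
    occupied = [False] * n_buckets
    overflow = 0
    for _ in range(n_elems):
        for _ in range(d):                    # d memory accesses per element
            b = random.randrange(n_buckets)
            if not occupied[b]:
                occupied[b] = True
                break
        else:                                 # all d probes hit full buckets
            overflow += 1
    return overflow / n_elems

for d in (1, 2, 4):
    print(d, multiple_choice_overflow(50_000, 50_000, d))
```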

Ph.D. Seminar.
Advisors: Prof. Isaac Keslassy and Prof. David Hay (Hebrew Univ.).

Bio:
Yossi Kanizo is a Ph.D. candidate in the Computer Science department of the Technion,
under the supervision of Isaac Keslassy and David Hay (Hebrew Univ.).
Yossi received his B.Sc. degree in computer engineering from the Computer Science department of the Technion in 2006.
His main research interests are computer networks, hash-based data-structures, and switch architectures.

3/7/2013, EE: Meyer 861

Spanner: Google’s Globally-Distributed Database

Dr. Brian Cooper

Google

Spanner is Google’s scalable, multi-version, globally-distributed, and synchronously-replicated database.
It provides strong transactional semantics, consistent replication, and high performance reads and writes for a variety of Google’s applications.
I’ll discuss the design and implementation of Spanner, as well as some of the lessons we have learned along the way.
I’ll also discuss some open challenges that we still see in building scalable distributed storage systems.

Bio:
Brian F. Cooper is a software engineer on the storage infrastructure team at Google.
He works on the Spanner distributed database system.
Previously, he has worked on Google web search, and while at Yahoo! Research, the PNUTS database system and YCSB benchmark.
He received his Ph.D. from Stanford University in 2003.

26/6/2013, EE: Meyer 1007

On the Steady-State of Cache Networks

Dr. Elisha Rosensweig

UMass Amherst/Alcatel-Lucent

Over the past few years Content-Centric Networking, a networking model in which host-to-content communication protocols are introduced, has been gaining much attention. A central component of such an architecture is a large-scale interconnected caching system. To date, the way these cache networks operate and perform is still poorly understood. In this work, we demonstrate that certain cache networks are non-ergodic in that their steady-state characterization depends on the initial state of the system. We then establish several important properties of cache networks, in the form of three independently sufficient conditions for a cache network to comprise a single ergodic component. Each property targets a different aspect of the system: topology, admission control and cache replacement policies. Perhaps most importantly, we demonstrate that cache replacement policies can be grouped into equivalence classes, such that the ergodicity (or lack thereof) of one policy implies that the same property holds for all policies in the class.

BEST PAPER AWARD, INFOCOM 2013

Bio:
Elisha received his PhD in Computer Science from UMass Amherst in September 2012, under the advisorship of Prof. Jim Kurose. His thesis topic was “On the Analysis and Management of Cache Networks”. Currently he is working at Alcatel-Lucent’s CloudBand, a platform for managing distributed clouds and supporting Network Function Virtualization (NFV).

24/6/2013, CS: Taub 9

Visualization Tools to Analyze Multi-threaded Program Scalability and Performance

Dr. Jennifer Sartor

Ghent University

Analyzing multi-threaded program scalability and performance on modern multicore machines is challenging,
but paramount to continue optimizing software and to use hardware resources efficiently.
Synchronization between threads results in some threads running while others wait on locks and barriers.
We create a new metric, the criticality metric, to judge each thread’s contribution to program execution time based on synchronization.
The criticality metric takes into account how much time a thread is running and how many co-running threads exist.
We use this metric to create two new multi-threaded program performance visualization tools.
First, we propose small hardware extensions to measure our criticality metric with low overhead while a program is running,
using these statistics to create criticality stacks that show each thread’s criticality, revealing parallel imbalances.
I will then detail how, as shown in our ISCA paper, we applied this program analysis tool to optimize software, improve performance using frequency scaling in hardware, and save energy.

I will then give a preview of our new work (to be presented at OOPSLA) that measures this metric entirely in software,
piggy-backing on the operating system. We create our second powerful multi-threaded analysis tool, bottle graphs,
which present a box for each thread showing its average parallelism (width), and its share of total program execution time (height).
We stack each thread’s box, whose area equals the thread’s total running time, on top of the others, leading to a bottle graph with height equal to the total program execution time.
The graph points out the most fruitful avenues for optimization,
leading software developers to the threads with low parallelism (narrow) and a large share of the total execution time (tall) –
i.e. to the neck of the bottle graph.
I will show the bottle graphs of several Java benchmarks running on top of Jikes RVM on real hardware,
which reveal scalability bottlenecks of both the benchmarks and the JVM service threads.
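
Here is a sketch of how such a criticality-style metric can be computed from thread start/stop events (my reading of the talk, with one running interval per thread for brevity): each time slice is split evenly among the threads running in it, so time spent running with few co-runners counts as more critical. Note that the per-thread values sum to the total execution time, exactly the property the bottle graph stacks.

```python
def criticality_stacks(intervals, end):
    """Slice the run at thread start/stop events; in each slice every
    running thread accrues slice_length / #running_threads."""
    events = sorted({0, end, *(t for iv in intervals.values() for t in iv)})
    crit = {th: 0.0 for th in intervals}
    for a, b in zip(events, events[1:]):
        running = [th for th, (s, e) in intervals.items() if s <= a and b <= e]
        if not running:
            continue
        for th in running:
            crit[th] += (b - a) / len(running)
    return crit

# thread -> (start, stop); T0 runs alone for its last 4 time units
intervals = {"T0": (0, 10), "T1": (0, 6), "T2": (0, 6)}
print(criticality_stacks(intervals, end=10))   # T0 is the most critical
```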

Bio:
Jennifer Sartor is a post-doctoral researcher at Ghent University in Belgium, in the Electronics and Information Systems department. Her research focuses on analyzing and optimizing the performance of managed-language applications on modern computer systems. In particular, she optimizes memory efficiency, especially memory layout under an automatic memory manager, cooperating with the runtime to obtain good cache behavior. She is particularly interested in software-hardware co-design, where the managed runtime can send hints about program behavior down to hardware to obtain faster execution or faster memory access. She obtained her PhD in Computer Science from The University of Texas at Austin in 2010.

12/6/2013, EE: Meyer 861

Multi-core, Mega-nonsense

Prof. Yale Patt (Henry Taub Distinguished Visitor)

University of Texas at Austin

Multicore has been around for several years now, and we hear it touted
as the panacea of everything. …until recently, that is. As expected,
the hype has generated a lot of nonsense. We are told that multicore came
about as a solution to a performance problem, that multicore allows you to
run your problems at half the frequency and save power, that ILP is dead,
that Moore’s Law means we can put thousands (perhaps millions?) of cores
on a single silicon die, that hardware works sequentially, and that
abstraction is a pure good. Most recently, the term dark silicon has been
coined as one of the bad consequences of the continual viability of Moore’s
Law. In this talk, I propose to examine some of the nonsense, and in
particular, see if some of these “bugs” can be turned into “features.”

Bio:
Yale Patt is a teacher at the local public university in Austin, Texas. He has
enjoyed almost 50 years (so far) in computing: teaching, doing research, and
consulting. Some of his research (e.g., HPS, branch prediction) have found
their way into successful microprocessors. His unconventional but CORRECT
approach to introducing serious students to computing has found its way into
the curriculum of more than 100 universities worldwide and a breakaway textbook,”Intro to Computing, from bits and gates to C and beyond.” He earned the
obligatory degrees from reputable universities and has received more than
enough awards for his research and teaching. More detail can be found on his
web site: www.ece.utexas.edu/~patt.

5/6/2013, CS: Taub 337

Spinal Codes

Jonathan Perry

MIT, Computer Science and Artificial Intelligence Lab.

Handling noise and interference in wireless networks requires adaptive, high-performance error correction. Spinal codes are a new rateless error-correcting code that iteratively applies a hash function to message bits, ensuring that two input messages that differ in even one bit produce very different coded sequences after the point at which they differ. Spinal codes offer a flexible tradeoff between computational cost and performance. Because spinal codes are rateless, they automatically adapt to changing channel conditions.

The resulting system achieves better throughput than LDPC and Raptor codes, and despite the large state space induced by the hash output, the message can be recovered efficiently; a preliminary hardware prototype decodes at 10 Mbps.

No prior knowledge of coding theory is required.

More information at http://nms.csail.mit.edu/spinal/

Bio:
Jonathan Perry received a B.Sc in CS from Tel-Aviv University in 2003.
Jonathan worked in
high performance computing and in distributed systems until 2010, when
he joined MIT’s Ph.D.
program. Co-advised by Hari Balakrishnan and Devavrat Shah, Jonathan is
currently working on
error correcting codes and efficient network transport.

22/5/2013, CS: Taub 337

Secure Logical Isolation for Multi-tenancy in Cloud Storage

Dr. Hillel Kolodner

Systems Technologies department, IBM Haifa Research Lab.

Storage cloud systems achieve economies of scale by serving multiple
tenants from a shared pool of servers and disks. This leads to the commingling of data from different tenants on the
same devices. Typically, a request is processed by an application running with sufficient privileges to access any tenant’s data;
this application
authenticates the user and authorizes the request prior to carrying it out.
Since the only protection is at the application level, a single
vulnerability threatens the data of all tenants, and could lead to
cross-tenant data leakage, making the cloud much less secure than dedicated
physical resources.

To provide security close to physical isolation while allowing complete
resource pooling, we propose Secure Logical Isolation for Multi-tenancy
(SLIM). SLIM incorporates the first complete security model and set of
principles for the safe logical isolation between tenant resources in a
cloud storage system, as well as a set of mechanisms for implementing the
model. These principles lead to the potentially costly conclusion that each
request should be handled by a new process. We present a detailed design,
implementation and performance analysis of a process factory to greatly
reduce the cost while still preserving secure isolation. Finally, we show
how to implement SLIM for OpenStack Swift and present performance results,
showing SLIM with our optimizations provides an order of magnitude
improvement over a naive implementation of process isolation.
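A toy Python sketch of the process-per-request idea and the process-factory optimization (names and structure are illustrative; the real system also confines each worker to a single tenant's privileges):

    import multiprocessing as mp

    def worker(conn):
        # In SLIM the process would now be confined to one tenant's scope;
        # here we only simulate handling the request in an isolated process.
        tenant, request = conn.recv()
        conn.send(f"handled {request!r} for {tenant} in its own process")
        conn.close()

    class ProcessFactory:
        """Pre-forks workers so the per-request cost is a message, not a fork."""
        def __init__(self, size=4):
            self.pool = [self._spawn() for _ in range(size)]

        def _spawn(self):
            parent, child = mp.Pipe()
            proc = mp.Process(target=worker, args=(child,))
            proc.start()
            return parent, proc

        def handle(self, tenant, request):
            parent, proc = self.pool.pop()
            self.pool.append(self._spawn())   # refill the factory in advance
            parent.send((tenant, request))
            reply = parent.recv()
            proc.join()                       # each worker serves one request
            return reply

    if __name__ == "__main__":
        print(ProcessFactory().handle("tenant-a", "GET /bucket/object"))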

Authors: Michael Factor, David Hadas, Aner Hamama, Nadav Har’el, Hillel
Kolodner, Anil Kurmus, Eran Rom, Alexandra Shulman-Peleg and Alessandro
Sorniotti.

Bio:
Hillel Kolodner is a Senior Technical Staff Member in the Systems
Technologies department at the IBM Haifa Research Lab. In the past he has
worked on the implementation of Java for multiprocessor servers, especially
on automatic memory management (garbage collection). Recently, he has
worked on virtualization and management technologies for cloud computing.
Currently, he is working on cloud object stores and is the PI for VISION
Cloud, a European Commission FP7 Integrated Project developing storage
cloud technologies. Hillel holds a Ph.D. and M.S. in Computer Science from
the Massachusetts Institute of Technology, and a B.A. in Mathematics and a
B.S.E. in Computer Science from the University of Pennsylvania.

8/5/2013, EE: Meyer 861

Non-Volatile Memory Enhancement: a Cross-Layer Approach

Amit Berman

Electrical Engineering Department, Technion

The advent of semiconductor technology marked a new era in memory devices: SRAM, DRAM, Flash, and more.
The ever-increasing rate of data production and consumption stimulates the development of high-performance memory devices.
At times, high-density scaling drives new applications and ways of operation.
However, device advancement presents trade-offs and ever growing challenges.
High-performance memory (SRAM, DRAM) suffers from relatively low density, higher power consumption and, most importantly, data volatility.
Similarly, high-density, non-volatile memory (Flash) exhibits relatively low performance.
As device technology shrinks, massive inter-cell interference is limiting the achievable density,
high variations among memory cells result in degraded read/write performance, and endurance is limited due to cell degradation.
Over the past decade, the various challenges have been the subject of much research,
mostly focused on technology- and circuit-level innovation. Other challenges (e.g. restricted overwrite due to one-way charge level changes)
were addressed by coding techniques.

In our research, we explore cross-layer methods for enhancing memory characteristics (density, read/write performance, power and
reliability).
Specifically, inter-cell interference is mitigated by using constrained coding to prevent the data patterns that cause interference
beyond a predefined limit;
read performance is improved by way of a speculative early sensing mechanism,
whereby cell read time is dynamically minimized through premature sensing along with guaranteed error detection;
multi-level cell write speed is improved by minimal maximum-level programming, whereby cells are written gradually,
different same-size pages are stored in different numbers of cells, and bit fractions of any given page are stored in a cell;
and the number of possible rewrites between block erasures is increased via page management that permits writing to retired pages when
proper data is available.
Our schemes are presented and evaluated mostly in the context of NAND Flash, with several extensions to DRAM and SRAM,
but some are also applicable to emerging memory technologies such as PCM.
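To make the constrained-coding idea concrete, here is a toy Python sketch in which the (assumed, purely illustrative) constraint forbids a low cell squeezed between two high neighbors, a classic interference victim pattern; real constraints depend on the device:

    from itertools import product

    def valid(word):
        return "101" not in "".join(map(str, word))

    def codebook(n):
        """All n-cell words satisfying the constraint, in a fixed order."""
        return [w for w in product((0, 1), repeat=n) if valid(w)]

    def encode(data_bits, n=8):
        """Map raw data bits (read as an index) to a constrained n-cell word."""
        book = codebook(n)
        index = int("".join(map(str, data_bits)), 2)
        assert index < len(book), "data exceeds the constrained capacity"
        return book[index]

    print(len(codebook(8)), "valid 8-cell words out of", 2 ** 8)  # 114 of 256
    print(encode([1, 0, 1, 1, 0, 1]))  # 6 data bits stored in 8 constrained cells

The rate cost is visible: 8 cells now carry about log2(114), roughly 6.8 bits, instead of 8.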

Ph.D. Seminar. Advisor: Prof. Yitzhak Birk.

Bio:
Amit Berman is a Ph.D. candidate at the Electrical Engineering Department, Technion – Israel Institute of Technology.
He received his B.Sc. and M.Sc. in Electrical Engineering in 2006 and 2010, respectively. He is a recipient of several awards,
including Hershel Rich Technion Innovation award, Intel research award, and HPI fellowship.
He held engineering and management positions at Intel and Saifun (acquired by Spansion) during 2003-2009.
He authored several international publications in the field of non-volatile memory and computer architecture and holds several pending US patents.

1/5/2013, EE: Meyer 1061

Optimizing Traffic Engineering on the Internet

Dr. Michael Schapira

School of Computer Science and Engineering, the Hebrew
University of Jerusalem

New types of applications and services (from video streaming to
cloud computing) are placing tremendous demands on today’s
networks. To tackle this challenge, network operators do
traffic engineering (TE), i.e., tune routing-protocol
parameters so as to use network resources efficiently. I
will present recent (algorithmic and experimental) results
regarding today’s prevalent TE technique (ECMP routing) and
also novel frameworks for TE (e.g., for TE with migration).
I will discuss interesting implications of these results (for
data center networks, ISP networks, and more).

Joint work with Eric Keller and Jennifer Rexford (2012),
with Avinatan Hassidim and Haim Kaplan (2013), and with
Marco Chiesa (2013)

Bio:
Dr. Michael Schapira is a Senior Lecturer (Assistant
Professor) at the School of Computer Science and Engineering,
the Hebrew University of Jerusalem.

His research draws ideas from theory (e.g., from algorithmics,
distributed computing, and economics) to design and analyze
(Inter)networking protocols (e.g., for routing, congestion
control, and traffic engineering). Prior to joining the Hebrew
University, Dr. Schapira worked at Google NYC, as part of the
Infrastructure Networking Group. He was also a postdoctoral
researcher at UC Berkeley, Yale University and Princeton
University. Dr. Schapira holds a B.Sc. in Mathematics and
Computer Science, a B.A. in Humanities, and a Ph.D. in Computer
Science, all from the Hebrew University (awarded in 2004, 2004,
and 2008, respectively). He is the recipient of the Allon
Fellowship for Outstanding Young Researchers (2011).

24/4/2013, CS: Taub 337

To Zip or not to Zip: Effective Resource Usage for Real-Time Compression

Dr. Danny Harnik

Storage Systems Group, IBM Haifa Lab (Tel Aviv)

Real-time compression for primary storage is quickly becoming
widespread as data continues to grow exponentially, but adding
compression on the data path consumes scarce CPU and memory
resources on the storage system. In this talk I’ll present
different approaches to efficient estimation of the potential
compression ratio of data and how these methods can be applied
in advanced storage systems. Our work targets two granularities:
macro-scale estimation, which is backed by analytical
proofs of accuracy, and micro-scale estimation, which is heuristic.
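The macro-scale approach can be sketched in a few lines of Python: compress a small random sample of fixed-size blocks and extrapolate (block size and sample count here are illustrative; the analytical proofs mentioned above bound the sampling error of exactly this kind of estimate):

    import random, zlib

    def estimate_ratio(data, block=4096, samples=64):
        """Estimate compressed/original ratio from a random sample of blocks."""
        offsets = [random.randrange(0, max(1, len(data) - block))
                   for _ in range(samples)]
        ratios = [len(zlib.compress(data[o:o + block])) / block for o in offsets]
        return sum(ratios) / len(ratios)

    # Half highly compressible, half random: the estimate lands near the truth.
    data = b"A" * 200_000 + bytes(random.getrandbits(8) for _ in range(200_000))
    print(f"estimated {estimate_ratio(data):.2f}, "
          f"actual {len(zlib.compress(data)) / len(data):.2f}")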

Based on joint work with Oded Margalit, Ronen Kat, Dmitry
Sotnikov and Avishay Traeger, from IBM Research – Haifa. This
work has recently appeared in FAST 2013.

Bio:
Danny is a researcher in the Storage Systems group at the IBM Haifa
Research Labs. He completed his PhD at the Weizmann Institute
(2006) and before joining IBM spent two years as a Postdoc at
the Technion and UCLA/IPAM. His main research interests are
storage systems, data compression, security and cryptography.

17/4/2013, CS: Taub 337

RFID Security is Good for the Grocer

Yossi Oren

Cryptography and Network Security Lab, Tel-Aviv University

Radio-Frequency Identification (RFID) tags based on the
Electronic Product Code (EPC) standard have been aggressively
introduced into the global supply chain. These tags were
designed as an upgrade to the familiar 14-digit Universal
Product Code (UPC) barcode. Due to the low cost and limited
energy budget available to these tags, it was traditionally
considered impractical to apply any sort of cryptographic
protection to these tags.

In this talk I will show how, contrary to previous claims
of impracticality, EPC tags are capable of performing
full-strength public-key encryption. The crucial element in
our system is the WIPR encryption scheme, which enjoys a
remarkably low resource footprint. We show how to use WIPR
to provide high-security anti-counterfeiting functionality
which respects user privacy, and show how WIPR tags and
standard EPC tags can coexist and share the same infrastructure.

Joint work with Alex Arbit and Avishai Wool.

Bio:
Yossi Oren is a Ph.D. student at the Computer Network and
Security Lab at Tel-Aviv University. His research interests are:

  • Secure Hardware: Power analysis and other hardware attacks
    and countermeasures on cryptographic devices; Low-resource
    cryptographic constructions for lightweight computers such as
    RFID tags
  • Cryptography in the real world: Consumer and voter privacy
    in the digital era; Web application security

10/4/2013, EE: Meyer 861

Towards Building a Practical Privacy-Preserving Recommender System

Dr. Udi Weinsberg

Technicolor Research, Palo Alto

Many online services, such as recommender systems, email, and social networks
collect user data, which is then used for both personalization and monetization.
Although the latter enables services to be free, users are realizing that these services come at
a hidden cost of potentially exposing their private data.

In this talk I will show that even the common 5-star item-rating recommender system leaks private demographic information.
Then, I will discuss methods for helping users preserve their privacy while getting accurate recommendations.
Finally, a building block of many recommender systems, and an important machine-learning algorithm on its own, is linear regression.
I will present a system that learns a linear model without learning anything about the private input data other than the output model.

Bio:
Udi Weinsberg is a researcher and associate fellow at Technicolor Research in Palo Alto, CA.
Udi received his PhD (2011) and M.Sc (2007) from the school of electrical engineering at Tel-Aviv University, Israel,
and his B.Sc (2001) from the Technion, Haifa, Israel.
His research focuses on the intersection of user profiling, privacy, and recommender systems.
In addition, he studies large-scale networks, content placement, and the topology of the Internet.

3/4/2013, EE: Meyer 861

Exploit Mitigation: From Detection to Obstruction

Dr. Gal Badishi

Cyvera Ltd., Cyber Defense Solutions

Recent attacks on high-value targets demonstrate how state-of-the-art
defenses fail to protect against APTs (Advanced Persistent Threats).
These victims spare no expense and appropriately deploy cutting-edge
defenses, such as firewalls, intrusion detection and prevention systems,
and anti-virus scanners, as well as novel approaches for detecting
zero-day exploits – yet these are ineffective at thwarting determined
attackers.

In this talk we examine the unsatisfying state of attack-prevention
solutions, as well as demonstrate the ease of circumventing the
majority of defenses.

We move on to present a fresh security paradigm: extensive obstruction
of attacks, rather than an attempt to identify and detect malicious
behaviors and attack-related actions, often after the fact. In combining
methods such as traps in heap memory and DLL protection, with
enhancements to solutions such as Data Execution Prevention (DEP) and Address
Space Layout Randomization (ASLR), we achieve nearly perfect
exploit-prevention rates, even for zero-day exploits.

Further, we’ll discuss the challenges in transforming these mitigation
techniques into a commercial-grade product with security modules that
can be applied generically to every process.

Bio:
Gal Badishi is the Chief Scientist of Cyvera, a VC-backed startup
providing innovative cyber-defense solutions. Gal is a hands-on security
researcher, specializing in software vulnerabilities, exploitation
techniques, and exploit-mitigation. He received his B.Sc in Computer
Science from the Hebrew University in Jerusalem in 2000, and his Ph.D.
from the Department of Electrical Engineering at the Technion, Israel’s
Institute of Technology, in 2007. Gal has contributed to the Israeli Cyber
Intelligence community and acted as a consultant to the IDF’s Cyber
Headquarters.

13/2/2013, EE: Meyer 861

Reliability and Efficiency in Cloud Storage

Ittay Eyal

Electrical Engineering Department, Technion

Advances in data center technologies have led to extensive use of data centers
to store large volumes of data in a managed distributed system. The users of
such systems have increasing expectations of both efficiency and reliability.

Recent years have shown that even the largest cloud storage providers occasionally
fail, and users have to replicate data among multiple providers to obtain
reliability. However, classical replication techniques (e.g., ABD) are not
applicable here, since storage services typically export only a key-value store
(KVS) interface, with functions for storing and retrieving values associated with
unique keys. In the first part of this talk, we present the first efficient,
wait-free algorithm that emulates multi-reader, multi-writer reliable storage from
a set of potentially faulty KVS replicas in an asynchronous environment. We
implemented the algorithm and tested it using real-world providers, and we
demonstrate through simulation its low overhead in terms of space and speed in
various scenarios.

In the second part of the talk we introduce ACID-Rain: ACID transactions in a
Resilient Archive with Independent Nodes. ACID-Rain is a novel architecture for
efficiently implementing transactions in a distributed data store – a sought-after
but difficult to achieve capability. ACID-Rain uses logs in a novel way, limiting
reliability to a single tier of the system. A set of independent nodes form an outer
layer that caches the data, backed by a set of independent reliable log services.
ACID-Rain avoids collisions between transactions by using prediction to order the
transactions before they take actions that would lead to an abort. ACID-Rain is
efficient in overcoming failures, and its throughput increases linearly with the
number of nodes.

Ph.D. Seminar, under the supervision of Prof. Raphi Rom and Prof. Idit Keidar.

Bio:
Ittay Eyal is a Ph.D. candidate in the department of Electrical Engineering under
the supervision of Prof. Raphi Rom and Prof. Idit Keidar. Ittay received his B.Sc.
from the Technion in 2007, and did internships with IBM, Yahoo! and Cornell University.
His main research interests are distributed storage and reliability in large scale
systems, and robust aggregation in sensor networks.

23/1/2013, CS: Taub 3

New Bounds for Renaming and WSB

Dr. Armando Castaneda

Computer Science Department, Technion

In a distributed task, the processes of a distributed system start with
private input values, communicate with each other through a medium, and then
irrevocably decide an output value.
In the M-renaming task, each process of the system starts with a
distinct input name
and must decide a distinct output name in the range [1, …, M].
The renaming task is at the core of the theory of distributed computing.

Several papers claimed that, in an asynchronous distributed system
in which processes communicate through a shared memory,
M = 2n-1 is a tight lower bound for renaming, namely,
there is an algorithm that solves M-renaming if and only if M >= 2n-1,
where n denotes the number of processes.

In this talk we will see that the renaming lower bound above is incomplete.
There are values of n for which M = 2n-1 is indeed a tight lower bound.
However, for the other (infinitely many) values of n, there exists
an algorithm that solves M-renaming with M = 2n-2.
The solvability of (2n-2)-renaming is closely related to the existence of
certain topological objects in some dimensions.

Bio:
Armando Castaneda received his PhD from the National Autonomous University of Mexico
under the supervision of Prof. Sergio Rajsbaum.
He joined the Department of Computer Science of the Technion in July of 2012.
His research interest lies in the theory of distributed systems, particularly,
in understanding the principles of distributed computing.

16/1/2013, EE: Meyer 861

Quantitative Analysis of the Full Bitcoin Transaction Graph

Dr. Dorit Ron

Department of Computer Science and Applied Mathematics, Weizmann Institute of Science

The Bitcoin scheme is a rare example of a large scale global payment
system in which all the transactions are publicly accessible (but in
an anonymous way). We downloaded the full history of this scheme, and
analyzed many statistical properties of its associated transaction graph.
In this paper we answer for the first time a variety of interesting
questions about the typical behavior of users, how they acquire and how
they spend their bitcoins, the balance of bitcoins they keep in their
accounts, and how they move bitcoins between their various accounts in
order to better protect their privacy. In addition, we isolated all the
large transactions in the system, and discovered that almost all of them
are closely related to a single large transaction that took place in
November 2010, even though the associated users apparently tried to hide
this fact with many strange looking long chains and fork-merge structures
in the transaction graph.

Joint work with Prof. Adi Shamir

Bio:
Dorit Ron has been a staff scientist at the Weizmann Institute of Science since
2003. From 1991 she was a postdoc and a research fellow with Prof. Achi
Brandt. She received her Ph.D. and M.Sc. from the Weizmann Institute and
her B.Sc. from the Technion. She was a researcher for a year at the
University of Colorado and for one year a professor in the department of
mathematics at the University of California at Berkeley.

Her research interest is in developing linear-time algorithms, mostly based
on multilevel techniques, for scientific computations in various areas:
statistical physics and Monte Carlo simulations, hard optimization problems
of either discrete or continuous nature, graph problems, and robotics.

9/1/2013, CS: Taub 337

An Intent-based Approach for Network Virtualization

Dr. Rami Cohen

IBM Research Lab in Haifa

Virtualizing resources for easy pooling and accounting, as well as for rapid
provisioning and release, is essential for the effective management of modern
data centers. Although the compute and storage resources can be virtualized
quite effectively, a comprehensive solution for network virtualization has yet
to be developed. Our analysis of the requirements for a comprehensive network
virtualization solution identified two complimentary steps of ultimate
importance. One is specifying the network-related requirements, another is
carrying out the requirements of multiple independent tenants in an efficient
and scalable manner.

We introduce a novel intent-based modeling abstraction for specifying the
network as a policy governed service and present an efficient network
virtualization architecture, Distributed Overlay Virtual nEtwork (DOVE),
realizing the proposed abstraction. We describe a working prototype of the DOVE
architecture and report the results of an extensive simulation-based
performance study, demonstrating the scalability and efficiency of the
solution.

Bio:
Dr. Rami Cohen received his Ph.D. and M.Sc. from the Technion, Israel Institute
of Technology, in 2007 and 2005, respectively. With vast experience in leading
network and system architectures at several major companies, he is currently a
research staff member in IBM Research Lab in Haifa. His main research interests
include network virtualization, SDN, network architectures, cloud computing, and
optimization algorithms of network related problems.

2/1/2013, CS: Taub 337

The SkipTrie: Low-Depth Concurrent Search without Rebalancing

Dr. Rotem Oshman

Computer Science Department, University of Toronto

To date, all concurrent search structures that can support predecessor
queries have had depth logarithmic in m, the number of elements. This
work introduces the SkipTrie, a new concurrent search structure supporting
predecessor queries in amortized expected O(log log u + c) steps, insertions
and deletions in O(c log log u) steps, and using O(m) space, where u is the size
of the key space and c is the maximum point contention during the operation.

The SkipTrie is a probabilistically-balanced version of a y-fast trie
consisting of a very shallow skiplist from which randomly chosen elements
are inserted into a hash-table-based x-fast trie. By inserting keys into
the x-fast trie probabilistically, we eliminate the need for rebalancing,
and can provide a lock-free linearizable implementation.

Bio:
Rotem Oshman is a post-doctoral fellow in the Computer Science Department
at the University of Toronto. She received her PhD in Computer Science from
MIT in 2012, and her MSc and BA in Computer Science from the Technion. Her
research interests include distributed computing, communication complexity
and formal methods, with emphasis on where these fields intersect and inform
each other.

26/12/2012, EE: Meyer 1061

Energy-aware scheduling for heterogeneous multi-core architectures

Prof. Daniel Mosse

CS, University of Pittsburgh

The current trend to move from homogeneous to heterogeneous multi-core
(HetCMP) systems promises further performance and energy-efficiency
benefits. A typical HetCMP system includes two distinct types of cores,
such as high performance sophisticated (“large”) cores and simple
low-power (“small”) cores. In those heterogeneous platforms, execution
phases of application threads that are CPU-intensive can take best
advantage of large cores, whereas I/O or memory intensive execution
phases are best suited and assigned to small cores. However, it is
crucial that the assignment of threads to cores satisfy both the
computational and memory bandwidth constraints of the threads.

In this talk we will present current work on scheduling and allocation
of threads in HetCMP systems, with soft real-time and with no real-time constraints,
as well as some open problems in the field.

Bio:
Daniel Mosse is Professor and Chair of the Computer Science Department at the University
of Pittsburgh. The current major thrusts of his research are real-time and embedded systems,
power management issues, and networks (wireless and security), bridging the gap between
the operating systems and networking research fields. He has published approximately 200
papers on these topics. Typically funded by NSF and DARPA, his projects combine
theoretical results and actual implementations. He received a BS in Mathematics from the
University of Brasilia in 1986, and MS and PhD degrees in Computer Science from the
University of Maryland in 1990 and 1993, respectively. Dr. Mosse received the Provost’s
Innovation in Education Grant/Award in 2007 for redesigning the Introductory Programming
course for non-majors. He received the Tina and David Bellet Teaching Excellence Award in
2006 (One of two among over 500 faculty members in the School of Arts and Sciences). Dr.
Mosse has served on PCs and as PC chair for most major IEEE- and ACM-sponsored real-time
conferences. In 2011, he was co-chair of the International Green Computing Conference and
General Co-Chair for the International conference on Embedded and Real-Time Computing
Systems.

19/12/2012, EE: Meyer 861

Predicting Execution Bottlenecks in Map-Reduce Clusters

Dr. Edward Bortnikov

Yahoo! Labs Israel

Extremely slow, or straggler, tasks are a major performance bottleneck
in map-reduce systems. Hadoop infrastructure makes an effort to both
avoid them (through minimizing remote data accesses) and handle them in
the runtime (through speculative execution). However, the mechanisms in
place neither guarantee the avoidance of performance hotspots in task
scheduling, nor provide any easy way to tune the timely detection of
stragglers.

We suggest a machine-learning approach to address these problems, and
introduce a slowdown predictor – an oracle to forecast how much slower
a task will run on a given node, compared to similar tasks. Slowdown
predictors can be embedded in the map-reduce infrastructure to improve
the agility and timeliness of scheduling decisions. We provide initial
evaluation to demonstrate the viability of our approach, and discuss
the use cases for the new paradigm.
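As a sketch of what such an oracle might look like, the following Python snippet fits a linear slowdown model over made-up node features (the actual feature set and model used in the work are not specified in this abstract):

    import numpy as np

    # Hypothetical features per (task, node): CPU load, disk utilization,
    # fraction of the task's input that is node-local.
    X = np.array([[0.2, 0.1, 1.0],
                  [0.9, 0.7, 0.0],
                  [0.5, 0.3, 1.0],
                  [0.8, 0.9, 0.5]])
    y = np.array([1.0, 3.2, 1.3, 2.5])   # observed slowdown vs. similar tasks

    A = np.hstack([X, np.ones((len(X), 1))])   # least squares with a bias term
    w, *_ = np.linalg.lstsq(A, y, rcond=None)

    def predict_slowdown(features):
        return float(np.append(features, 1.0) @ w)

    # A scheduler could avoid nodes whose prediction marks a likely straggler.
    print(f"predicted slowdown: {predict_slowdown([0.85, 0.8, 0.2]):.2f}")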

Bio:
Edward Bortnikov is a Principal Research Engineer at Yahoo! Labs.
His interests broadly span large scale distributed systems, search,
big data analytics, and networking technologies. He published 20+
scientific papers in these areas. He received his PhD in Electrical
Engineering from the Technion – Israel Institute of Technology in 2008.
Prior to that, he worked in technical leadership positions at IBM
Research, Mellanox Technologies, SANgate Systems, and HP Labs.

28/11/2012, EE: Meyer 861

Optimal Rebuilding in Distributed Storage Systems

Zhiying Wang

Caltech

Distributed storage systems have become a popular solution for large-scale
file storage and fast data access. In such systems, erasure-correcting
codes are widely used to combat disk failures, where disks correspond
to symbols in the code. Specifically, we study MDS (maximum distance
separable) array codes which enable optimal storage and efficient
encoding and decoding algorithms. With r redundancy symbols an MDS
code can sustain r erasures. For example, consider an MDS code that
can correct two erasures. It is clear that when two symbols are
erased, one needs to access and transmit all the remaining information
to rebuild the erasures. However, an interesting and practical
question is: What is the smallest fraction of information that one
needs to access and transmit in order to correct a single erasure? In
this talk we will show that the lower bound of 1/2 is achievable and
that the result can be generalized to optimally rebuild an arbitrary
number of parities.

Bio:
Zhiying Wang received the B.Sc. degree in Information Electronics and
Engineering from Tsinghua University, Beijing, China, in 2007 and the M.Sc.
degree in Electrical Engineering from California Institute of
Technology, Pasadena, USA, in 2009. She is now a Ph.D. candidate with
the department of Electrical Engineering, California Institute of
Technology. Her research focuses on information theory and coding theory,
with an emphasis on coding for storage devices and systems.

21/11/2012, CS: Taub 337

CBTree: A Practical Concurrent Self-Adjusting Search Tree

Adam Morrison

School of Computer Science, Tel Aviv University

We present the CBTree, a new counting-based self-adjusting sequential search
tree that, like splay trees, moves more frequently accessed nodes closer to
the root: After M operations on N items, Q of which access some item V, an
operation on V traverses a path of length O(log M/Q) while performing few if
any rotations.

In contrast to the traditional self-adjusting splay tree in which each accessed
item is moved to the root through a sequence of tree rotations, the CBTree
performs rotations infrequently (an amortized subconstant o(1) per operation
if M >> N) mostly at the bottom of the tree. Therefore, CBTree gracefully
supports concurrency.
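The flavor of counting-based self-adjustment can be conveyed with a toy sequential sketch (a simplification for illustration, not the CBTree's actual rebalancing rule): each node counts its accesses, and an accessed item is rotated above its parent only while its count exceeds the parent's, so hot items drift toward the root while rotations become rare once counts stabilize.

    class Node:
        def __init__(self, key):
            self.key, self.count = key, 0
            self.left = self.right = None

    def rot_right(p):                  # lift p.left above p
        c = p.left
        p.left, c.right = c.right, p
        return c

    def rot_left(p):                   # lift p.right above p
        c = p.right
        p.right, c.left = c.left, p
        return c

    def insert(node, key):
        if node is None:
            return Node(key)
        if key < node.key:
            node.left = insert(node.left, key)
        elif key > node.key:
            node.right = insert(node.right, key)
        return node

    def access(node, key):
        """Bump the accessed node's counter; rotate a child above its
        parent only when the child's counter exceeds the parent's."""
        if node is None:
            return None
        if key == node.key:
            node.count += 1
        elif key < node.key:
            node.left = access(node.left, key)
            if node.left and node.left.count > node.count:
                node = rot_right(node)
        else:
            node.right = access(node.right, key)
            if node.right and node.right.count > node.count:
                node = rot_left(node)
        return node

    root = None
    for k in range(1, 16):
        root = insert(root, k)         # degenerate chain, depth 15
    for _ in range(20):
        root = access(root, 13)        # the hot key climbs toward the root
    print(root.key)                    # -> 13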

Joint work with: Yehuda Afek, Haim Kaplan, Boris Korenfeld, Robert E. Tarjan

Bio:
Adam Morrison is a Computer Science PhD student at Tel Aviv University, under
the supervision of Prof. Yehuda Afek. He works on fast and scalable algorithms
for multicore systems. He received his MSc and BSc in Computer Science Cum Laude
from Tel Aviv University. He is the recipient of an IBM PhD Fellowship.

7/11/2012, CS: Taub 337

The Future of the Telecommunication Industry: The New Horizon of Virtual
Telecommunication and Carrier Clouds

Dr. David Amzallag

Vice President and CTO, Virtual Telecommunication and Cloud Solutions at
Alcatel-Lucent

Wireless networks are the most important assets in our industry, yet
the problem is that we are building new networks the same way we built our
previous networks. Capacity is planned in advance, per service, to handle
peak traffic, and adding new capacity takes time and investment in dedicated
equipment. We have no elasticity as demand grows; on average the service
is grossly underutilized yet over-utilized at specific times of day or
geographic locations, and we can’t share capacity across service domains.
Engineering and maintaining this network complexity is time-consuming, costly,
and typically reactionary. To add to these complexities, throw in the need
to manage Wi-Fi networks and the growth of metro/small cells, and you
start to realize that delivering a compelling customer and network experience
is in fact impossible with the way we build today’s networks. Add to that the
pressure for profitability and we have the most challenging situation the
Telecommunication industry has been faced with in a long time.

In this talk an innovative re-thinking of the network, mobile as well as
fixed, will be introduced, characterized, among other things,
by the removal of the walls between network services and network
infrastructure, by a significant OPEX reduction, and by network elasticity
and scalability for answering unexpected demands. Moreover, we will show
why most Communication Service Providers worldwide, and Sprint in
particular, have decided to build their own Carrier Cloud Infrastructure
and how they are changing the way they operate and manage their own networks.

The near future of virtualized communication will consist of
combining networks and smart distributed cloud infrastructure (Carrier
Cloud Foundation) with many applications that will be delivered, as services,
on such foundations. Networks are to become software-driven, and network
elements are planned to be transformed so that they can take advantage of cloud
infrastructure that is part of both foundations and applications. The
networks of tomorrow that realize these capabilities will be flexible enough to
manage capacity in real time based on a variety of factors such as cost, customer
priority, and application characteristics in order to maximize customer
lifetime value and satisfaction. Building such a solution provides our industry
with smarter networks as well as the entire Networks/IT support systems
(e.g., OSS, BSS) that can be truly leveraged as a platform for creating real
differentiation.

Bio:
As the CTO for Virtual Telecommunications David Amzallag is responsible
for the design, development, business transformation and the delivery of
the entire Alcatel-Lucent’s vision for Cloud Solutions and Virtual
Telecommunication. Prior to Alcatel-Lucent, David served as the Chief
Technology Officer and an Executive Vice President at Amdocs. In this role
he was responsible, among others, for planning and developing Amdocs’
technology vision as well as Amdocs’ future strategy, architecture and
roadmap of the entire product portfolio. Prior to Amdocs David served as
the Chief Scientist of 21st Century Network (21CN) of BT, where he was
responsible for designing, planning, and optimizing solutions in BT
networks, architectures and infrastructures in more than 100 countries and
250 networks worldwide. David has over 18 years of experience in leading,
developing, and transforming technological strategies, technological
innovation, and advanced solutions in academia, start-up companies, and
corporations worldwide in different technical domains. David holds a Ph.D.
in Computer Science (Technion, Israel), specializing in coping with
computationally-hard optimization problems. David has published papers and
book chapters, and presented in many academic and commercial conferences.

30/10/2012, EE: Meyer 861

Self-Optimizing Microprocessors: A Machine Learning Approach

Prof. Jose F. Martinez

Cornell University

As each technology generation brings additional transistors, the computer industry
hopes to convert these into performance growth by stamping out a greater number of
cores on a die. On the one hand, in many environments, that seems like a lot of
hope. On the other hand, architecture researchers have grown almost allergic to
“complex” alternatives, which history has shown can quickly fall off the cliff of
diminishing returns.

A fundamental hurdle to bettering architectures may lie in the very perception of
complexity. Many past and current architectures are indeed complex, but often in
an unsophisticated way. Hardware designs tend to be very human-centric: decisions
are primarily taken by human beings at design time, based almost exclusively on
their experience and ability. Unsurprisingly, the operation of the adopted
mechanisms is typically confined to what a human can readily understand; and the
end product often falls short in potentially important capabilities, such as the
ability to plan ahead, to act successfully in previously unseen states, or to
improve automatically with experience.

At Cornell University, we are investigating architectures that possess and exploit
such capabilities, primarily by leveraging machine learning technology. Our
approach encourages the designer to focus more on the system variables and
constraints that may play a role in realizing a performance objective, rather than
formulating exactly how the hardware should accomplish such an objective. In this
way, we have, for example, devised self-optimizing memory controllers that
automatically adapt to changing software demands, delivering higher performance in
ways that may be unintuitive to a human. Our ultimate goal is better computers
through a more productive use of human ingenuity.
Bio:
Prof. Martinez earned MS (1999) and Ph.D. (2002) degrees in computer science from
the University of Illinois at Urbana-Champaign, where he returned in November of
last year to receive the department’s inaugural Distinguished Educator Award. He
is a senior member of the ACM and the IEEE, Acting Editor in Chief of IEEE
Computer Architecture Letters, as well as Associate Editor of ACM Transactions
on Architecture and Code Optimization.

4/7/2012, EE: Meyer 861

MinCopysets: Derandomizing Replication In Cloud Storage

Asaf Cidon

Stanford

Randomized node selection is widely used in large-scale,
distributed storage systems to both load balance chunks
of data across the cluster and select replica nodes to provide
data durability. We argue that while randomized
node selection is great for load balancing, it fails to protect
data under a common failure scenario. We present
MinCopysets, a simple, general-purpose and scalable
replication technique to improve data durability while retaining
the benefits of randomized load balancing. Our
contribution is to decouple the mechanisms used for
load balancing from data replication: we use randomized
node selection for load balancing but derandomize node
selection for data replication. We show that MinCopysets
provides significant improvements in data durability.
For example in a 1000 node cluster under a power outage
that kills 1% of the nodes, it reduces the probability
of data loss from 99.7% to 0.02% compared to random
replica selection. We implemented MinCopysets on two
open source distributed storage systems, RAMCloud and
HDFS, and demonstrate that MinCopysets causes negligible
overhead on normal storage operations.
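The core mechanism fits in a few lines of Python (sizes and names are illustrative): nodes are statically partitioned into replication groups of size R, the first replica is still chosen uniformly at random for load balance, and the remaining replicas are always the chosen node's whole group:

    import random

    R = 3                                   # replication factor
    nodes = list(range(999))                # 999 nodes -> 333 disjoint groups
    groups = [nodes[i:i + R] for i in range(0, len(nodes), R)]
    group_of = {n: g for g in groups for n in g}

    def place_chunk(chunk_id):
        primary = random.choice(nodes)      # random: spreads load evenly
        return group_of[primary]            # deterministic: few distinct copysets

    print(place_chunk("chunk-42"))

With fully random replication, almost every 3-node subset of the cluster eventually stores some chunk, so any simultaneous 3-node failure loses data; here only 333 replica sets exist, which is what drives the reported drop in loss probability.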

In collaboration with Ryan Stutsman, Stephen Rumble, Sachin Katti,
John Ousterhout and Mendel Rosenblum.

Bio:
Asaf Cidon is an Electrical Engineering PhD candidate at Stanford
University, where he conducts research on data center and mobile
systems under Professors Mendel Rosenblum and Sachin Katti. He
worked at Google Israel in the web search team and previously
served as a Product Manager for two start-ups. Prior to his
studies, Asaf served as a team leader in an elite unit in the
Israeli Intelligence Forces. He received his MS in Electrical
Engineering from Stanford and BSc in Computer and Software
Engineering Cum Laude from the Technion. He is a recipient of the
Stanford Graduate Fellowship and Sohnis Promising Scientist Award.

27/6/2012, CS: Taub 337

CORFU: A Shared-Log Design for Network-Attached Flash

Dr. Dahlia Malkhi

Microsoft Research, Silicon Valley

CORFU* is a novel storage cluster design that pools a farm of
flash units and exposes it to clients as a single, global
shared-log. CORFU utilizes flash to break the seeming tradeoff
between consistency and performance, providing both strong
consistency guarantees and high throughput. Our implementation
is carried out mostly by a client-side library, thus relieving the
service of any I/O bottlenecks; a CORFU cluster of 40 flash
units can drive roughly 1 million 4-KByte IOPS.
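The client-side flavor of the design can be suggested with a tiny Python sketch: the shared log is a sequence of positions, each mapped deterministically to a page on some flash unit, so clients read and write flash directly rather than funneling I/O through a server (the round-robin striping below is an illustrative guess, not CORFU's actual projection format):

    FLASH_UNITS = [f"flash-{i}" for i in range(40)]   # e.g., a 40-unit cluster

    def locate(position):
        """Map a global log position to (unit, page) by round-robin striping."""
        unit = FLASH_UNITS[position % len(FLASH_UNITS)]
        page = position // len(FLASH_UNITS)
        return unit, page

    # An appending client obtains the tail position (from a sequencer, in
    # CORFU) and then writes that page directly on the owning flash unit.
    for pos in (0, 1, 40, 81):
        print(pos, "->", locate(pos))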

A shared log is a powerful abstraction, which empowers
high-performance, in-memory distributed applications that
derive coordination and reliability from a common source. We
demonstrate this through TOFU*, a distributed key-value store
built over CORFU, which supports multi-key transactions.

Our client-heavy design relieves the need to place expensive
machines and controllers between flash and clients. To
demonstrate the feasibility of directly attached CORFU storage
units to a network, we built a prototype FPGA-based flash unit
which attaches directly to a 1Gbps NIC, and consumes an order of
magnitude less energy than a PC-attached SSD.

* CORFU stands for Clusters of Raw Flash Units; it is also the
name of a Greek resort island near Paxos.
* TOFU stands for Transactions over CORFU.
CORFU team: Mahesh Balakrishnan, John Davis, Dahlia Malkhi,
Vijayan Prabhakaran, Ted Wobber
Thanks: Phil Bernstein, Ken Eguro, Jeremy Elson, Ed Nightingale,
Colin Reid, Michael Wei, Ming Wu

Bio:
Dahlia Malkhi, PhD, is a principal researcher at Microsoft
Research, Silicon Valley.

Dr. Malkhi has worked on algorithmic aspects of distributed computing
and reliability since the early nineties. Prior to joining Microsoft
Research, Dr. Malkhi was an associate professor at the Hebrew
University of Jerusalem (1999-2007) and a senior researcher at AT&T
Bell-Labs (1995-1999). Dr. Malkhi holds a PhD (1994) in computer science
from the Hebrew University. She has been serving as associate editor of
the Distributed Computing Journal since 2002, she is co-chairing the
upcoming LADIS 2012, and in the past has chaired ACM PODC 2006 and DISC 2002
and co-chaired Locality 05 and Locality 07. Dr. Malkhi was elected
a fellow of the ACM in 2012; she is a winner of the IBM Faculty award
2003 and 2004; and a winner of the German-Israeli Foundation (G.I.F.)
Young Scientist award 2002.

20/6/2012, CS: Taub 337

Sharing Virtual Memory between CPU and GPU

Dr. Boris Ginzburg

Intel Institute of Computational Intelligence

This talk will discuss some system issues related to sharing
virtual memory between CPU and GPU, which brings new
challenges: GPU page faults, flushing a TLB on the GPU, etc. We
will also describe ESFIR – an SVM prototype – and XTHREAS – a
new SVM programming model that extends pthread semantics
to the GPU.

Bio:
Boris Ginsburg is an Intel principal engineer at the Intel
Institute of Computational Intelligence. He completed
his Ph.D. in Applied Math at the Technion in 1997, after which he
joined Intel. During the 15 years since, he has worked on CPU
architecture, wireless communication, and formal property
verification.

13/6/2012, EE: Meyer 1061

Improved Bounds for Byzantine Self-Stabilizing Clock Synchronization

Dr. Christoph Lenzen

Weizmann Institute of Science

The challenging task of Byzantine self-stabilizing pulse synchronization requires that, in the
presence of a minority of nodes that are permanently maliciously faulty, the non-faulty nodes must
start firing synchronized pulses in a regular fashion after a finite amount of time, regardless of the
initial state of the system. We study this problem under full connectivity in a model where nodes
have local clocks of unknown, but bounded drift, and messages are delayed for an unknown, but
bounded time.

We present a generic scheme that, given a synchronous consensus protocol P, defines a self-stabilizing
pulse synchronization algorithm A(P). If P terminates within R rounds (deterministically or with high
probability), A(P) stabilizes within O(R) time (deterministically or with high probability, respectively).
Utilizing different consensus routines, our transformation yields various improvements over previous
techniques in terms of stabilization time and bit complexity. Finally, we sketch how to establish the
abstraction of synchronous, integer-labeled rounds on top of pulse synchronization, at essentially the
same complexity bounds.

We will discuss our approach and its merits assuming no previous knowledge of the problem; however,
basic familiarity with consensus will be beneficial.

Bio:
Christoph Lenzen received a diploma degree in Mathematics from the University of Bonn, Germany,
and subsequently performed his graduate studies in Distributed Computing in the group of Professor Roger
Wattenhofer at ETH Zurich. In 2011, he was a postdoctoral Fellow at the Hebrew University of Jerusalem, with
Danny Dolev. Currently, he is a postdoctoral fellow at the Weizmann Institute of Science, with Professor David
Peleg. His research interests cover distributed computing in a wider sense, including topics such as randomized
load balancing, graph algorithms, and clock synchronization. He has published at venues including PODC, SPAA, FOCS, and STOC,
and in JACM. In 2009, he and his coauthors received the PODC best paper award for their work on gradient
clock synchronization.

30/5/2012, EE: Meyer 861

Cells: A Virtual Mobile Smartphone Architecture

Dr. Oren Laadan

Cellrox

Smartphones are increasingly ubiquitous, and many users carry multiple
phones to accommodate work, personal, and geographic mobility needs.
We present Cells, a virtualization architecture for enabling multiple
virtual smartphones to run simultaneously on the same physical
cellphone in an isolated, secure manner. Cells introduces a usage
model of having one foreground virtual phone and multiple background
virtual phones. This model enables a new device namespace mechanism
and novel device proxies that integrate with lightweight operating
system virtualization to multiplex phone hardware across multiple
virtual phones while providing native hardware device performance.
Cells virtual phone features
include fully accelerated 3D graphics, complete power management
features, and full telephony functionality with separately assignable
telephone numbers and caller ID support. We have implemented a
prototype of Cells that supports multiple Android virtual phones on
the same phone. Our performance results demonstrate that Cells
imposes only modest runtime and memory overhead, works seamlessly
across multiple hardware devices including Google Nexus 1 and Nexus S
phones, and transparently runs Android applications at native speed
without any modifications.

Bio:
Dr. Oren Laadan is the CTO and co-founder of Cellrox
(http://www.cellrox.com), a startup company that provides the next
generation virtualization technology for mobile devices to create
multi-persona solutions on smartphones and tablets. A graduate of the
Israel Defense Forces elite “Talpiot” program, Oren pioneered R&D of
nascent technologies in cloud computing during his 18-year
professional tenure in the fields of operating systems and
virtualization. Prior to co-founding Cellrox, he was a researcher at
Columbia University with a focus on computer systems, broadly defined,
including virtualization, operating systems, cloud computing,
security, reliability, and mobile computing. Oren is also a lead
developer of the Linux Checkpoint-Restart (linux-cr) project, which is
based in part on his research on operating system virtualization and
application checkpoint-restart. Previously, he was a core developer of
MOSIX for Linux, an extension of the Linux kernel for single-system
image clustering and automatic load balancing. Oren holds a Ph.D. in
Computer Science from Columbia University as well as an M.Sc. in
Computer Science and a B.Sc. in Physics and Mathematics from Hebrew
University.

23/5/2012, 14:30, EE: Meyer 1003

Dr. Philip M. Merlin Memorial Lecture and Prize Award
Scheduling Algorithms for the 4G Cellular Networks

Prof. Reuven Cohen

Computer Science Department, Technion

Cellular networks are becoming more and more crucial to our daily
lives and operators are seeking new technologies for increasing their
bandwidth. Examples for such technologies are cell sectorization,
fractional frequency reuse, and coordinated multipoint Tx/Rx. To take
advantage of these new technologies, the scheduler logic at the base
station needs to determine not only when to transmit each packet but
also what modulation and coding scheme to use, which frequency reuse
area’s resources, and by which transmitting antenna. In this talk, I
will discuss all these modern scheduling problems and present
algorithms for solving them.

Bio:
Reuven Cohen
received the Ph.D. degree from the Technion in 1991.
Since 1993, he has been a
professor in the Department of Computer Science at the Technion, working on
protocols and architectures for computer networks.
He has served as an editor of the
IEEE/ACM Transactions on Networking and the
ACM/Kluwer Journal on Wireless Networks (WINET). He was
the technical program co-chair of IEEE Infocom 2010.

16/5/2012, CS: Taub 337

Dynamic Reconfiguration of Primary/Backup Clusters

Dr. Alex Shraer

Yahoo! Research

Dynamically changing (reconfiguring) the membership of a replicated
distributed system while preserving data consistency and system
availability is a challenging problem. In this talk I will discuss
this problem in the context of Primary/Backup clusters and
Apache Zookeeper. Zookeeper is an open source system which enables
highly reliable distributed coordination. It is widely used in
industry, for example at Yahoo!, Facebook, Twitter, VMware, Box,
Cloudera, Mapr, UBS, Goldman Sachs, Nicira, Netflix and many others. A
common use-case of Zookeeper is to dynamically maintain membership and
other configuration metadata for its users. Zookeeper itself is a
replicated distributed system. Unfortunately, the membership and all
other configuration parameters of Zookeeper are static – they’re
loaded during boot and cannot be altered. Operators resort to
“rolling restart” – a manually intensive and error-prone method of
changing the configuration that has caused data loss and inconsistency
in production. Automatic reconfiguration functionality has been
requested by operators since 2008. Several previous proposals were
found incorrect and rejected. We designed and implemented a new
reconfiguration protocol in Zookeeper and are currently integrating it
into the codebase. It fully automates configuration changes: the set
of Zookeeper servers, their roles, addresses, etc. can be changed
dynamically, without service interruption and while maintaining data
consistency. By leveraging the properties already provided by
Zookeeper, our protocol is considerably simpler than the state of the art
in reconfiguration protocols. Our protocol also encompasses the
clients — clients are rebalanced across servers in the new
configuration, while keeping the extent of migration proportional to
the change in membership.

This is a joint work with Benjamin Reed (Yahoo!), Dahlia Malkhi (MSR)
and Flavio Junqueira (Yahoo!). A paper describing this work will
appear in the 2012 Usenix ATC conference and in the 2012 Hadoop
Summit. http://www.cs.technion.ac.il/~shralex/zkreconfig.pdf

Bio:
Alexander (Alex) Shraer is a Research Scientist at Yahoo!. His
research interests include large-scale and dynamic distributed
systems, various aspects of fault-tolerance, secure cloud storage,
publish-subscribe systems, etc. Alex received his B.Sc. (2004, summa
cum laude) and M.Sc. (2006, cum laude) degrees in Computer Science and
PhD in Electrical Engineering (2010) from the Technion, Israel. He is
a recipient of the Levi Eshkol 3-year PhD fellowship, awarded by the
Israeli Ministry of Science.

9/5/2012, EE: Meyer 861

Self-Adjusting Networks and Distributed Data Structures

Dr. Chen Avin

Department of Communication Systems Engineering, Ben Gurion University

We study self-optimizing networks and data structures. The goal of this research line is
the design of fully distributed algorithms which flexibly adapt the network to a dynamic
environment such as changing demands or traffic patterns. The idea of self-adjusting networks
is motivated by trends in today’s Internet like large data centers and peer-to-peer networks.

In this talk I will present the general model of the problem and some initial theoretical
results on data-center placement and splay networks, and the connection to well-known
concepts such as minimum linear arrangement, splay trees, and entropy.

Joint work with: Michael Borokhovich (BGU), Bernhard Haeupler (MIT), Zvi Lotker (BGU),
Christian Scheideler (U. Paderborn) and Stefan Schmid (T-Labs).

Bio:
Dr. Chen Avin received the B.Sc. degree in Communication Systems Engineering from Ben Gurion University,
Israel, in 2000. He received the M.S. and Ph.D. degrees in computer science from the University of
California, Los Angeles (UCLA) in 2003 and 2006, respectively. He is a senior lecturer in the Department
of Communication Systems Engineering at Ben Gurion University, which he joined in October 2006.
His current research interests are: graph and network algorithms, modeling and analysis with emphasis
on wireless networks, complex systems, and randomized algorithms for networking.

18/4/2012, CS: Taub 337

Green Cellular Networks: A Survey, Some Research Issues
and Challenges

Vijay K. Bhargava

University of British Columbia and President, IEEE Communications Society

In this talk, we present techniques to enable green
communications in future generation of wireless systems
that will rely on cooperation and cognition to meet
increasing demand of high data rate. So far, achieving
high data rate has been the primary focus of research in
cooperative and CR systems, without much consideration of
energy efficiency. However, many of these techniques
significantly increase system complexity and energy
consumption. Escalating energy costs and environmental
concerns have already created an urgent need for more
energy-efficient “green” wireless
communications. Therefore, we need to design
energy-efficient solutions for cooperative and cognitive
networks, which will potentially drive the future
generation of wireless communication. We focus on several
important topics that are crucial towards reducing the
energy consumption of the cognitive and cooperative
networks. These topics include efficient base station
redesign, heterogeneous network deployment, green
communications via cognitive radio, cooperative relays to
deliver green communications, and energy efficient
cognitive cooperative networks.

Bio:
Vijay Bhargava, an IEEE Volunteer for three
decades, is Professor in the Department of Electrical and
Computer Engineering at the University of British Columbia
in Vancouver, where he served as Department Head during
2003-2008. Previously he was with the University of
Victoria (1984-2003) and Concordia University
(1976-84). He received his Ph.D. from Queen’s University
in 1974. He is a fellow of the IEEE, the Royal Society of
Canada, the Canadian Academy of Engineering and the
Engineering Institute of Canada.

Vijay has served on the Board of Governors of the IEEE
Information Theory Society and the IEEE Communications
Society. He has held important positions in these
societies and has organized conferences such as ISIT’83,
ISIT’95, ICC’99 and VTC 2002 Fall. He played a major role
in the creation of the IEEE Transactions on Wireless
Communications, and served as its editor-in-chief during
2007, 2008 and 2009. He is a past President of the IEEE
Information Theory Society and is currently serving as the
President of the IEEE Communications Society.

28/3/2012, CS: Taub 401

Improving Virtualization Performance Through Physical
Transparency

Nadav Amit

Technion CS

Machine virtualization, where many virtual machines run on
a single physical machine, has become an increasingly
popular area of research and development in the last
decade. Despite the introduction of hardware support for
machine virtualization in commodity servers, many
workloads still suffer from degraded performance when run
in virtual machines. This degradation can be particularly
acute when an unmodified virtual machine — a VM that is
not aware it is running in a virtual environment — is
used to run the workload. The degradation is caused
primarily by the lack of physical hardware transparency:
instead of exposing the underlying physical hardware
transparently to virtual machines, the hypervisor — the
software layer that controls the virtual machines —
usually exposes hardware “abstractions” instead. This talk
will present results from two recent projects that
improved VM performance by increasing physical
transparency: the first increased device emulation
throughput threefold using sidecore emulation, and the
second improved I/O throughput by 30%-60% using direct
interrupt delivery to VMs.

Joint work with Muli Ben-Yehuda, Abel Gordon, Nadav
Har’El, Alex Landau, Assaf Schuster, and Dan Tsafrir.

Bio: Nadav Amit is currently a PhD student in the
Technion’s computer science faculty, conducting research in
the area of machine virtualization under the supervision of
Prof. Assaf Schuster. Before coming to the Technion, he spent
five years in industry, working at Intel on the development of a
random instruction test generator for the validation of
virtualization and power management features. He is the
recipient of the prestigious IBM Fellowship for PhD students.

19/3/2012, 15:30, EE: Meyer 861

PCI Express: Technology, Road Map and Design Challenges

Rick Eads

Agilent Technologies; Member of PCI-SIG board of directors

PCI Express is an industry standard, and the most prominent
interconnection architecture for I/O devices and other boards inside a
computer. It is defined and updated by the PCI-SIG(R) industry standards
body, originally formed in 1992 as the Peripheral Component Interconnect
(PCI) special interest group (SIG). Agilent Technologies, a prominent
developer and provider of test equipment at all levels yet not a
competitor in the computer market, has played a key role in the
advancement of PCI Express as a high performance yet extremely reliable
interconnection scheme with very good interoperability among
different-vendor equipment.

This talk will provide the following: an overview of PCI Express; PCI
Express design and debugging challenges; Agilent tools for PCI Express
development (physical layer, signal integrity and protocol validation);
PCI Express electrical and protocol testing; and a live demo of testing
solutions.

Bio:
Rick Eads, a senior program manager for PCI Express tools at Agilent
Technologies, serves as Agilent’s PCI Express Technology lead. Since 2007,
he has served on the board of directors of PCI-SIG.

8/2/2012, EE: Meyer 861

Synthesizing Concurrent Relational Data Structures

Dr. Roman Manevich

UT Austin

Efficient concurrent data structures are extremely important for obtaining
good performance for most parallel programs. However, ensuring the
correctness of concurrent data structure implementations can be very tricky
because of concurrency bugs such as race conditions and deadlocks. In systems
that use optimistic parallel
execution such as boosted transactional memory systems and the Galois system,
the implementation of concurrent data structures is even more complex because
data structure implementations must also detect conflicts between concurrent
activities and support the rollback of conflicting activities.

At present, these types of concurrent data structures are implemented manually
by expert programmers who write explicitly parallel code packaged into
libraries for use by application programmers. This solution has its limitations;
for example, it does not permit the customization or tuning of a data structure
implementation for a particular application.

In this talk, we present Autograph, which is the first concurrent data structure
compiler that can synthesize concurrent relational data structure
implementations for use by application programmers. The input to Autograph is a
high-level declarative specification of an abstract data type (ADT); the output
is a concurrent implementation of that ADT with conflict detection and rollback
baked in. Our synthesizer is parameterized by a set of data structures called
“tiles”, which are building blocks that the compiler composes to create the
overall data structure. Application programmers can use a simple expression
language to tune the composition of these tiles, thereby exercising high level,
fine-grain control of data structure implementations.
We have used Autograph to synthesize concurrent sparse graph data structures for
a number of complex parallel graph benchmarks. Our results show that the
synthesized concurrent data structures usually perform better than the handwritten
ones; for some applications and thread counts, they improve performance by a
factor of 2.
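
Purely to give a flavor of tile composition (the snippet below is a hypothetical illustration, not Autograph's actual specification language or generated code, and it omits all concurrency), one can picture an implementation as a composition of container tiles chosen per access pattern:

    # Two interchangeable "tiles" realize the same (src, dst) relation;
    # swapping 'set' for 'list' is the kind of tuning the expression
    # language exposes to application programmers.
    TILES = {'hashmap': dict, 'list': list, 'set': set}

    def compose(index_tile, bucket_tile):
        structure = TILES[index_tile]()
        def insert(src, dst):
            bucket = structure.setdefault(src, TILES[bucket_tile]())
            bucket.append(dst) if bucket_tile == 'list' else bucket.add(dst)
        return structure, insert

    graph, insert = compose('hashmap', 'set')   # a tunable composition
    insert(1, 2); insert(1, 3)
    print(graph)  # {1: {2, 3}}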

Bio:
Roman Manevich is a postdoctoral fellow at the University of Texas at Austin.
Roman’s research interests are static-analysis-based verification of sequential and concurrent
programs that manipulate dynamically allocated (heap) memory, and the
synthesis of sequential and concurrent programs.

He obtained a B.Sc. Cum Laude in Computer Science in 2001, an M.Sc. Cum Laude in
Computer Science in 2003, and a Ph.D. in Computer Science in 2009, all from Tel Aviv
University. His doctoral dissertation concerns the problem of static analysis of
heap-manipulating programs for an unbounded number of objects and parallel processes.
In 2009 he conducted post-doc research at UCLA on verifying transactional memories
for an unbounded number of threads. He was named to the Dean’s list in 2000 as an
outstanding undergraduate student and received an outstanding graduate student award
in 2005. He was awarded the Clore Fellowship for outstanding Ph.D. students in
Israel for 2005-2008.

Roman worked at i-Logix (now part of IBM) from 2001 to 2003 on adding automatic web connectivity to embedded devices for UML models using code generation. He has interned at MSR Redmond, MSR India, and IBM Watson, developing efficient static analysis techniques.

18/1/2012, EE: Meyer 861

Distributed Average Consensus and Coherence in Dynamic Networks

Dr. Stacy Patterson

Technion EE

In the distributed average consensus problem, each node in a network has an
initial value, and the objective is for all nodes to reach consensus at the
average of these values using only communication with nearby nodes.
Distributed average consensus algorithms have a wide variety of applications,
including distributed optimization, sensor fusion, load balancing, and
autonomous vehicle formation control.
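
As a concrete illustration of the setting (added here for intuition, not taken from the talk), a minimal Python sketch of the classic synchronous averaging iteration, in which every node repeatedly moves toward the average of its neighbors; the ring topology and step size eps are arbitrary choices.

    # Each round: x_i <- x_i + eps * sum over neighbors j of (x_j - x_i).
    def consensus_step(values, neighbors, eps=0.1):
        """One synchronous round; neighbors[i] lists the nodes adjacent to i."""
        return [x + eps * sum(values[j] - x for j in neighbors[i])
                for i, x in enumerate(values)]

    # Example: a 4-node ring; all values converge to the average, 2.5.
    neighbors = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
    values = [1.0, 2.0, 3.0, 4.0]
    for _ in range(100):
        values = consensus_step(values, neighbors)
    print(values)  # all entries are close to 2.5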

This talk centers on the analysis of distributed averaging algorithms for
consensus and vehicle formation control in networks with dynamic
characteristics such as stochastic packet loss, node failures, network
partitions, and additive disturbances. I will present an overview of
distributed averaging algorithms and some recent results on the stability and
robustness of these algorithms in dynamic networks. We analyze the
relationship between algorithm performance and the size, dimension, and
dynamic characteristics of the network, and we show that network dimension
imposes fundamental limitations on algorithm performance in large networks.

Bio: Dr. Stacy Patterson received her B.A. in Mathematics and B.S. in
Computer Science from Rutgers University in 1998 and her M.S. and Ph.D. in
Computer Science from the University of California, Santa Barbara in 2003
and 2009 respectively. From July 2009 to August 2011, she was a postdoctoral
scholar at the Center for Control, Dynamical Systems, and Computation at the
University of California, Santa Barbara. She is currently a postdoctoral
fellow in the Department of Electrical Engineering at the Technion, working
with Professor Idit Keidar. Her research interests include distributed
systems, vehicle networks, and cloud computing.

11/1/2012, CS: Taub 337

Persistent OSPF attacks

Gabi Nakibly

National EW Research and Simulation Center

Open Shortest Path First (OSPF) is the most popular interior
gateway routing protocol on the Internet. Most known OSPF
attacks that have been published in the past are based on
falsifying the link state advertisement (LSA) of an
attacker-controlled router. These attacks can only falsify a
small portion of the routing domain’s topology, hence their
effect is usually limited. More powerful attacks are the
ones that affect LSAs of other routers not controlled by the
attacker. However, these attacks usually trigger the OSPF
“fight-back” mechanism by the victim router which advertises
a correcting LSA, making the attacks’ effect
non-persistent. In this work we present new OSPF attacks
that exploit design vulnerabilities in the protocol
specification. These new attacks can affect the LSAs of
routers not controlled by the attacker while evading
“fight-back”. As a result, an attacker can persistently
falsify large portions of the routing domain’s topology
viewed by other routers thereby giving the attacker control
over their routing tables. We discuss a number of mitigation
strategies and propose an update to the OSPF specification
that defeats these attacks and improves OSPF security. The
talk is based on results presented at Black Hat ’11 and NDSS
’12. This is a joint work with Alex Kirshon, Dima Gonikman
and Dan Boneh.

Bio: Gabi Nakibly is a professional fellow in the
National EW Research and Simulation Center at Rafael —
Advanced Defense Systems. He also serves as an adjunct
researcher and lecturer in the Computer Science department
at the Technion. Gabi received his Ph.D. in Computer Science
in 2008 from the Technion, and he is a recipient of the
Katzir Fellowship. His main research interests include
network security and traffic engineering.

4/1/2012, CS: Taub 9

Graph matching and clustering on the GPU

Rob Bisseling

Utrecht University

Graph matching is the problem of matching nodes of a graph
in pairs such that the largest number of pairs is created or
the largest sum of edge weights is obtained. Greedy graph
matching provides us with a fast way to coarsen a given
graph during graph partitioning. Direct algorithms on the
CPU which perform such greedy matchings are simple and fast,
but offer few handholds for parallelisation. To remedy this,
we introduce a fine-grained shared-memory parallel algorithm
for greedy matching, together with an implementation on the
GPU, which is faster than the CPU algorithms and produces
matchings of similar or better quality.
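
For readers unfamiliar with greedy matching, a minimal sequential Python sketch of the weighted variant that the talk's GPU algorithm parallelizes; this is only the baseline idea, not the fine-grained parallel algorithm itself.

    # Greedy weighted matching: repeatedly take the heaviest remaining edge
    # whose two endpoints are both still unmatched.
    def greedy_matching(edges):
        """edges: list of (weight, u, v) tuples; returns the matched pairs."""
        matched, matching = set(), []
        for w, u, v in sorted(edges, reverse=True):
            if u not in matched and v not in matched:
                matched.update((u, v))
                matching.append((u, v))
        return matching

    print(greedy_matching([(3, 'a', 'b'), (2, 'b', 'c'), (1, 'c', 'd')]))
    # [('a', 'b'), ('c', 'd')]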

Another graph problem with much practical interest is that
of graph clustering, where highly connected clusters with
many internal edges and few external edges are
determined. Agglomerative clustering is an effective greedy
way to quickly generate graph clusterings of high
quality. We introduce a fine-grained shared memory algorithm
for agglomerative clustering on both the CPU and GPU. This
heuristic algorithm is able to generate clusterings in very
little time: an optimal clustering is obtained from a street
network graph with 14 million vertices and 17 million edges
in six seconds on the GPU. The clustering work was done as
part of the 10th DIMACS challenge on graph partitioning and
clustering.

Joint work with Bas Fagginger Auer.

Bio: Rob Bisseling is a professor of scientific
computing at the Mathematics Institute of Utrecht University,
the Netherlands. He is the author of the book “Parallel
Scientific Computation: A Structured Approach using BSP and
MPI” (Oxford University Press, March 2004). He is co-author of
the BSPlib communications library and the Mondriaan sparse
matrix partitioning package. His research interests are graph
algorithms, sparse matrix computations, parallel computing,
polymer simulations.


http://www.staff.science.uu.nl/~bisse101/


3/1/2012 12:45, CS: Taub 337

Load Balancing for SIP Server Clusters

Erich Nahum

IBM T.J. Watson Research Center

The Session Initiation Protocol (SIP) is widely used for
controlling communication sessions such as voice and video
calls over IP, video conferencing, streaming multimedia
distribution, instant messaging, presence information, file
transfer and online games. This work introduces several
novel load balancing algorithms for distributing SIP
requests to a cluster of SIP servers. Our load balancer
improves both throughput and response time versus a single
node, while exposing a single interface to external clients.
We present the design, implementation and evaluation of our
system using a cluster of Intel x86 machines running Linux.
We compare our algorithms with several well-known approaches
and present scalability results for up to 10 nodes. Our best
algorithm, Transaction Least-Work-Left (TLWL), achieves its
performance by integrating several features: knowledge of
the SIP protocol; dynamic estimates of back-end server load;
distinguishing transactions from calls; recognizing
variability in call length; and exploiting differences in
processing costs for different SIP transactions. By
combining these features, our algorithm provides
finer-grained load balancing than standard approaches,
resulting in throughput improvements of up to 24 percent and
response time improvements of up to two orders of magnitude.
We present a detailed analysis of occupancy to show how our
algorithms significantly reduce response time.
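
A minimal Python sketch of the least-work-left idea (the transaction weights below are hypothetical placeholders rather than the measured costs from the paper, and a real SIP balancer must also pin all transactions of a call to the same server, which is omitted here):

    # Each SIP transaction type gets a relative cost; a new transaction is
    # dispatched to the server with the least outstanding weighted work.
    COST = {'INVITE': 1.75, 'BYE': 1.0}   # hypothetical relative costs

    class LeastWorkLeftBalancer:
        def __init__(self, servers):
            self.work = {s: 0.0 for s in servers}

        def dispatch(self, transaction_type):
            server = min(self.work, key=self.work.get)
            self.work[server] += COST[transaction_type]
            return server

        def complete(self, server, transaction_type):
            self.work[server] -= COST[transaction_type]

    lb = LeastWorkLeftBalancer(['s1', 's2', 's3'])
    print(lb.dispatch('INVITE'))  # 's1'
    print(lb.dispatch('BYE'))     # 's2' (s1 already has 1.75 units pending)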

Reference: This work appeared in IEEE INFOCOM in 2009
and is accepted to appear in IEEE/ACM Transactions on
Networking. This research is joint with Hongbo Jiang of
Huazhong University of Science and Technology, and Arun
Iyengar, Wolfgang Segmuller, Asser Tantawi, and Charles
P. Wright of the IBM T.J. Watson Research Center.

Bio: Erich Nahum is a research staff member at the
IBM T.J. Watson Research Center. He received his Ph.D. in
Computer Science from the University of Massachusetts,
Amherst in 1996, advised by Jim Kurose and Don Towsley. His
research has included work on SIP, HTTP, Instant Messaging,
TCP/IP, workload characterization, workload generation,
overload control, dynamic content, clusters,
multiprocessors, operating systems, and security. He is
interested in all aspects of performance in experimental
networked systems. More detail can be found at
http://researcher.ibm.com/researcher/view.php?person=us-nahum

28/12/2011, EE: Meyer 861

Distributed computing: bridging the gap between theory and practice.

Yuval Emek

ETH

The theory of distributed computing, which lies at the heart of understanding
the power and limitations of distributed systems, underwent tremendous progress
over the last few decades.
Despite this progress, there seems to be a widening gap between the traditional
models on top of which the theory of distributed computing is built and the
real-world problems we wish to investigate through these models.
In this talk we will discuss the different aspects of this widening gap and
present some of the efforts made in an attempt to adjust the field of theoretical
distributed computing to the rapidly changing needs of its practical applications.
In particular, we will focus on recent advances in the study of wireless networks
and on a new model for decentralized networks of “unorthodox” devices.
The talk will be self-contained.

Bio:
Yuval Emek graduated summa cum laude from the Technion – Israel Institute of
Technology with a bachelor degree in computer science and completed his master
studies and Ph.D. in computer science at the Weizmann Institute.
Following that, he spent one year as a post-doc at Tel Aviv University and
another year at Microsoft.
He currently holds a post-doc position at ETH Zurich, Switzerland.
In his scientific work, Yuval studies various aspects of complex decentralized
systems with an emphasis on theoretical distributed computing.

21/12/2011 12:00, EE: Meyer 861

MARS: Adaptive Remote Execution Scheduler for Multithreaded
Mobile Devices

Tomer London

Stanford University, Dept. of Electrical Engineering

Mobile devices face a growing demand to support
computationally intensive applications like 3D graphics and
computer vision. However, these devices are inherently
limited by processor power density and device battery
life. Dynamic remote execution addresses this problem, by
enabling mobile devices to opportunistically offload
computations to a remote server. We envision remote
execution as a new type of cloud-based heterogeneous
computing resource, or a “Cloud-on-Chip”, which would be
managed as a system resource as if it were a local CPU, with
a highly variable wireless interconnect. To realize this
vision, we introduce MARS, the first adaptive, online and
lightweight RPC-based remote execution scheduler supporting
multi-threaded and multi-core systems. MARS uses a novel
efficient offloading decision algorithm that takes into
account the inherent trade-offs between communication and
computation delays and power consumption. Due to its
lightweight design, MARS runs on the device itself,
instantly adapts its decisions to changing wireless
resources, and supports any number of threads and cores. We
evaluated MARS using a trace-based simulator driven by real
world measurements on augmented reality, face recognition
and video game applications. MARS achieves an average
speedup of 57% and energy savings 33% higher than those of the
best static client-server partitions.
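
To make the communication/computation trade-off concrete, here is a back-of-envelope offloading test in the spirit of the abstract; the cost model and every number below are illustrative assumptions, not MARS's actual algorithm.

    # Offload when the remote option wins on a weighted mix of latency and
    # device energy; alpha = 1.0 optimizes delay only, 0.0 energy only.
    def should_offload(cycles, data_bytes, local_ips, remote_ips,
                       bandwidth_bps, p_compute_w, p_radio_w, alpha=0.5):
        t_xfer = 8 * data_bytes / bandwidth_bps
        t_local = cycles / local_ips
        t_remote = cycles / remote_ips + t_xfer
        e_local = p_compute_w * t_local          # device computes locally
        e_remote = p_radio_w * t_xfer            # device only transmits
        local_cost = alpha * t_local + (1 - alpha) * e_local
        remote_cost = alpha * t_remote + (1 - alpha) * e_remote
        return remote_cost < local_cost

    # A compute-heavy task with little data strongly favors offloading:
    print(should_offload(2e9, 1e5, 1e9, 1e10, 1e7, 2.0, 1.0))  # True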

Bio: Tomer London is a Stanford Graduate Fellow in the Electrical
Engineering PhD program at Stanford University, working on
mobile computing with Professors Christos Kozyrakis and Sachin
Katti. Previously, he was the co-founder and CEO of
Vizmo Mobile Inc., an enterprise software startup. Tomer
also served as a researcher at Intel Microarchitecture
Technology Labs, and as a product lead in Bump
Technologies. Tomer received his BSc from the Technion
Electrical Engineering department in ’10 with Summa Cum Laude
honors and is the recipient of the Robert Bosch Stanford
Graduate Fellowship, the Sohnis Promising Scientist Award, and
the Freescale Award ’10.

21/12/2011 11:30, EE: Meyer 861

Flashback: A New Control Channel for Wireless Networks

Asaf Cidon

Stanford University, Dept. of Electrical Engineering

Unlike wired network protocols, Wi-Fi does not separate the
data channel from the control channel, since only a single
sender and receiver can communicate at a given time
slot. Flashback is a system that allows multiple
transmitters to send ‘flashes’ of high power OFDM
sub-carriers without affecting the normal data transmissions
on the Wi-Fi channel. By taking advantage of SNR margins of
Wi-Fi channel codes, the flashes do not impose any overhead
on the regular data transmission. Using this simple PHY
mechanism, Flashback provides higher networking layers with
a multi-user control channel, which can be used to
significantly improve existing Wi-Fi protocols. During the
talk we will share the results from our implementation of
Flashback, and discuss their implications on the higher
layers of the Wi-Fi stack.

Bio: Asaf Cidon is an Electrical Engineering PhD and MBA student
at Stanford University, where he conducts research on
data-center and mobile systems under Professors Mendel
Rosenblum and Sachin Katti. He worked at Google Israel, where he
helped develop Google Related, and previously served as a Product
Manager for two start-ups. Prior to his studies, Asaf served as a
team leader in an elite unit in the Israeli Intelligence
Forces. He received his BSc in Computer and Software
Engineering Cum Laude from the Technion, and is a recipient
of the Stanford Graduate Fellowship and the Sohnis Promising
Scientist Award.

14/12/2011, EE: Meyer 1061

Multi-party Computation Forever, for Cloud Computing and Beyond

Shlomi Dolev

Math and CS, Ben-Gurion University

Three works will be described. In the first, we present
reactive secret sharing, which changes the secret according
to an unbounded sequence of common inputs, where no
communication among the (dynamic set of) participants is
allowed; we present a fully secure solution for simple
functions and a somewhat less secure solution for arbitrary
functions. The second work concerns dynamic on-going multiparty
computation, in which we consider a dynamic group of
participants who should learn neither the sequence of inputs
of the others nor the function performed on those inputs. In
the last work we consider the case of infinite execution with
no communication among the participants, where we prove that
any automaton can be executed in a fully secure fashion; the
construction is based on the Krohn-Rhodes decomposition
technique.

Joint works with Limor Lahiani, Moti Yung, Juan Garay, Niv
Gilboa and Vladimir Kolesnikov
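
As background for readers (this is the textbook additive scheme over a prime field, not the reactive or dynamic constructions of the talk), a minimal Python sketch of secret sharing, the primitive underlying all three works:

    import random

    P = 2**61 - 1  # a Mersenne prime, used as the working field

    def share(secret, n):
        """Split secret into n shares that sum to it modulo P."""
        shares = [random.randrange(P) for _ in range(n - 1)]
        shares.append((secret - sum(shares)) % P)
        return shares

    def reconstruct(shares):
        return sum(shares) % P

    s = share(42, 5)
    print(reconstruct(s))      # 42
    print(reconstruct(s[:4]))  # any proper subset yields a useless value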

Bio: Shlomi Dolev received his B.Sc. in Engineering
and B.A. in Computer Science in 1984 and 1985, and his
M.Sc. and D.Sc. in Computer Science in 1990 and 1992, from
the Technion - Israel Institute of Technology. From 1992 to
1995 he was a postdoc of Jennifer Welch at Texas A&M
University. In 1995 he joined the Department of Mathematics and
Computer Science at Ben-Gurion University, where he is now a
full professor and the dean of natural sciences.

He was a visiting researcher/professor at MIT, DIMACS, and
LRI for several periods during summers. Shlomi is the
author of the book “Self-Stabilization”, published by the MIT
Press. He has published two hundred journal and conference
articles, as well as patents. Shlomi has served on the
program committees of more than 60 conferences, including the
ACM Symposium on Principles of Distributed Computing and
the International Symposium on DIStributed Computing. He is
an associate editor of the IEEE Transactions on Computers and
the AIAA Journal of Aerospace Computing, Information and
Communication, and a guest editor of the Distributed
Computing Journal and the Theoretical Computer Science
Journal. His research grants include IBM faculty awards,
Intel academic grants, and grants from Verisign, ISF, the US
Air Force, the EU, and the NSF.

Shlomi is the founding chair of the computer science
department at Ben-Gurion University, where he now holds the
Rita Altura trust chair in computer science. His current
research interests include distributed computing,
distributed systems, security and cryptography and
communication networks; in particular the self-stabilization
property of such systems. Recently, he has become involved in
optical computing research.

7/12/2011, CS: Taub 337

Crafting Fast Wait-Free Algorithms

Alex Kogan

Technion, CS

Lock-freedom is a progress guarantee that ensures overall
program progress. Wait-freedom is a stronger progress
guarantee that ensures the progress of each thread in the
program. The latter property is valuable for systems that
need to be responsive, such as real-time systems, operating
systems, etc. While many practical lock-free algorithms are
known in the literature, constructing wait-free algorithms
is considered a complicated task that results in inefficient
implementations. In this talk, we present first construction
of a wait-free queue and a methodology called
fast-path-slow-path, which allows the wait-free algorithms
(and our queue in particular) to practically match the
performance of the lock-free ones. Subsequent work has used
this methodology to create other efficient wait-free
algorithms.
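
The control skeleton of the fast-path-slow-path methodology can be sketched as follows; this Python sketch is illustrative only (the CAS cell is emulated with a lock, and the simplified helping protocol below is not a correct concurrent queue), but it shows the structure: a bounded number of lock-free attempts, then an announced operation that other threads help complete.

    import threading

    class CASCell:
        """Python stand-in for an atomic compare-and-swap word."""
        def __init__(self, value):
            self._value, self._lock = value, threading.Lock()
        def load(self):
            return self._value
        def cas(self, expected, new):
            with self._lock:
                if self._value == expected:
                    self._value = new
                    return True
                return False

    MAX_FAST_TRIES = 3

    def increment(counter, announce, my_id):
        # Fast path: a bounded number of ordinary lock-free CAS attempts.
        for _ in range(MAX_FAST_TRIES):
            old = counter.load()
            if counter.cas(old, old + 1):
                return
        # Slow path: announce the operation so every thread that passes by
        # helps finish it, which bounds the steps until completion.
        announce[my_id] = 'pending'
        help_all(counter, announce)

    def help_all(counter, announce):
        for tid, state in list(announce.items()):
            if state == 'pending':
                old = counter.load()
                while not counter.cas(old, old + 1):
                    old = counter.load()
                announce[tid] = 'done'

    counter, announce = CASCell(0), {}
    increment(counter, announce, my_id=0)
    print(counter.load())  # 1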

Bio: Alex Kogan is a Ph.D. student in the Department
of Computer Science at the Technion. He holds M.Sc. (2008)
and B.A. (Summa Cum Laude, 2002) degrees from the same
department. His research interests lie at the intersection
of distributed and mobile computing, and in parallel
computing. Specifically, he is interested in constructing
efficient and reliable distributed algorithms for mobile
networks, utilizing emergent technologies for wireless
communication and the availability of cloud
services. Recently, he became interested in exploring and
boosting the performance of modern multi-core systems by
devising efficient concurrent algorithms.

30/11/2011, EE: Meyer 861

Replicate and Bundle (RnB) – A Relief for Certain Data Center Bottlenecks

Shachar Raindel

Technion, EE

In this talk, we present the Replicate and Bundle (RnB) scheme for
relieving back-end processor and network bottlenecks in read-mostly
key-value storage systems wherein each user request spawns a large
number of back-end small-item requests. This is common in Web 2.0 and
online social network systems. Adding processors is of little help
because this increases the number of back-end requests per user
request, thereby also increasing the overall processor and network
load. Instead, RnB adds memory and replicates the stored data, thereby
permitting each user request to be serviced through fewer back-end
requests, reducing the total amount of work required from the
network-processing and storage components of the system. The scheme is
very easy to deploy, completely distributed, and while oblivious to
the workload, beneficially exploits its spatial locality.
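
A small Monte Carlo sketch of the effect (the placement and greedy bundling policy below are invented for illustration, not the paper's scheme): storing each item on r servers lets a multi-get reuse already-contacted servers and thus contact far fewer back ends per request.

    import random

    def servers_contacted(items, placement):
        """Greedy bundling: reuse an already-contacted replica when possible."""
        contacted = set()
        for item in items:
            replicas = placement[item]
            hit = contacted.intersection(replicas)
            contacted.add(next(iter(hit)) if hit else random.choice(replicas))
        return len(contacted)

    random.seed(0)
    n_servers, n_items, r = 100, 10000, 3
    placement = {i: random.sample(range(n_servers), r) for i in range(n_items)}
    request = random.sample(range(n_items), 50)
    print(servers_contacted(request, placement))  # noticeably below 50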

We have studied RnB through simulation, and augmented this with a
micro-benchmark in order to estimate the expected actual system
performance.

Our results show that by using RnB, two major scaling bottlenecks are
considerably relieved: 1) the TCP InCast issue and 2) the memcached
multi-get hole. When utilizing the requests’ locality, we achieve a
considerable improvement in these bottlenecks with relatively little
additional memory. In fact, RnB can use existing replicas (included
for fault tolerance), in which case the extra cost is minimal if any.
Finally, we note that this is yet another example of judicious
exploitation of redundancy for performance enhancement.

Bio:
Shachar Raindel holds a BSc in Computer Engineering from the Technion.
He served in the IDF in a technical R&D position. He is now completing
his MSc thesis at the Technion’s Electrical Engineering department
under the guidance of Prof. Yitzhak Birk.

23/11/2011, CS: Taub 337

Self-stabilizing Autonomic Recoverers

Dr. Olga Brukman

Technion, CS

This talk introduces theoretical foundations for system architectures and
algorithms for creating truly robust autonomic systems — systems that are able
to recover automatically from unexpected failures. We consider various settings of
system transparency. We consider black box and transparent box software packages.
The general assumption is that a software package fails when it encounters an unexpected environment state —
a state the package was not programmed to cope with.
Creating a system that anticipates every possible environment state is not feasible due to the
size of the environment. Thus, an autonomic system design should imply that
a system is able to overcome an unexpected environment state either by executing
a recovery action that restores a legal state or by finding a new program that
respects the specifications and achieves the software package goals in the current environment.

In the first part of this talk, we consider software packages to be black boxes.
We propose modeling software package flaws (bugs) by assuming eventual Byzantine
behavior of the package. A general, yet practical, framework and paradigm for the monitoring
and recovery of systems called autonomic recoverer is proposed.
In the second part we consider a software package to be a transparent box and
introduce the recovery oriented programming paradigm.
Programs designed according to the recovery oriented programming
paradigm include important safety and liveness properties and recovery actions as an
integral part of the program. We design a pre-compiler that
produces augmented code for monitoring the properties and executing the recovery actions
upon a property violation. Finally, in the third part, we consider a highly dynamic environment,
which typically implies that there are no realizable specifications for the environment,
i.e., there does not exist a program that respects the specifications for every given environment.
We suggest searching for a program in run time by trying all possible programs on environment
replicas in parallel. We design control search algorithms that exploit various environment properties.
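
A minimal hand-written sketch of the second part's idea (the paradigm itself relies on a pre-compiler to weave this logic in; the decorator below only illustrates monitoring a safety property and firing a recovery action on violation):

    def with_recovery(invariant, recover):
        """Check the property after each call; recover if it is violated."""
        def wrap(fn):
            def guarded(state, *args):
                fn(state, *args)
                if not invariant(state):   # unexpected (illegal) state
                    recover(state)         # recovery action restores legality
            return guarded
        return wrap

    @with_recovery(invariant=lambda s: s['count'] >= 0,
                   recover=lambda s: s.update(count=0))
    def withdraw(state, amount):
        state['count'] -= amount

    state = {'count': 5}
    withdraw(state, 8)
    print(state)  # {'count': 0}: recovered from the illegal negative state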

Bio:
Olga graduated with a PhD in Computer Science under the supervision of Prof.
Shlomi Dolev from Ben-Gurion University, Beer-Sheva, Israel, in 2008. The
main focus of her work was exploring formal definitions of system
architectures for autonomic systems and using techniques from
self-stabilization to create truly self-recovering systems. Since then she
has worked at Microsoft Israel in the PC Health group, creating monitoring and
recovery scenarios for PC Advisor. Next, she spent a year at the Deutsche Telekom
Labs at Ben-Gurion University, working on spam mitigation techniques and
security concerns in IPv6. Most recently, she was with Infinidat.com, a storage
systems startup.

16/11/2011, EE: Meyer 861

Observations on Linux Development

Prof. Dror Feitelson

Computer Science, Hebrew University

Linux is used extensively in systems research as a platform for the
implementation of new ideas, exploiting its open-source nature. But Linux
is also interesting as an object of study about software engineering. In
particular, Linux defies common management theories, as it lacks any
coherent plan or management structure, but still grows at an
ever-increasing rate, while also gaining market share. We will review some
previous studies of Linux development and add new observations regarding
the lifecycle model and code complexity.

The presentation includes results of my students Ayelet Israeli, Ahmad
Jbara, and Adam Matan.

Bio:
Dror Feitelson has been on the faculty of the School of Computer Science
and Engineering at the Hebrew University of Jerusalem since 1995. His major
contributions have been in parallel job scheduling (where he co-founded the
JSSPP series of workshops) and workload modeling (where he maintains the
parallel workloads archive). Recently he has also started to work on
software engineering, and in particular the development of open-source
systems like Linux.

26/10/2011, EE: Meyer 1061

On Distributed Coordination Strategies in Cooperative Wireless Networks

Dr. Lavy Libman

School of Information Technologies, University of Sydney

Cooperative and opportunistic communication techniques in
wireless networks promise significant performance benefits over
traditional methods that do not exploit the broadcast nature of
wireless transmissions. However, such techniques generally require
coordination among the participating nodes, e.g. to discover available
neighbors or negotiate the best course of action after every packet
broadcast. The associated coordination overheads negate much of the
cooperation benefits for applications with strict latency requirements,
or in networks with highly dynamic topologies. This talk will highlight
distributed cooperation strategies, where each node overhearing a packet
decides independently whether to retransmit it, without explicit
coordination with the source, intended receiver, or other
neighbours in the vicinity. We formulate two non-traditional
optimization problems that arise in this context, present their
solution methodology and demonstrate the superior performance attained
by the respective strategies.

Bio:
Lavy Libman is a Visiting Scientist (during the winter semester of 2011-2012)
at the Department of Electrical Engineering, Technion – Israel Institute of
Technology, where he also received his PhD in 2003. Since 2009, he has been a senior
lecturer at the School of Information Technologies, University of Sydney. He
also continues to be associated with the Networks research group in NICTA (the
national ICT research institute in Australia), where he was a researcher
from 2003 to 2009. His research revolves around the design and optimization of
wireless networks, with a particular interest in cooperative and opportunistic
techniques, protocols for resource-limited devices, and game-theoretic
modeling. Dr. Libman is a senior member of the IEEE Communications Society and
is serving as an associate editor of IEEE Transactions on Wireless
Communications. He is a publicity co-chair of IEEE Infocom 2012, served as a
program co-chair of ICCCN 2010 and WiOpt 2010, and continues to be involved as
a program committee member of several international conferences, including IEEE
Infocom, IEEE LCN, ACM MSWiM, and WiOpt.

6/7/2011, CS: Taub 3

Highly Efficient Synchronization Techniques

Prof. Panagiota Fatourou

Foundation for Research and Technology-Hellas, Institute of Computer Science

We present highly-efficient synchronization techniques and
experimentally show that these techniques outperform most
state of the art lock-based and lock-free synchronization
mechanisms. One of the techniques ensures, in addition,
wait-freedom.

We have used these techniques to implement common concurrent
data structures, like stacks and queues. Our experiments show
that these data structure implementations have much better
performance than state of the art shared stack and queue
implementations which ensure only weaker progress properties
than one of our techniques.

Talk slides.

Bio: Panagiota Fatourou is an Assistant Professor at
the Department of Computer Science, University of Crete and an
affiliated faculty member of the Institute of Computer Science
(ICS) of the Foundation for Research and Technology-Hellas
(FORTH). Prior to joining the University of Crete and FORTH
ICS, she was a full-time faculty member at the Department of
Computer Science of the University of Ioannina. During the academic
years 2000 and 2001, she was a postdoc at the Max-Planck-Institut
fur Informatik, Saarbrucken, Germany, and at the Computer
Science Department of the University of Toronto, Canada. She
received a degree in Computer Science from the University of Crete,
and a PhD degree in Computer Engineering from the University
of Patras. Her research interests focus on the theory of
parallel and distributed computing.

22/6/2011, EE: Meyer 861

Network Science – A Network of Sciences

Prof. Ariel Orda

Technion, Department of Electrical Engineering

Network Science is a newly emerging discipline with applications in a
variety of domains, such as Communication Networks, Power Grid Networks,
Transportation Networks, Social Networks and Biological Networks. Focusing on
communication networks, we shall discuss what network science should be and what
it should consist of. The talk will also feature some historical anecdotes, tracing
back to ancient times. Further details would be a spoiler.

Bio:
Ariel Orda is a professor at the Department of Electrical
Engineering in the Technion. His research interests include network
routing, survivability, QoS provisioning, wireless networks, the
application of game theory to computer networking and network pricing.
For more information, see
http://webee.technion.ac.il/Sites/People/ArielOrda/.

15/6/2011, EE: Meyer 861

Estimating Sizes of Social Networks via Biased Sampling

Dr. Liran Katzir

Yahoo! Research Haifa

Online social networks have become very popular in recent years and their numbers of users are already measured in hundreds of millions. For various commercial and sociological purposes, an independent estimate of their sizes is important. In this work, algorithms for estimating the number of users in such networks are considered. The proposed schemes are also applicable for estimating the sizes of networks’ sub-populations.

The suggested algorithms interact with the social networks via their public APIs only, and rely on no other external information. Due to obvious traffic and privacy concerns, the number of such interactions is severely limited. We therefore focus on minimizing the number of API interactions needed for producing good size estimates. We adopt the abstraction of social networks as undirected graphs and use random node sampling. By counting the number of collisions or non-unique nodes in the sample, we produce a size estimate. Then, we show analytically that the estimate error vanishes with high probability for a smaller number of samples than those required by prior-art algorithms. Moreover, although our algorithms are provably correct for any graph, they excel when applied to social network-like graphs. The proposed algorithms were evaluated on synthetic as well as real social networks such as Facebook, IMDB, and DBLP. Our experiments corroborated the theoretical results, and demonstrated the effectiveness of the algorithms.
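
For intuition, the uniform-sampling special case of the collision idea fits in a few lines: with k uniform samples from a population of size n, the expected number of pairwise collisions is k(k-1)/(2n), which inverts to the estimator below. (The talk's algorithms go further and correct for the biased sampling that public APIs actually provide.)

    import random
    from collections import Counter

    def estimate_size(samples):
        """Birthday-paradox estimate: n_hat = k(k-1) / (2 * collisions)."""
        k = len(samples)
        collisions = sum(c * (c - 1) // 2 for c in Counter(samples).values())
        return k * (k - 1) / (2 * collisions) if collisions else float('inf')

    random.seed(1)
    n = 1_000_000
    samples = [random.randrange(n) for _ in range(5000)]
    print(round(estimate_size(samples)))  # on the order of 1,000,000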

Bio:
Liran is a Research Engineer in Yahoo! Israel Labs. He received his B.A., M.A., and Ph.D. degrees in computer science from the Technion–Israel Institute of Technology, Haifa, Israel, in 2001, 2005, and 2008 respectively. His Ph.D. thesis deals with scheduling algorithms in advanced wireless networks. His current main research interests are algorithmic aspects of Web technologies. Liran is the co-author of several refereed journal and conference papers.

25/5/2011, CS: Taub 3

Orleans: A Programming Model for the Cloud

Dr. Gabriel Kliot

eXtreme Computing Group, Microsoft Research

What if you could build the next Facebook or Twitter with just a few hundred
lines of code, and scale it to hundreds of millions of users across thousands
of servers, right out of the box? Orleans is a cloud programming model and
runtime from Microsoft Research, which can make this the new “norm”.

Orleans is a software framework for building ‘client + cloud’ applications.
It encourages the use of simple concurrency patterns that are easy to
understand and implement correctly, with an actor-like programming model for
distributed applications. It uses declarative specification of persistence,
replication, and consistency properties, as well as lightweight transactions
to support the development of reliable and scalable ‘client + cloud’
applications.

Bio: Gabriel Kliot is a Research Software Development Engineer in the eXtreme
Computing Group, Microsoft Research. He’s been working on Orleans since its
early days. Gabriel also leads project Horton: a distributed database for
managing and querying large distributed graphs. Gabriel obtained his PhD in
Computer Science from the Technion in 2009, where his thesis focused on
Probabilistic Middleware Services for Wireless Mobile Ad-Hoc Networks, as
well as on various other aspects of Distributed Systems. In his spare time
Gabriel is a competitive long-distance runner.

18/5/2011, EE: Meyer 861

How Secure are Secure Internet Routing Protocols?

Prof. Sharon Goldberg

Computer Science Department at Boston University

A decade of research has been devoted to addressing vulnerabilities in the
global Internet routing system. The result is a plethora of security
proposals, each providing a different set of security guarantees. To
inform decisions about which proposal should be deployed in the
Internet, we present the first side-by-side quantitative comparison of
the major security variants. We evaluate security variants on the
basis of their ability to prevent one of the most fundamental forms of
attack, where an attacker manipulates routing messages in order to
attract traffic to a node it controls (so that it can tamper, drop, or
eavesdrop on traffic). We combine a graph-algorithmic analysis with
simulations on real network data to show that prior analysis has
underestimated the severity of attacks, even when the strongest known
secure routing protocol is fully deployed in the network. We find that
simple access control mechanisms can be as effective as strong
cryptographic approaches, and it is really the combination of these
two competing approaches that leads to a significant improvement in
security. Time permitting, we will also discuss some of the economic
and engineering issues that must be addressed before any of these
proposals can realistically be deployed in the Internet.

Based on joint work with Michael Schapira, Pete Hummon, Jennifer
Rexford and Phillipa Gill.

Bio:
Sharon Goldberg is an assistant professor of computer science at
Boston University. She received her PhD from Princeton in 2009, and
her BASc from the University of Toronto in 2003. She has worked as a
network engineer at Bell Canada and Hydro One Networks, and as a
researcher at Microsoft, Cisco, and IBM. Her research focuses on
finding practical solutions to problems in network security, by
leveraging formal techniques from cryptography and game theory.

11/5/2011, CS: Taub 3

A Shape Analysis for Optimizing Parallel Graph Programs

Dr. Roman Manevich

University of Texas at Austin

In the first part of the talk I will give a high-level
introduction to Galois, a framework for designing and
implementing parallel graph algorithms and related concepts.
I will explain what ordered/unordered graph algorithms are and
how optimistic parallelism is achieved using transactional
boosting.

In the second part of the talk I will describe a new shape
analysis, which is used for analyzing graph algorithms written
with Galois. The shape analysis infers properties that can be
used to optimize the graph algorithm by reducing two of the
main overheads of the system — synchronization overheads and
rollback logging overheads. We implemented the analysis on top
of TVLA and applied it to several graph algorithms. The
optimized (parallel) graph algorithms run 2x to 12x faster
than the unoptimized algorithms.

This talk is based on joint work with Dimitrios Prountzos,
Keshav Pingali, and Kathryn McKinley, presented at POPL 2011.

Bio: Roman Manevich is a postdoctoral researcher at the
University of Texas at Austin, working with Prof. Keshav
Pingali on programming-language aspects of parallelizing
graph algorithms. Previously he spent a year as a post-doc at
the University of California, Los Angeles, working with
Prof. Rupak Majumdar on verifying correctness of transactional
memories. His research interests are primarily in the area
of formal verification and synthesis of concurrent programs.
He obtained his PhD in Computer Science from Tel-Aviv
University in 2009. His thesis focused on verification of
sequential and concurrent programs with dynamic memory
allocation.

5/5/2011 14:30, CS: Taub 337

How to win Friends and Influence People, Truthfully

Yaron Singer

Computer Science Division, University of California at Berkeley

Throughout the past decade there has been extensive research
on algorithmic and data mining techniques for solving the
problem of influence maximization in social networks: if one
can convince a subset of individuals to influence their
friends to adopt a new product or technology, which subset
should be selected so that the spread of influence in the
social network is maximized?
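
For background, a Python sketch of the standard greedy approach to this optimization problem under the independent cascade model (the baseline the talk builds on; the talk's actual contribution, the truthful payment mechanism, is not shown here):

    import random

    def simulate_spread(graph, seeds, p=0.1, trials=200):
        """Average number of activated nodes under independent cascade."""
        total = 0
        for _ in range(trials):
            active, frontier = set(seeds), list(seeds)
            while frontier:
                u = frontier.pop()
                for v in graph.get(u, []):
                    if v not in active and random.random() < p:
                        active.add(v)
                        frontier.append(v)
            total += len(active)
        return total / trials

    def greedy_seeds(graph, k):
        """Add, one at a time, the node with the best marginal spread."""
        seeds = []
        for _ in range(k):
            best = max((n for n in graph if n not in seeds),
                       key=lambda n: simulate_spread(graph, seeds + [n]))
            seeds.append(best)
        return seeds

    graph = {0: [1, 2], 1: [3], 2: [3], 3: [4], 4: []}
    print(greedy_seeds(graph, k=1))  # most likely [0]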

Despite the progress in modeling and techniques, the
incomplete information aspect of the problem has been largely
overlooked. While the network structure is often available,
the inherent cost individuals have for influencing friends is
difficult to extract.

In this talk we will discuss mechanisms that extract
individuals’ costs in well studied models of social network
influence. We follow the mechanism design framework which
advocates for truthful mechanisms that use allocation and
payment schemes that incentivize individuals to report their
true information. Beyond their provable theoretical
guarantees, the mechanisms work well in practice. To show this
we will use results from experiments performed on the
Mechanical Turk platform and social network data that provide
experimental evidence of the mechanisms’ effectiveness.

4/5/2011 12:30, EE: Meyer 1003

Dr. Philip M. Merlin Memorial Lecture and Prize Award
Search Flavors – Trends and Opportunities

Prof. Yossi Matias

Senior Director and Head of Israel R&D Center, Google

This talk will discuss some recent developments in search, emerging in various shapes and forms.
We will highlight some challenges, and point to some search trends that play an increasing role in multiple domains.
We will also discuss the power of data and the significant role of cloud technologies in facilitation of new opportunities.
Some of the core technologies and global innovations developed in Google’s R&D center in Israel will be highlighted.

Bio:
Yossi Matias is the director of the Google Tel Aviv R&D Center. He joined
Google in August 2006 to build and lead the center, and is responsible for the
strategy and operation of the center.

Prof. Matias is a faculty member of the School of Computer Science at Tel Aviv
University (on leave), having joined the faculty in 1997 as a recipient of the
prestigious Alon Fellowship. He was recently a visitor at Stanford University
(2004-2006). Prior to his position at Tel Aviv University, he was a Research
Scientist at Bell Laboratories (1992-1997). He received his Ph.D. with distinction
from Tel Aviv University (1992), M.Sc. from the Weizmann Institute (1987) (thesis
with excellence), and B.Sc. (cum laude) from Tel Aviv University (1985). Previously
he served in the IDF as an Israeli Air Force pilot.

Matias authored over 100 scientific and technological research papers and is the
inventor of over 20 patents. His research and expertise span many diverse areas,
and range from highly theoretical works to works involving applied technology and
extensive system implementation. His research and technology developments and
innovations include: data analysis, algorithms for massive data sets, data streams,
data synopses, parallel computation, data compression, data and information management
systems, data security and privacy, video scrambling, and Internet technologies. He is
a frequent speaker and was on the program committees of numerous international
scientific and technology conferences. He is on the founding steering committee of
the Israeli Academic Grid and was a member of a national committee evaluating
strategies for Grid technologies.

In addition to his academic and scientific work, for over a decade Matias has been
heavily involved in the high tech industry and in technology and product development.
At Bell Labs, he founded the Lucent Personalized Web Assistant project (1996), developing
one of the early Internet privacy and anti-spam technologies (sold by Lucent to a CMGI
company); he co-founded and led Zapper Technologies (1999), developing advanced contextual
and personalized search technologies; he was the CTO and Chief Scientist of Hyperroll,
leading the technology strategy of its high performance data aggregation software that is
at the core of data warehouses and Business Intelligence applications of multiple Fortune
100 enterprises. Matias has served as a consultant and adviser to many technology
companies, including public and startup companies, and to venture capital firms.

Yossi Matias is a recipient of the 2005 ACM-EATCS Gödel Prize in Theoretical Computer
Science “for the profound impact on the theory and practice of the analysis of data
streams”. His work was instrumental in setting up a scientific field of algorithmics for
massive data sets, and technologies now extensively used in industries involving massive
data sets.

27/4/2011, CS: Taub 3

Side Channels in Cloud Services: The Case of Deduplication in Cloud Storage

Prof. Benny Pinkas

Bar-Ilan University

The talk will discuss deduplication, a form of compression in
which duplicate copies of files are replaced by links to a
single copy. Deduplication is known to reduce the space and
bandwidth requirements of Cloud storage services by more than
90%, and is most effective when applied across multiple
users.

We study the privacy implications of cross-user
deduplication. We demonstrate how deduplication can be used as
a side channel which reveals information about the contents of
files of other users, as a covert channel by which malicious
software can communicate with its control center, or as a
method to retrieve files about which you have only partial
information.
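
A toy model of the observable (illustrative only): a store that deduplicates across users skips the transfer when the content already exists, so an attacker who can see whether bytes leave the device learns whether some other user already holds that exact file.

    import hashlib

    class DedupStore:
        def __init__(self):
            self.blobs = {}                      # content hash -> data

        def upload(self, data):
            h = hashlib.sha256(data).hexdigest()
            transferred = h not in self.blobs    # dedup hit: no transfer
            self.blobs.setdefault(h, data)
            return transferred

    store = DedupStore()
    store.upload(b"salary report: 100K")          # the victim's file
    print(store.upload(b"salary report: 100K"))   # False: the guess exists
    print(store.upload(b"salary report: 200K"))   # True: this guess is new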

Due to the high savings offered by cross-user deduplication,
cloud storage providers are unlikely to stop using this
technology. We therefore propose mechanisms that enable
cross-user deduplication while ensuring meaningful privacy
guarantees.

Bio: Benny Pinkas is an associate professor in the
department of Computer Science at Bar Ilan University,
Israel. He received
his PhD from the Weizmann Institute of Science in
2000. Following that, he worked in the research labs
of Intertrust Technologies and Hewlett-Packard, and was a
member of the faculty in the department of Computer Science of
the University of Haifa. His research interests include
cryptography, privacy, and computer and communications
security.

13/4/2011, CS: Taub 3

Abstraction-Guided Synthesis of Synchronization

Dr. Eran Yahav

Faculty Member, Computer Science Dept., Technion

In this talk I will present a framework for synthesizing
efficient synchronization in concurrent programs, a task known
to be difficult and error-prone when done manually. The
framework is based on abstract interpretation and can infer
synchronization for infinite state programs. Given a program,
a specification, and an abstraction, we infer synchronization
that avoids all (abstract) interleavings that may violate the
specification, but permits as many valid interleavings as
possible. I will show application of this general idea for
automatic inference of atomic sections and memory fences in
programs running over relaxed memory models.

Joint work with Michael Kuperstein, Martin Vechev, and Greta
Yorsh.

Bio: Eran Yahav is a faculty member of the Computer
Science department at the Technion, Israel. Prior to his
position at Technion, he was a Research Staff Member at the
IBM T.J. Watson Research Center in Hawthorne, New York
(2004-2010). He received his Ph.D. from Tel Aviv University
(2004) and his B.Sc. (cum laude) from the Technion-Israel
Institute of Technology (1996). His research interests
include static and dynamic program analysis, program
synthesis, and program verification.

Eran is a recipient of the prestigious Alon Fellowship for
Outstanding Young Researchers, and the Andre Deloro Career
Advancement Chair in Engineering.

6/4/2011, EE: Meyer 861

Fat-Trees Routing and Node Ordering Providing Contention Free Traffic for MPI Global Collectives

Eitan Zahavi

Senior Director of Engineering, Mellanox Technologies LTD, Israel.

As the size of High Performance Computing clusters grows, the increasing probability of
interconnect hot spots degrades the latency and effective bandwidth the network
provides. This talk presents a solution to this scalability problem for
real-life constant bisectional-bandwidth fat-tree topologies. It is shown that maximal
bandwidth and cut-through latency can be achieved for MPI global collective
traffic. To form such congestion-free configuration, MPI programs should
utilize collective communication, MPI-node-order should be topology aware, and the
packets routing should match the MPI communication patterns. First, we show that MPI
collectives can be classified into unidirectional and bidirectional shifts. Using this
property, we propose a scheme for congestion-free routing of the global collectives in fully
and partially populated fat trees running a single job. Simulation results of the
proposed routing, MPI-node-order and communication patterns show a 40% throughput
improvement over previously published results for all-to-all collectives.
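
The shift classification can be made concrete in a few lines (a sketch for intuition, not the paper's routing algorithm): an all-to-all collective decomposes into N-1 steps in which rank i communicates with rank (i+s) mod N, so each step is a permutation that topology-aware rank ordering and matching routes can keep contention-free.

    def all_to_all_shifts(n_ranks):
        """Yield, for each step s, the (source, destination) pairs."""
        for s in range(1, n_ranks):
            yield [(i, (i + s) % n_ranks) for i in range(n_ranks)]

    for step, pairs in enumerate(all_to_all_shifts(4), start=1):
        print(step, pairs)
    # 1 [(0, 1), (1, 2), (2, 3), (3, 0)]  ...each step is a permutation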

Bio:
A cofounder of Mellanox, Eitan has served as Mellanox’s Software Architect since October 2005,
and as Director of Design Technologies since May 1999. Today Eitan leads the Design Automation
group and researches network architectures. As part of his network architecture role he is a
co-chair of the IBTA Management working group, and focuses on cluster topologies, congestion
control, deterministic routing and adaptive routing technologies.

Eitan is also an M.Sc. student of Avinoam Kolodny and Israel Cidon at the Technion EE
department, studying lossless networks in and out of the chip.

Prior to joining Mellanox, from March 1992 to April 1999, Mr. Zahavi served in various Chip
Design, Design Automation and EDA positions at Intel, in both the Micro Processors group and
Intel’s Strategic CAD Lab.

Mr. Zahavi received a Bachelor of Science in Electrical Engineering from the Technion
(Israel Institute of Technology) in 1987.

30/3/2011, CS: Taub 3

IBM Watson and the Jeopardy! Challenge

Dr. David Carmel and Dr. Dafna Sheinwald

IBM Research — Haifa

Over the last few days, millions of viewers witnessed computing
history being made as IBM’s Watson question answering system
defeated Jeopardy! quiz show champions Brad Rutter and Ken
Jennings. Watson is an application of advanced natural
language processing, information retrieval, knowledge
representation and reasoning, and machine learning
technologies to the field of open domain question answering.
Watson runs on a cluster of 90 IBM Power 750 servers in 10
racks with a total of 2880 POWER7 processor cores and 16
Terabytes of RAM, enabling it to respond within less than 3
seconds. The deep analytic capabilities and systems
technologies that enabled Watson to win at Jeopardy! now can
be customized for specific industry lexicons and applications
to help humans make smarter, faster and more effective
decisions.

In this talk we will tell more about the challenge, Watson’s
architecture and technologies, our contributions to the
system, and some insights from the man vs. machine match.

Bios:

David Carmel is a Research Staff Member at the Information
Retrieval group at IBM Haifa Research Lab. David’s research is
focused on search in the enterprise, query performance
prediction, social search, and text mining. David has
published more than 80 papers in Information retrieval and Web
journals and conferences, and serves in the Program Committee
of many conferences and workshops.

https://researcher.ibm.com/researcher/view.php?person=il-CARMEL

Dafna Sheinwald is a researcher at the Information Retrieval
group of IBM Haifa Research Lab, focusing mainly on different
aspects of search in the enterprise: various sources of data,
similar data repeating in different sites, combination of
internal and external data, and system configurations
addressing load problems. Dafna has expertise also in Data
Compression and in algorithms and data structures. She often
serves in the committee of the Data Compression Conference.

16/3/2011, EE: Meyer 861

Causality, Knowledge and Coordination in Distributed Systems

Ido Ben-Zvi

Technion, Electrical Engineering Department

Coordinating the proper ordering of events across remote sites
is a central task of distributed applications. In asynchronous
systems, such coordination depends in an essential way upon
message chains, as captured by Lamport’s happened-before
relation. The relation provides a useful approximation of
causality, in the sense that in asynchronous systems two events
can only be causally related if they are Lamport related.

The talk will consider coordination and causality in
synchronous systems, where upper bounds are available on
message transmission times over each channel, and processes
share a global clock. Here, active coordination is required
whenever a spontaneous external input is meant to trigger an
ordered sequence of responses across various sites. We capture
the essence of such a coordination task in a proposed class of
coordination problems called Ordered Response. Within this
framework we embark on a search for a similar notion of
causality. We will not touch upon knowledge in any great
depth, but consider that in a synchronous setting both message
chains and the passage of time can be used to spread
information across the system, and hence to enable
coordination.

Indeed, it turns out that the synchronous analog for Lamport’s
causality is a structure that carefully combines message
chains and the existing upper bounds on transmission
times. This causal structure, called the centipede, is shown
to be necessary in every solution of the Ordered Response
problem.

9/3/2011, EE: Meyer 1061

Portability and Performance of Applications in the Manycore Era

Dr. Erven Rohou

INRIA Rennes, ALF Research Group

For many years, the frequency of microprocessors has grown at an
exponential pace (from 20 MHz in 1990 to 2 GHz in 2000). Since 2002, however, despite
considerable effort, the frequency has been plateauing. Advances in
technology and micro-architecture now translate into more parallelism. The
consequences on the software industry are dramatic: most existing software has been
designed with a sequential model in mind, and even parallel applications
contain residual sequential sections. The performance of these applications
will ultimately be driven by the performance of the sequential part, as
stated by Amdahl’s law.

For this reason we envision that future architectures will consist of many
simple cores, but also a few very complex cores. The performance of
applications is unlikely to automatically increase with new generations of
future processors, as it used to happen in the past. Even plain portability
is at risk for applications developed with a specific model of parallelism in
mind. This talk proposes directions to address the problem of portability
and performance of applications on future manycore processors, taking into
account both parallel and residual sequential code sections.

Talk slides.

2/3/2011, CS: Taub 201

IBM’s PowerEN Developer Cloud: Fertile Ground for Academic Research

Dr. Amit Golander

IBM Haifa Research Lab

IBM’s newest technology, the Power Edge of Network (PowerEN)
processor, merges network and server attributes
to create a new class of wire-speed processor. PowerEN is a
hybrid computer that employs: massive multithreading capabilities,
integrated I/O and unique special-purpose accelerators
for compression, cryptography, pattern matching, XML and
Network processing.

As a novel architecture, the PowerEN processor offers fertile
ground for research. It can facilitate the development of
faster applications in many fields of computer science such as
Networking, Cryptography, Virtualization, Bioinformatics, SOA,
and Systems. IBM encourages academic research and has set
up the “PowerEN Developer Cloud” infrastructure to allow it.
Israeli universities are among the first to perform research on this
global infrastructure.

26/1/2011, EE: Meyer 861

Task Superscalar Multiprocessors

Dr. Yoav Etsion

Barcelona Supercomputing Center

Parallel programming is notoriously difficult and is still considered
an artisan’s job. Recently, the shift towards on-chip parallelism has
brought this issue to the front stage. Commonly referred to as the
Programmability Wall, this problem has already motivated the
development of simplified parallel programming models, and most notably
task-based models.

In this talk, I will present Task Superscalar Multiprocessors,
a conceptual multiprocessor organization that operates by dynamically
uncovering task-level parallelism in a sequential stream of
tasks. Task superscalar multiprocessors target an emerging class of
task-based dataflow programming models, and thus enable programmers to
exploit manycore systems effectively, while simultaneously simplifying
their programming model.

The key component in the design is the Task Superscalar Pipeline,
an abstraction of instruction-level out-of-order pipelines that operates
at the task-level and can be embedded into any manycore fabric to
manage cores as functional units. Like out-of-order pipelines that
dynamically uncover parallelism in a sequential instruction stream and
drive multiple functional units, the task superscalar pipeline
uncovers task-level parallelism in a stream of tasks generated by a
sequential thread. Utilizing intuitive programmer annotations of task
inputs and outputs, the task superscalar pipeline dynamically detects
inter-task data dependencies, identifies task-level parallelism, and
executes tasks out-of-order. I will describe the design of the task
superscalar pipeline, and discuss how it tackles the scalability
limitations of instruction-level out-of-order pipelines.
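As a toy illustration of this idea (an invented API, not the actual pipeline design), the following Python sketch shows how input/output annotations alone suffice to recover dataflow order from a sequential stream of task submissions:

from collections import defaultdict

# Toy sketch: recover dataflow order from a sequential stream of task
# submissions, using only the declared inputs and outputs of each task.
class TaskWindow:
    def __init__(self):
        self.last_writer = {}            # datum -> task that produces it
        self.deps = defaultdict(set)     # task -> tasks it must wait for
        self.tasks = {}

    def submit(self, fn, ins=(), outs=()):
        tid = len(self.tasks)
        for d in ins:                    # read-after-write dependency
            if d in self.last_writer:
                self.deps[tid].add(self.last_writer[d])
        for d in outs:
            self.last_writer[d] = tid
        self.tasks[tid] = fn

    def run(self):
        done, pending = set(), set(self.tasks)
        while pending:
            # every task whose producers have finished is ready; execution
            # follows dataflow order, not submission order
            for t in sorted(t for t in pending if self.deps[t] <= done):
                self.tasks[t]()
                done.add(t); pending.discard(t)

w = TaskWindow()
w.submit(lambda: print("produce a"), outs=("a",))
w.submit(lambda: print("consume a"), ins=("a",))
w.submit(lambda: print("independent"))   # runs before "consume a"
w.run()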

Finally, I will present simulation results that demonstrate the design
can decode a new task in under 60 ns and dynamically
uncover data dependencies among as many as ~50,000 in-flight tasks,
using 7MB of on-chip eDRAM storage. This configuration achieves
speedups of 95-255x (average 183x) over sequential execution for nine
scientific benchmarks, running on a simulated multiprocessor with 256
cores.

19/1/2011 Meyer 1061

DNS Poisoning – and Antidotes

Prof. Amir Herzberg

Computer Science Department, Bar-Ilan University

DNS poisoning is a well known and critical threat to Internet
security. We will review the important recent results in this area,
mainly, Kaminsky’s devastating attack, and the widely-adopted
countermeasures such as port randomization and 0x20 encoding.
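As a brief illustration of one of these countermeasures: 0x20 encoding randomizes the case of letters in the query name, and a legitimate responder echoes the name verbatim, so a blind spoofer must guess these extra bits as well. A minimal Python sketch:

import random

# Sketch of 0x20 encoding: randomize the case of letters in the query
# name; a legitimate response echoes the exact casing, so a blind spoofer
# must also guess these extra bits (on top of the TXID and source port).
def encode_0x20(name):
    return "".join(c.upper() if c.isalpha() and random.random() < 0.5
                   else c.lower() for c in name)

query_name = encode_0x20("www.example.com")     # e.g. "wWw.exAmPLe.cOm"

def accept(response_name):
    return response_name == query_name          # casing must match exactly

print(query_name, accept(query_name), accept("www.example.com"))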

We then present trap-then-poison attacks, which circumvent
port-randomization, when a local resolver is located behind a NAT
device that assigns random ports to each outbound packet, and
predict-then-poison attacks, assuming ports are assigned
consecutively-from-random but allowing even weaker attackers. We
tested our attacks against the popular Linux Netfilter NAT. If time
permits, we will also discuss new defenses we recently developed.

Joint work with Haya Shulman.

12/1/2011 EE: Meyer 861

Distributed peta-scale data transfer

Jesse Alpert

Google

Many internal Google systems need to copy huge volumes of data between
Google’s data centers. Solving this problem requires managing very
high traffic volumes, dealing effectively with packet loss, scheduling
and prioritizing the use of constrained resources and avoiding a
single point of failure.

Effingo, Google’s internal large scale copy service, is designed to
scale and sustain very high traffic throughput.
Effingo features the following characteristics:
o A custom network protocol over UDP.
o Distributed scheduling decisions when managing the use of
constrained resources.
o Multi-point topological efficiency.
o High availability even when clusters are down, or slow.
o Scalability to many files, many users and to very high throughput globally.

In this talk I’ll describe the challenges of peta-scale data transfer,
and how they are addressed using Effingo. This is a joint work with
many other Google engineers and researchers.

5/1/2011 EE: Meyer 861

Efficient Cloud Operations via Job-Migration and Pricing

Dr. Ishai Menache

Microsoft Research New England

Cloud computing technology offers on-demand network access to shared
pools of configurable computing resources, such as virtual servers,
applications and software services. This paradigm delivers the economy
of scale to users via large datacenters, fast and flexible provisioning
of resources, and the freedom from long-term investments in equipment.
Despite the potential advantages, a new set of challenges must be
overcome before cloud providers can see a return on their investments;
namely, dealing with the uncertainty involved in operating cloud systems,
both in terms of supply (time-varying energy prices and availability) and
demand for computing resources.

One way to reduce operation costs is by exploiting the spatiotemporal
variation of electricity prices. I start by showing how techniques from
online algorithms can be employed to develop algorithms that move
computation to datacenters in which energy is available at a cheaper
price. The algorithm we design performs remarkably well on empirical
electricity-price and job-demand data. In the second part of the talk,
I develop a general queuing-based model for the study of cloud computing
economics, incorporating stochastic demands and diversity in the
application types and user preferences. Despite the high level of
heterogeneity, I show that a simple per-unit price (currently employed by
most cloud platforms) suffices to induce social efficiency. I will
conclude with a discussion of ongoing work on combining energy-aware
scheduling and pricing for optimizing profits.
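As a rough illustration of the flavor of such online decisions (a generic ski-rental-style rule, not the algorithm from the talk), one can delay migration until the accumulated price penalty of staying exceeds the migration cost:

# Generic sketch, not the talk's algorithm: stay at the current datacenter
# until its accumulated price penalty exceeds the migration cost, then
# move to the currently cheapest site (a ski-rental-style rule).
def schedule(prices, migration_cost, start=0):
    site, penalty, plan = start, 0.0, []
    for hour in prices:                       # hour: energy price at each site
        best = min(range(len(hour)), key=lambda s: hour[s])
        penalty += hour[site] - hour[best]    # regret vs. cheapest site
        if penalty > migration_cost:
            site, penalty = best, 0.0
        plan.append(site)
    return plan

prices = [[10, 3], [10, 3], [2, 9], [2, 9]]   # two sites, four hours
print(schedule(prices, migration_cost=5))     # prints [1, 1, 0, 0]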

29/12/2010 EE: Meyer 861

Large scale iterative computing using GraphLab

Dr. Danny Bickson

Machine Learning Department, Carnegie Mellon

As the available computing power around us keeps growing
(clouds, clusters, GPUs etc.) and the amount of collected data is
increasing, new challenges arise when designing algorithms for
handling the data. In this talk, I will cover my work concerning both
the theoretical and the practical aspects of parallel and distributed
large scale computing on massive datasets. I will discuss the design of
efficient iterative algorithms to be used in large scale
settings, as well as actual algorithm implementation on different
distributed platforms such as multicore, computer clusters and GPUs.

As a case study, I will address the problem of modeling network
flows, which tend to exhibit heavy tailed behavior. I will describe
the theory developed for computing efficient inference in linear
heavy-tailed probabilistic graphical models: for the first time,
a method for computing exact inference in this model, where
state-of-the-art approaches use only approximations. We further design
efficient iterative algorithms for inference and analyze their properties.

From the practical side, I will present the design and
implementation of the GraphLab framework, a parallel computing
abstraction for iterative sparse graph algorithms. The GraphLab
framework is used to rapidly prototype a distributed implementation
of the developed iterative inference algorithms, hiding
implementation details like caching, load balancing, data partitioning
and scheduling. I will demonstrate the applicability of GraphLab to
the developed inference techniques using real network data obtained
from the PlanetLab network. Experimental results demonstrate high
scalability without compromising the solution accuracy. I will briefly cover
other interesting applications, such as parallel Lasso on GPUs and
distributed matrix factorization.
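As a toy rendering of the abstraction (an invented API, not GraphLab's actual interface): the user supplies an update function over a vertex and its neighborhood, and a runtime loop handles scheduling until values converge, here with an asynchronous PageRank update.

# Toy rendering: the user writes an update function over a vertex and its
# neighborhood; the runtime schedules updates until values stop changing,
# hiding ordering and bookkeeping from the user.
def run(graph, values, update, tol=1e-6):
    active = set(graph)
    while active:
        v = active.pop()
        new = update(v, graph, values)
        if abs(new - values[v]) > tol:   # value changed: reschedule the
            active.update(graph[v])      # vertices that read it
        values[v] = new
    return values

def pagerank_update(v, graph, values, d=0.85):
    incoming = [u for u in graph if v in graph[u]]
    return (1 - d) + d * sum(values[u] / len(graph[u]) for u in incoming)

g = {"a": ["b"], "b": ["c"], "c": ["a", "b"]}
print(run(g, {v: 1.0 for v in g}, pagerank_update))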

15/12/2010
Meyer 1061

Stealing Reality – the New Menace, Already Here

Dr. Yaniv Altshuler

Ben Gurion University, Deutsche Telekom Laboratories

In this talk we will discuss the threat of malware targeted at extracting
information about the relationships in a real-world social network as well
as characteristic information about the individuals in the network, which we
dub “Stealing Reality”. We present Stealing Reality (SR), explain why it
differs from traditional types of network attacks, and discuss why its
impact is significantly more dangerous than that of other attacks. We also
present our initial analysis of this attack, based on real-world mobile
network data, and show how in many scenarios such attacks will be very
difficult (if not impossible) to detect using currently available network
monitoring systems.

1/12/2010 EE: Meyer 861

Caching for Realtime Search

Dr. Eddie Bortnikov

Yahoo! Research

Modern search engines feature real-time indices, which incorporate changes
to content within seconds. As search engines also cache search results for
reducing user latency and back-end load, without careful real-time management
of search results caches, the engine might return stale search results to
users despite the efforts invested in keeping the underlying index up to date.
We propose an architectural component called CIP – the cache invalidation
predictor. CIPs invalidate supposedly stale cache entries upon index
modifications. We propose CIP heuristics, and evaluate them in an authentic
environment (a real evolving corpus and query stream of a large commercial
news search engine). We show that a classical cache replacement policy, LRU,
completely fails to guarantee freshness over time, whereas our CIPs serve 97%
of the queries with fresh results. Our policies incur a negligible impact on
the baseline’s cache hit rate, in contrast with traditional age-based
invalidation, which must severely reduce the cache performance in order to
achieve the same freshness. We demonstrate that the computational overhead of
our algorithms is minor, and that they even allow reducing the cache’s memory
footprint.
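To illustrate the general shape of cache invalidation prediction (a deliberately simple heuristic, not the CIP policies evaluated in the talk): when a document enters the index, invalidate cached results for queries whose terms it contains, since those entries may now be stale.

from collections import defaultdict

# Simple invalidation heuristic (not the talk's CIP policies): when a
# document enters the index, drop cached results for queries sharing a
# term with it, since those results may now be stale.
cache = {}                               # query -> cached results
queries_by_term = defaultdict(set)       # term -> cached queries using it

def cache_results(query, results):
    cache[query] = results
    for term in query.split():
        queries_by_term[term].add(query)

def on_index_update(doc_terms):
    stale = set().union(*(queries_by_term[t] for t in doc_terms))
    for q in stale:
        cache.pop(q, None)               # invalidate predicted-stale entry
    return stale

cache_results("olympics opening", ["old result"])
print(on_index_update({"olympics", "torch"}))   # {'olympics opening'}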

17/11/2010 EE: Meyer 861

Transactional Memory Today

Prof. Maurice Herlihy

Brown University, Computer Science Department

“Transactional Memory” replaces conventional multiprocessor locks, monitors,
and conditions with a transactional model of synchronization. The original
proposal dates back to 1993, but even today, there is a vigorous debate about
what it means and what it should do. This debate sometimes generates more heat
than light: terms are not always well-defined and criteria for making judgments
are not always clear.

This talk will try to impose some order on the conversation. TM itself
can encompass hardware, software, speculative lock elision, and other
mechanisms. The benefits sought encompass simpler implementations of
highly-concurrent data structures, better software engineering for
concurrent platforms and enhanced performance. These goals are
distinct, and sometimes in conflict.

3/11/2010 EE: Meyer 861

The Cost of Privatization

Prof. Hagit Attiya

Technion, Computer Science Department

Software transactional memory (STM) guarantees that a transaction, consisting of
a sequence of operations on the memory, appears to be executed atomically. In
practice, it is important to be able to run transactions together with non-transactional
legacy code accessing the same memory locations, by supporting privatization. This
should be provided without sacrificing the parallelism offered by today’s multicore
systems and multiprocessors.

This paper proves an inherent cost for supporting privatization, which is linear in
the number of privatized items. Specifically, we show that a transaction privatizing
k items must have a data set of size at least k, in an STM with invisible reads, which
is oblivious to different non-conflicting executions and guarantees progress in such
executions. When reads are visible, it is shown that Omega(k) memory locations must be
accessed by a privatizing transaction, where k is the minimum between the number of
privatized items and the number of concurrent transactions guaranteed to make progress,
thus capturing the trade-off between the cost of privatization and the parallelism
offered by the STM.

This is joint work with Eshcar Hillel (CS, Technion).

27/10/2010 Meyer 1061

Sparse Reliable Graph Backbones

Dr. Yuval Emek

Microsoft Israel R&D and the School of Electrical Engineering, Tel Aviv University

In this talk we discuss the notion of network reliability:
Given a connected graph G and a failure probability p(e) for each edge
e in G, the reliability of G is the probability that G remains connected
when each edge e is removed independently with probability p(e).

I will show that every n-vertex graph G contains a spanning subgraph with
O(n log n) edges whose reliability is very close to that of G. Our proof
is based on a polynomial-time randomized algorithm for constructing this sparse
subgraph.
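Reliability as defined above is straightforward to estimate by Monte Carlo sampling, which also makes the statement easy to check on small examples; a minimal sketch:

import random

# Estimate reliability by sampling: remove each edge independently with
# its failure probability and check whether the graph stays connected.
def is_connected(n, edges):
    adj = {v: [] for v in range(n)}
    for u, v in edges:
        adj[u].append(v); adj[v].append(u)
    seen, stack = {0}, [0]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w); stack.append(w)
    return len(seen) == n

def reliability(n, edges, p_fail, trials=10000):
    ok = sum(is_connected(n, [e for e in edges
                              if random.random() >= p_fail[e]])
             for _ in range(trials))
    return ok / trials

cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
# a 4-cycle survives iff at most one edge fails: 0.9^4 + 4*0.1*0.9^3 = 0.9477
print(reliability(4, cycle, {e: 0.1 for e in cycle}))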

The talk will be self-contained.

Based on joint work with Shiri Chechik, Boaz Patt-Shamir, and David Peleg.

20/10/2010 EE: Meyer 861

The Turtles Project: Design and Implementation of Nested Virtualization

Muli Ben-Yehuda

IBM Haifa Research Lab

In classical machine virtualization, a hypervisor runs multiple
operating systems simultaneously, each on its own virtual
machine. In nested virtualization, a hypervisor can run multiple
other hypervisors with their associated virtual machines. As
operating systems gain hypervisor functionality—Microsoft Windows
7 already runs Windows XP in a virtual machine—nested
virtualization will become necessary in hypervisors that wish to
host them.

We present the design, implementation, analysis, and evaluation of
high-performance nested virtualization on Intel x86-based
systems. The Turtles project, which is part of the Linux/KVM
hypervisor, runs multiple unmodified hypervisors (e.g., KVM and
VMware) and operating systems (e.g., Linux and Windows). Despite the
lack of architectural support for nested virtualization in the x86
architecture, it can achieve performance that is within 6-8% of
single-level (non-nested) virtualization for common workloads,
through multi-dimensional paging for MMU virtualization and
multi-level device assignment for I/O virtualization.

Joint work with M. D. Day, Z. Dubitzky, M. Factor, N. Har’El,
A. Gordon, A. Liguori, O. Wasserman, and B.-A. Yassour.

23/6/2010 EE: Meyer 861

Utility Dependence in Correct and Fair Rational Secret Sharing

Prof. Yehuda Lindell

Department of Computer Science, Bar Ilan University

The problem of carrying out cryptographic computations when the participating parties are
rational in a game-theoretic sense has recently gained much attention. One problem that has
been studied considerably is that of rational secret sharing. In this setting, the aim is to
construct a mechanism (protocol) so that parties behaving rationally have incentive to
cooperate and provide their shares in the reconstruction phase, even if each party prefers
to be the only one to learn the secret.

Although this question was only recently asked by Halpern and Teague (STOC 2004), a number of
works with beautiful ideas have been presented to solve this problem. However, they all have
the property that the protocols constructed need to know the actual utility values of the
parties (or at least a bound on them). This assumption is very problematic because the
utilities of parties are not public knowledge. We ask whether this dependence on the actual
utility values is really necessary and prove that in the basic setting, rational secret
sharing cannot be achieved without it. On the positive side, we show that by somewhat relaxing
the standard assumptions on the utility functions, it is possible to achieve utility
independence. In addition to the above, we observe that the known protocols for rational secret
sharing that do not assume simultaneous channels all suffer from the problem that one of the
parties can cause the others to output an incorrect value. (This problem arises when a party
gains higher utility by having another output an incorrect value than by learning the secret
itself; we argue that such a scenario is not at all unlikely.) We show that this problem is
inherent in the non-simultaneous channels model, unless the actual values of the parties’
utilities for this attack is known, in which case it is possible to prevent this from happening.

Joint work with Gilad Asharov (presented at CRYPTO 2009).

2/6/2010 EE: Meyer 861

Introduction to NoSQL and Cassandra

Ran Tavory

Outbrain

NoSQL (Not Only SQL) is a movement promoting a loosely defined class of non-relational
data stores that break with a long history of relational databases. These data stores
may not require fixed table schemas, usually avoid join operations and typically scale
horizontally. Academics and papers typically refer to these databases as structured
storage.

Notable production implementations include Google’s BigTable and Amazon’s Dynamo.
However, there are also many publicly available open source variants including HBase
and Cassandra.

In the talk I’ll be giving an introduction to NoSQL (Not Only SQL) and Cassandra, an
open-source implementation of a column-oriented data store originally developed at
Facebook, which relies upon Google’s BigTable model as well as Amazon’s Dynamo system.
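For readers unfamiliar with the model, a toy rendering (illustration only, not Cassandra's API) of the column-oriented layout: rows are sparse maps of columns, each column carries a timestamp, and writes follow last-write-wins, with no fixed schema and no joins.

import time

# Toy model: a column family maps row keys to sparse column maps; each
# column carries a timestamp and writes follow last-write-wins.
class ColumnFamily:
    def __init__(self):
        self.rows = {}                   # row_key -> {column: (value, ts)}

    def insert(self, row_key, column, value, ts=None):
        ts = time.time() if ts is None else ts
        row = self.rows.setdefault(row_key, {})
        if column not in row or row[column][1] <= ts:
            row[column] = (value, ts)    # last write wins

    def get(self, row_key, column):
        return self.rows.get(row_key, {}).get(column, (None, None))[0]

users = ColumnFamily()
users.insert("alice", "email", "alice@example.com")
users.insert("alice", "city", "Haifa")   # rows need not share columns
print(users.get("alice", "email"))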

Talk slides.

12/5/2010 EE: Meyer 861

Reliable data sharing using unreliable storage

Alex Shraer

Technion, Electrical Engineering Department

Distributed storage architectures provide a cheap and scalable
alternative to expensive monolithic disk array systems currently used
in enterprise environments. Such distributed architectures make use of
many unreliable servers (or storage devices) and provide reliability
through replication. The large number of fault-prone servers requires
supporting dynamic changes when faulty servers are removed from the
system and new ones are introduced. In order to maintain reliability
when such changes occur and to avoid “split-brain” behavior, it is
essential to ensure proper coordination, e.g., when multiple such
changes occur simultaneously. Existing solutions are either
centralized, or use strong synchronization algorithms (such as
consensus) among the servers to agree on every change in the system.
In fact, it was widely believed that reconfiguration requires
consensus and, just like consensus, cannot be achieved in asynchronous
systems. In this presentation I will refute this belief and present
DynaStore, an asynchronous and completely decentralized
reconfiguration algorithm for read/write storage.

Cloud storage is another setting where reliability is a challenge.
While storing data online becomes increasingly popular, repeated
incidents show that clouds fail in a variety of ways. Yet, clients
must currently trust cloud providers to handle their information
correctly, and do not have tools to verify this. I will present Venus,
a system that guarantees data consistency and integrity to clients
that collaborate using commodity cloud storage, and alerts clients
when the storage is faulty or malicious (e.g., as a result of a
software bug, misconfiguration, or a hacker attack). Venus does not
require trusted components or changes to the storage provider.

This work is a collaboration with researchers from IBM Research –
Zurich and Microsoft Research Silicon Valley.

3/05/2010 14:30 Meyer 1003

Dr. Philip M. Merlin Memorial Lecture and Prize Award
Entrepreneurship, innovation, achievements and preserving them

Dr. Orna Berry

Venture Partner, Gemini Israel Funds

Israel's GDP doubled during the 90s, and its ICT (Information and Communications
Technologies) based industries grew sixfold over the same period. Now that the world has
adopted innovation and knowledge as means of growth and equality, the state of Israel
needs to sharpen its policies in order to retain competitiveness and growth, as well as
to broaden the economic benefit of its strengths.

28/4/2010 EE: Meyer 861

Fast CDN (Content Delivery Network)

Eyal Zohar

Technion, Electrical Engineering Department

In this talk we will go through several facts and challenges related to
high-speed content delivery in the client-server model.

The presentation is based on experience gained during the
development and implementation of a cache system for large video files
over P2P file sharing and HTTP (like YouTube).

Agenda: which protocols are used in the Internet today, disk reading speed,
networking with multi-core CPU, scalable CDN, inline processing of heavy
traffic, and more.

14/4/2010 EE: Meyer 861

Continuous Consensus

Prof. Yoram Moses

Technion, Electrical Engineering Department

This talk presents the continuous consensus primitive in synchronous
distributed systems. Continuous consensus is a general tool for
simultaneous coordination. Continuous consensus is described and
its applications are illustrated. We then survey the problem of
implementing continuous consensus in a number of different settings. We show
cases in which the problem can efficiently be solved, ones in which it
becomes intractable, and discuss self-stabilizing solutions to
continuous consensus. Continuous consensus is closely related to the
state of common knowledge, and this connection is effectively used in
its analysis. The talk will be self-contained and will not assume
familiarity with knowledge theory or related terms.

This talk describes joint work with Mark Tuttle, Tal Mizrahi, Ezra Hoch
and Danny Dolev.

24/3/2010 EE: Meyer 861

Building Robust Nomadic Wireless Mesh Networks Using Directional Antennas

Dr. Yigal Bejerano

Bell Laboratories, Alcatel-Lucent

Nomadic wireless mesh networks (NWMNs) are currently used for
military applications and fast recovery networks. In such systems,
wireless routers, termed nodes, are mounted on top of vessels that
may change their locations and the nodes are required to establish
a broadband and reliable wireless mesh network. For improving network
performance, some vendors use directional antennas and the mesh
topology comprises point-to-point (P2P) connections between
adjacent nodes. Consequently, the number of P2P connections of a node
is upper-bounded by the number of directional radios (and antennas)
that it has, which is typically a small constant. This raises the
need to build robust (i.e., two-node/edge-connected) mesh networks
with a bounded node degree, regardless of the nodes’ locations. In
the talk, I will describe simple and efficient schemes for
constructing robust wireless mesh networks with small node degree.

(Joint work with Qunfeng Dong)

17/3/2010 EE: Meyer 861

Noninterference in the Presence of Colluding Adversaries

Dr. Kai Engelhardt

School of Computer Science and Engineering, UNSW

Whether adversaries can glean information from a distributed
system in a formal sense hinges on the definition of such a
system and what can be observed by those adversaries. In the
presence of colluding adversaries, the standard definition of
noninterference by Goguen and Meseguer and its many variants
proposed in the literature fail in a very intuitive sense to
capture a simple collusion attack. The crucial difference
between what is modelled in those definitions and what we argue
needs to be modelled is that teams can observe pomsets, as
Plotkin and Pratt stated. In this talk we expose what goes
wrong in the known approaches and why we should care.

8/3/2010 10:30 EE: Meyer 861

Innovative companies and cloud computing

Dr. Aake Edlund

PDC Center for High Performance Computing, KTH

To maintain a competitive advantage through innovation, companies of
today must handle increasingly dynamic environments and increasingly
rapid innovation cycles. Cloud computing is addressing many of these
challenges, especially the possibility of rapid and cost-efficient
prototyping and scaling.
Flexibility is the key feature addressed in this talk, illustrated with
examples from the presenter’s own experience in startups and larger
international enterprises.

3/2/2010 EE: Meyer 861

An online high performance computing service for genetic linkage analysis

Mark Silberstein

Technion, Computer Science Department

In this talk I will describe the algorithms and mechanisms underlying
a distributed system for genetic linkage analysis, called Superlink-online.
It is a production online system which serves hundreds of geneticists worldwide,
allowing for faster analysis of genetic data via automatic parallelization and
execution on thousands of non-dedicated computers.
I will describe the following innovative technologies forming the core of this system.

1. Practical scheduling and execution of embarrassingly parallel Bags of Tasks
in multiple non-dedicated computing environments (SC09).
Our approach allows for virtualization of multiple grids, clouds and volunteer
grids as a single computing platform by building an overlay of execution clients
over the physical resources; another component is a generic mechanism for
dynamic scheduling policies to reduce the turnaround time
in the presence of resource failures and heterogeneity. Our system has executed
hundreds of Bags of Tasks with over 9 million jobs during 3 months alone; these
have been invoked on 25,000 hosts from the local clusters, the Open Science Grid,
EGEE, UW Madison pool and Superlink@Technion community grid.

2. A general technique for designing memory-bound algorithms on GPUs through
software-managed cache (ICS08).
This technique was successfully applied to probabilistic network inference,
yielding an order of magnitude performance improvement versus
the performance without such a cache. Overall we achieved up to
three orders of magnitude speedup when executing our GPU-based
algorithm versus single CPU performance.

3. Coarse- and fine-grained parallel algorithms for the inference in probabilistic
networks on large-scale non-dedicated environments and GPUs.
We devised and implemented an algorithm suitable for loosely coupled environments
with unreliable resources (American Journal of Human Genetics 2006, HPDC06)
and adapted it for the heterogeneous GPU-CPU supercomputer TSUBAME at Tokyo Tech.

20/1/2010
Meyer 1003

Computer Science Meets Game Theory: Implementing Mediators Robustly

Prof. Joe Halpern

Cornell University, Computer Science Department

The question of whether a problem in a multiagent system that can be
solved with a trusted mediator can be solved by just the agents in the
system, without the mediator, has attracted a great deal of attention in
both computer science (particularly in the cryptography community) and
game theory. In cryptography, the focus on the problem has been on
secure multiparty computation, where each agent has some private
information and the agents want to compute some function of this
information without revealing it. This can be done trivially with a
trusted mediator: the agents just send their private information to the
mediator, who computes the function value and sends it to all of them.

Work on multiparty computation characterizes the conditions under which this can be done
without a mediator, under the assumption that at most a certain fraction
of the agents are faulty and do not follow the recommended protocol.

By way of contrast, game theory is interested in implementing mediators
using what is called “cheap talk”, under the assumption that agents are
rational. We are interested in combining both strands: We consider games
that have (k,t)-robust equilibria when played with a mediator, where an
equilibrium is (k,t)-robust if it tolerates deviations by coalitions of
rational players of size up to k and deviations by up to t players who
can be viewed as faulty (although they can equally well be viewed as
rational players with unanticipated utilities). We prove matching upper
and lower bounds on the ability to implement such mediators using cheap
talk (that is, just allowing communication among the players). The bounds
depend on (a) the relationship between k, t and n, the total number
of players in the system; (b) whether players know the exact utilities
of other players; (c) whether there are broadcast channels or just
point-to-point channels; (d) whether cryptography is available; and (e)
whether the game has a (k+t)-punishment strategy; that is, a strategy
that, if used by all but at most k+t players, guarantees that every player
gets a worse outcome than they do with the equilibrium strategy.

This talk covers joint work with Ittai Abraham, Danny Dolev, and Rica Gonen.
No previous background in game theory is assumed.

6/1/2010 EE: Meyer 861

Internet Routing: Foundations, Challenges and Future Directions

Dr. Michael Schapira

Yale University and UC Berkeley

Over the past two decades there has been exponential growth in the scale
and complexity of the Internet. However, the protocols that handle routing
on the Internet (e.g., OSPF, BGP) have not changed significantly in
comparison and, consequently, often do not cope well with modern-day
challenges (bounded computational resources, economically driven
manipulations, security attacks, and more). Understanding, “fixing” and
re-designing Internet routing necessitates taking a principled approach
that goes beyond context-specific solutions, bridges theory and systems
research, and breaks traditional disciplinary barriers. I shall present
conceptual frameworks for addressing today’s challenges and novel
implementable routing schemes that improve on the existing ones. I shall
also discuss the surprising implications of these results for other areas
of research: distributed computing, game theory, mechanism design and
complexity theory. Finally, I shall outline interesting directions for
future work.

Talk slides.

30/12/2009 EE: Meyer 861

Deep packet inspection for next generation network devices

Dr. Anat Bremler-Barr

Interdisciplinary Center (IDC), Herzliya, Computer Science Dept.

Deep packet inspection (DPI) lies at the core of contemporary Network Intrusion
Detection/Prevention Systems and Web Application Firewalls. DPI aims to identify various
malware (including spam and viruses). DPI consists of either exact string or regular
expression pattern matching, which are well-studied problems in Computer Science;
nevertheless, these solutions are not sufficient for current network demands: First,
current solutions do not scale in terms of speed, memory and power requirements. Second,
non-clear-text traffic, such as compressed traffic, is becoming a dominant portion of the
Internet and is clearly harder to inspect.

In this talk we discuss the current challenges of performing fast DPI and give an
overview of recent advances in hardware that help us cope with these demanding
requirements. We present two of our works. In the first work, we propose a novel
technique to compress the Deterministic Finite Automaton (DFA), a commonly used paradigm
for pattern matching. We show that this compression technique enables performing fast
pattern matching with commodity chips that perform fast IP-lookups, thus achieving a rate
of 10 Gbps. In the second work, we present a novel technique that accelerates pattern
matching of compressed HTTP traffic. Surprisingly, we show that in many situations,
pattern matching on compressed data, even with the penalty of decompression, is faster than
pattern matching on clear-text traffic.
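To make the DFA paradigm concrete, here is a miniature version of the standard construction used in DPI engines, an Aho-Corasick automaton that scans the payload once, byte by byte, in time linear in its length (a textbook sketch, unrelated to the compression technique above):

from collections import deque

# Miniature DFA-based scanner (textbook Aho-Corasick): one pass over the
# payload, constant work per byte, regardless of the number of patterns.
def build(patterns):
    goto, fail, out = [{}], [0], [set()]
    for p in patterns:
        s = 0
        for c in p:
            if c not in goto[s]:
                goto.append({}); fail.append(0); out.append(set())
                goto[s][c] = len(goto) - 1
            s = goto[s][c]
        out[s].add(p)
    q = deque(goto[0].values())
    while q:                             # BFS sets the failure links
        s = q.popleft()
        for c, t in goto[s].items():
            f = fail[s]
            while f and c not in goto[f]:
                f = fail[f]
            fail[t] = goto[f][c] if c in goto[f] and goto[f][c] != t else 0
            out[t] |= out[fail[t]]       # inherit matches of the suffix
            q.append(t)
    return goto, fail, out

def scan(text, goto, fail, out):
    s, hits = 0, []
    for i, c in enumerate(text):
        while s and c not in goto[s]:
            s = fail[s]
        s = goto[s].get(c, 0)
        hits += [(i - len(p) + 1, p) for p in out[s]]
    return hits

automaton = build(["virus", "rus", "spam"])
print(scan("antivirus scanner", *automaton))   # 'virus' at 4, 'rus' at 6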

These works appeared in Infocom 2009 and Infocom 2010. Joint work with Yaron Koral (IDC)
and David Hay (Columbia University).

24/12/2009 11:30
Meyer 1061

DejaView – a Personal Virtual Computer Recorder

Oren Laadan

Columbia University, Computer Science Department

Continuing advances in hardware technology have enabled the
proliferation of faster, cheaper, and more capable personal
computers. As users interact with the world and their peers
through their computers, it is becoming important to archive
and later search the information that they have *viewed*.
Exponential improvements in processing and storage are not
making this problem any easier. Existing state-of-the-art
desktop search tools fail to provide a suitable solution.

DejaView is a personal virtual computer recorder that provides
a complete record of a desktop computing experience. DejaView
continuously records a user’s session to provide a complete
WYSIWYS (What You See Is What You’ve Seen) record and enable
a user to playback, browse, search and revive records. DejaView
combines transparent operating system, display and file system
virtualization, with a semantic information recorder and a desktop
search system, to provide this functionality without any changes
to applications, window systems, or operating system kernels.
Experimental results show that DejaView provides continuous
recording without any user-noticeable performance degradation,
and is fast enough for interactive use.

23/12/2009 EE: Meyer 861

Large-Scale Software Dissemination in Stochastic Wireless Networks

Prof. David Starobinski

Boston University

Commercial providers are increasingly permitting third-parties to develop
and implement their own applications on wireless devices, ranging from sensors
to 3G cellular phones. Accordingly, software dissemination is quickly emerging
as a key enabling technology, providing fundamental services such as
over-the-air programming and security patching. The lossy nature of the
wireless medium, combined with the typically high density of devices, hampers
efficient accomplishment of this task.

In this work, we address the scalability of reliable software dissemination in
two ways: (i) we develop and analyze policies harnessing the multi-channel
transceiving capabilities of single radio wireless devices, and (ii) we
rigorously address the problem of quantifying forward error correction (FEC)
redundancy to eliminate control traffic with high probability.

We first analyze the performance of software dissemination in multi-channel,
single radio networks. For arbitrary topology networks, we model this problem
as a stochastic shortest path optimization and provide an approach to compute
the optimal control policy achieving minimum average delay in a finite number
of steps using Dijkstra’s algorithm. Due to the prohibitive computational
complexity of the optimal solution, we devise simple, asymptotically optimal
policies for dense networks. Our analysis and simulations reveal that a single
radio in each receiver suffices to achieve performance gain directly
proportional to the number of channels available.

Next, pairing rateless coding with extreme value theory, we quantify FEC
redundancy by analyzing the cumulative distribution function (CDF) of the
completion time. We precisely characterize this CDF and demonstrate the
existence of a phase transition associated with it. We demonstrate the benefit
of accurately quantifying FEC redundancy through real experiments on sensor motes,
which reveal drastic reduction in control traffic, energy consumption and overall
delay.
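A Monte Carlo illustration of the quantity being analyzed (assuming an idealized rateless code, where a receiver finishes once any k encoded packets arrive): estimate the distribution of the number of broadcast transmissions until every receiver completes, and read the needed redundancy off its upper percentiles.

import random

# Idealized rateless-coding model: a receiver finishes once ANY k encoded
# packets arrive; estimate the CDF of the broadcast completion time and
# read the needed redundancy off its upper percentiles.
def completion_time(k, receivers, loss_p):
    have, sent = [0] * receivers, 0
    while min(have) < k:
        sent += 1                        # one broadcast transmission
        for r in range(receivers):
            if random.random() >= loss_p:
                have[r] += 1
    return sent

samples = sorted(completion_time(k=100, receivers=50, loss_p=0.2)
                 for _ in range(500))
print("median:", samples[len(samples) // 2],
      "99th percentile:", samples[int(0.99 * len(samples))])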

16/12/2009 EE: Meyer 861

Fabric Consolidation – Concepts and Protocols

Dror Goldenberg

Dir. Architecture, Mellanox

Fabric and Server Consolidation is an important trend of today’s growing
data centers. Consolidation brings a huge value, saving on the infrastructure:
equipment cost, power, management, etc. Network speed is an essential factor
for consolidating multiple networks over a single fabric and that is addressed
by InfiniBand and 40/100G Ethernet. There are many other challenges in providing an
efficient I/O consolidation solution. These are addressed by various
standardization organizations, including IEEE, PCISIG, IBTA and T11, as well as
by commercial pre-standard solutions.

In this talk we will present the various concepts that are applied for
efficient consolidation, such as: QoS, partitioning, protocol encapsulation,
RDMA, independent flow control, inter-VM packet switching, and will highlight
the corresponding standards and pre-standards such as ETS, FCoE, FCoIB, PFC,
VEPA+, RDMAoE, etc.

9/12/2009 EE: Meyer 861

Finding what You Needed, didn’t Know Existed and Couldn’t See

Uri Schonfeld

UCLA

Search engines employ huge clusters in an attempt to find the documents they
need to best answer their users’ queries. Unfortunately, crawling and
evaluating the “entire web” is infeasible. Search engines can identify
queries whose results can likely be improved, but do not know how to efficiently and
reliably find the pages that would best improve the results. Web sites, on
the other hand, have cheap access to their content but do not know what the
search engine’s needs are. The solution we present in this paper is a new
two-way communication channel between the search engine and the
web sites. This new channel makes discovering relevant pages more efficient.

2/12/2009 EE: Meyer 861

The WiMAX MAC Layer, markets and applications

Nadav Lavi

Alvarion

WiMAX is a Broadband Wireless Access technology based on the IEEE 802.16 standard
that enables the delivery of high-speed personal broadband services to subscribers
anytime, anywhere. Operational in both licensed and license-exempt frequencies,
WiMAX technology offers flexible and economically viable solutions for different
deployment scenarios in urban, suburban and rural environments.

In this talk we will describe the WiMAX MAC layer and network functionalities, and
the various WiMAX markets and applications.

25/11/2009 EE: Meyer 861

MSC8156 Freescale Multicore DSP

Dr. Freddy Gabbay

Freescale

Freescale has been recognized for its leadership and increasing commitment to
the broadband wireless industry. The company’s MSC8156 multicore DSP has been
named the Best LTE Chip by a premier publication in China that covers the central
government’s latest information technology strategies.

The MSC8156 is the industry’s highest-performance programmable DSP, with its six new
SC3850 StarCore DSP cores delivering an aggregate 6 GHz – a 2X performance leap over the
main competitor’s latest multicore DSP. Another advantage is its innovative
MAPLE-B acceleration platform. This platform boosts the performance of Forward
Error Correction and Fourier Transforms required in different wireless standards
while offloading the DSP cores and eliminating the need for an external FPGA to perform
these tasks.

Freescale is the first to launch a device supporting multiple broadband wireless
standards such as TD-SCDMA, TDD-LTE, FDD-LTE, WiMAX and WCDMA. The device enables
smooth migration among standards and is designed to dramatically advance the
capabilities of wireless broadband base station equipment.

18/11/2009 EE: Meyer 861

The Multicore Streaming Framework

Uzi Shvadron

IBM Research, Haifa

Next generation computing systems are going to be significantly different
from current systems, for two main reasons:
o Power consumption restrictions – higher computing rates require more
parallel processors with scaled power consumption.
o The memory wall problem – shared memory for multiple processors is not a
suitable option, due to the access delays of large NUMA memory systems.
As a result, future computing systems are going to have up to thousands of
very small, specialized, heterogeneous cores with a distributed memory
system that provides a small amount of local memory for each core. This
imposed design creates significant programmability challenges.

The Multicore Streaming Framework (MSF) is an implementation of a
programming model for parallel platforms. The main target of the MSF design
is to ease the task of parallel programming. Programming for a multicore
environment inherently involves data and code transfers between various
processors. In many cases the management of these transfers is left to the
programmer, which makes parallel programming difficult. MSF provides an
abstraction of these transfers that releases programmers from handling
the parallel execution and the data transfers associated with the
execution.

Programming with MSF is based on a concept of data streaming between
tasks. Nevertheless, MSF can be used for non-streaming types of
applications as well. Using the MSF API, the programmer provides a set of
tasks and the data-dependency graph connecting them; MSF then
executes each task as long as data is available for it.
Tasks are independent components of code that may process large amounts of
input data and produce large amounts of output data using the provided API.
A task can run on any of the available processors in the multicore
processor. MSF provides an architecture-independent interface. It can be
implemented on various systems with shared or distributed memories, and can
be scaled to multiprocessor systems. A sketch of this style appears below.
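A hypothetical sketch of this style in Python (names invented, with plain generators standing in for MSF's managed transfers): tasks are declared independently and connected into a graph, and each one runs whenever a chunk of input data becomes available.

# Hypothetical sketch: declare tasks independently, connect them into a
# graph, and let each run as soon as a chunk of input data is available;
# generators stand in for MSF's managed transfers.
def source(chunks):
    for c in chunks:
        yield c                          # emit one data chunk at a time

def scale(stream, factor):
    for c in stream:                     # runs as soon as a chunk arrives
        yield [x * factor for x in c]

def sink(stream):
    for c in stream:
        print("out:", c)

# task graph: source -> scale -> sink
sink(scale(source([[1, 2], [3, 4]]), factor=10))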

The well-structured programming model of MSF allows the use of MSF by
compilers that rely on the provided runtime code to execute programs that
were written as sequential code. Initial results in this area have
been achieved.

Talk slides.

11/11/2009 EE: Meyer 861

Group Scalability in Distributed Systems

Dr. Ymir Vigfusson

IBM Research, Haifa

We address shortcomings of two important group communication
paradigms, IP Multicast and gossip-based message dissemination, both
of which have scalability issues when the number of groups grows.

First, we propose a transparent and backward-compatible layer called Dr.
Multicast to allow data center administrators to enable IPMC for large
numbers of groups without causing scalability issues. Dr. Multicast
optimizes IPMC resources by grouping together similar groups in terms
of membership to minimize redundant transmissions as well as the cost of
filtering unwanted messages.
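An illustrative greedy heuristic in this spirit (not Dr. Multicast's actual optimizer): map logical groups onto a limited budget of physical IPMC addresses by repeatedly merging the two groups with the most similar membership, trading some filtering of unwanted messages for fewer IPMC groups.

from itertools import combinations

# Greedy sketch: merge the two most membership-similar groups until the
# IPMC address budget is met; merged members filter extra traffic in software.
def jaccard(a, b):
    return len(a & b) / len(a | b)

def merge_groups(groups, budget):
    groups = [set(g) for g in groups]
    while len(groups) > budget:
        i, j = max(combinations(range(len(groups)), 2),
                   key=lambda ij: jaccard(groups[ij[0]], groups[ij[1]]))
        groups[i] |= groups.pop(j)       # j > i, so index i stays valid
    return groups

logical = [{"n1", "n2"}, {"n1", "n2", "n3"}, {"n7", "n8"}]
print(merge_groups(logical, budget=2))   # the two similar groups merge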

Second, we argue that when nodes belong to multiple groups, gossip-based
communication loses its appealing property of using a fixed amount
of bandwidth. We propose a platform called GO (Gossip Objects) that
bounds each node’s bandwidth use to a customizable limit, prohibiting
applications from joining groups that would cause the limit to be
exceeded. We make the observation that gossip rumors tend to be small,
and propose a utility-based heuristic to stack rumors into packets to
optimize delivery speed, with rumors sometimes traveling through indirect
paths.

This is joint work with Hussam Abu-Libdeh, Mahesh Balakrishnan, Ken
Birman, Robert Burgess, Gregory Chockler, Qi Huang, Jure Leskovec, Haoyuan
Li, Deepak Nataraj and Yoav Tock.

4/11/2009 EE: Meyer 861

Algebraic Gossip via EXCHANGE: Analytical and Simulation Results

Dr. Chen Avin

Communication Systems Engineering Department, Ben Gurion University of the Negev

In this work we study the stopping times of gossip algorithms for network coding. We analyze algebraic gossip (i.e., random linear coding) and consider three gossip algorithms for information spreading: Pull, Push and Exchange. While traditionally algebraic gossip protocols used Pull or Push, we prove that using the Exchange algorithm can be unboundedly better. The stopping time of algebraic gossip is known to be linear for the complete graph, but the question of determining a tight upper bound for general graphs is still open. We take a major step toward solving this question, proving a tight linear upper bound for bounded-degree graphs and a tight lower bound of n^2 for general graphs.
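A small simulation conveys the setting (a toy over GF(2) on a ring, not the analysis in the talk): each message is a coefficient vector, EXCHANGE sends random linear combinations in both directions, and a node decodes once its received combinations reach full rank.

import random

# Toy algebraic gossip over GF(2) on a ring: coefficient vectors are
# bitmasks, EXCHANGE sends a random combination in both directions, and a
# node decodes once its basis reaches full rank.
def insert(basis, vec):
    for b in basis:                      # reduce against the current basis
        vec = min(vec, vec ^ b)
    if vec:
        basis.append(vec)                # vec was linearly independent

def combine(basis):
    v = 0
    for b in basis:                      # random GF(2) linear combination
        if random.random() < 0.5:
            v ^= b
    return v

def gossip(n):
    nodes = [[1 << i] for i in range(n)] # node i starts with message i
    exchanges = 0
    while any(len(b) < n for b in nodes):
        exchanges += 1
        i = random.randrange(n); j = (i + 1) % n
        vi, vj = combine(nodes[i]), combine(nodes[j])
        insert(nodes[i], vj); insert(nodes[j], vi)   # both directions
    return exchanges

print(gossip(16))                        # exchanges until all 16 decode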

Our proofs use a new method that relies on Jackson’s queueing theorem to analyze the stopping time of network coding, and the technique is likely to prove useful in future research.

In addition, we compare algebraic gossip to the average computation task via Exchange and show that there are graphs for which algebraic gossip can be unboundedly faster.

Joint work with: Michael Borokhovich and Zvi Lotker.

28/10/2009 EE: Meyer 861

Self-Stabilization and Byzantine Fault Tolerance

Ezra Hoch

School of Computer Science and Engineering, The Hebrew University of Jerusalem

In this talk I will discuss a few results that combine self-stabilization and
Byzantine fault tolerance. I will concentrate on the simpler model of synchronous message
passing, and discuss the major tools that are used to achieve the
results.

The three main results I will discuss are:
1. Self-stabilizing Byzantine Digital Clock Synchronization (SSS 06)
2. On Self-stabilizing Synchronous Actions Despite Byzantine Attacks (DISC
07)
3. Fast Self-stabilizing Byzantine Tolerant Digital Clock Synchronization
(PODC 08)

21/10/2009 Meyer 861

No Need to Constrain Many-Core Parallel Programming

Prof. Uzi Vishkin

University of Maryland Institute for Advanced Computer Studies

The transition in mainstream computer science from serial to parallel
programming for many-core on-chip computing offers parallel computing research
the wonderful impact opportunity it had sought all along. However, such
a transition is a potential trauma for programmers who need to change the basic
ways in which they conduct their daily work. The long experience with
multi-chip parallel machines only adds to the apprehension of today’s
programmers. Many people who tried (or deliberated trying) to program these
multi-chip machines consider their programming ‘as intimidating and time
consuming as programming in assembly language’ (–2003 NSF Cyberinfrastructure
Blue Ribbon Committee), and have literally walked away. Programmers simply did
not want to deal with the constraint of programming for locality in order to
extract the performance that these machines promise. Consequently, their use
fell far short of historical expectations. Now, with the emerging many-core
computers, the foremost challenge is ensuring that mainstream computing is not
railroaded into another major disappointment. Limiting many-core parallel
programming to more or less the same programming approaches that dominated
parallel machines could again: (i) repel programmers; (ii) reduce productivity
of those programmers who hold on: getting the performance promise requires high
development-time and leads to more error-prone code; (iii) raise by too much
the minimal professional development stage for introducing programmers to
parallel programming, reducing further the pool of potential programmers; and
overall (iv) fail to meet expectations regarding the use of parallel
computing; only this time for many-cores.

The talk will overview a hardware-based PRAM-On-Chip vision that seeks to
rebuild parallel computing from the ground up. Grounded in the richest and
easiest known theory of parallel algorithms, known as PRAM, where the
programmer only needs to identify at each step operations that can be executed
concurrently, an on-chip architecture that scales to thousands of processors
on chip called XMT (for explicit multi-threading) was introduced. Significant
hardware and software prototyping of XMT will be reported, including a
64-processor FPGA-based machine and two ASIC chips fabricated using 90nm CMOS
technology, as well as strong speedups on applications. By having XMT
programming taught at various levels from rising 6th graders to graduate
students, we developed evidence that the stage at which parallel programming
can be taught is earlier than demonstrated by other approaches. For example,
students in a freshman class were able to program 3 parallel sorting
algorithms. A software release of the XMT environment can be downloaded to any
standard PC platform along with extensive teaching materials, such as
video-recorded lectures of a one-day tutorial to high school students and a
full-semester graduate class, class notes and programming assignments.
Preliminary thoughts on encapsulating XMT into a hardware-enhanced
programmer’s workflow will also be presented, and the prospects for
incorporating it as an add-on into some other many-core designs will be discussed.

14/10/2009 EE: Meyer 861

Lessons from Winning the Grand Prize at the Netflix Competition

Dr. Yehuda Koren

Sr. Research Scientist, Yahoo! Research

The collaborative filtering approach to recommender systems predicts user
preferences for products or services by learning past user-item relationships.
Its significant economic implications have made collaborative filtering techniques play
an important role at well-known e-tailers such as Amazon and Netflix. The field has enjoyed
a surge of interest since October 2006, when the Netflix Prize competition
commenced. Netflix released a dataset containing 100 million anonymous movie
ratings and challenged the research community to develop algorithms that could beat
the accuracy of its recommendation system, Cinematch. In this talk I will survey the
competition together with some of the principles and algorithms that led us
to winning the Grand Prize in the competition.
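One well-known ingredient of such systems is matrix factorization trained by stochastic gradient descent; the sketch below (toy data, minimal form, whereas the prize-winning solution blended many more elaborate models) conveys the basic principle of learning user and item factors from past ratings.

import random

# Minimal matrix factorization by SGD on toy ratings: learn user and item
# factors whose inner product approximates the observed ratings.
def factorize(ratings, n_users, n_items, k=2, lr=0.05, reg=0.02, epochs=500):
    random.seed(0)
    P = [[random.gauss(0, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[random.gauss(0, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(P[u][f] * Q[i][f] for f in range(k))
            for f in range(k):           # gradient step with L2 shrinkage
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

ratings = [(0, 0, 5), (0, 1, 1), (1, 0, 4), (1, 1, 1)]
P, Q = factorize(ratings, n_users=2, n_items=2)
print(sum(P[0][f] * Q[0][f] for f in range(2)))   # close to the rating 5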

30/06/2009 8:30
Meyer 1003

Med-Hoc-Net 2009 Keynote: Vehicular Urban Sensing: Dissemination and Retrieval

Prof. Mario Gerla

UCLA, Computer Science Department

There has been growing interest in vehicle-to-vehicle communications for a broad range
of applications ranging from safe driving to content distribution, advertising, commerce and games. One
emerging application is urban sensing. Vehicles monitor the environment, classify the events, e.g., license
plates, pollution readings, etc. and exchange metadata with neighbors in a peer-to-peer fashion, creating a
distributed index from which mobile users can extract different views. For instance, the Department of
Transportation captures traffic statistics; the Department of Health monitors pollutants; and Law
Enforcement Agents investigate crimes. Mobile, vehicular sensing differs significantly from conventional
wireless sensing. Vehicles have no strict limits on battery life, processing power and storage capabilities.
Moreover they can generate enormous volumes of data, making conventional sensor data collection inadequate.
In this talk we first review popular V2V applications and then introduce MobEyes, a middleware solution that
diffuses data summaries to create a distributed index of the sensed data. We discuss the challenges of
designing and maintaining such a system, from information dissemination to harvesting, routing and privacy.

29/06/2009 9:30
Meyer 1003

Med-Hoc-Net 2009 Opening Talk: Cognitive Networks – Throughput, Delay Bounds and Routing Issues

Prof. Luigi Fratta

Politecnico di Milano, Dipartimento di Elettronica e Informazione

Cognitive Radio Networks (CRNs) are composed of frequency-agile radio devices that allow licensed
(primary) and unlicensed (secondary) users to coexist, where secondary users opportunistically
access channels without interfering with the operation of primary ones. From the perspective of
secondary users, spectrum availability is a time varying network resource over which multi-hop
end-to-end connections must be maintained.

Analytical bounds on the throughput and transmission delay of secondary users, under different assumptions
on secondary and primary user traffic statistics in a single-channel scenario, will be presented.

We will then discuss the problem of routing secondary user flows in a CRN, with the aim of
characterizing optimal sequences of routes over which a secondary flow is maintained. The optimality
is defined according to a novel metric that considers the maintenance cost of a route as channels
and/or links must be switched due to primary user activity. Unlike the traditional routing
problem, the proposed approach considers subsequent path selections. The problem can be formulated as
an integer programming optimization model and shown to be of polynomial time complexity in the case of full
knowledge of primary user activity. The use of heuristic algorithms to solve the minimum maintenance
cost routing problem when information on primary user activity is incomplete will be discussed. Some
numerical results will allow us to assess the optimality gap of a proposed heuristic routing algorithm.

24/06/2009 EE: Meyer 861

External Mining of Search Query logs

Maxim Gurevich

Technion, Electrical Engineering Faculty

Search query logs are valuable data sources that are kept confidential by search engines, in order to protect
their users’ privacy. Some search engines disclose aggregate statistics about queries via services, like Google
Trends and Google Trends for Websites. The information provided by these services, however, is obfuscated,
non-repeatable, and partial. For example, statistics for medium to low volume queries and sites are not readily
available. In this talk I will describe algorithms for “external” mining of search query logs. Our algorithms
can be used to estimate the popularity of queries in the log and the number of impressions web sites receive
from search results. The algorithms use only public search engine services, like the web search service and the
query suggestion service, and thus do not require privileged access to confidential search engine data sources.
In addition, the algorithms use modest resources, and hence can be used by anyone to gather statistics about
any query and/or any web site. Our algorithms rely on tools from information retrieval (keyword extraction),
statistics (importance sampling), and databases (tree volume estimation).
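Of these tools, importance sampling is easy to convey in miniature: to estimate an average under a distribution we cannot sample from directly, reweight samples drawn from a distribution we can sample.

import random

# Generic importance sampling: estimate E_target[f(x)] from samples drawn
# under a proposal distribution, weighting each sample by target/proposal.
def importance_estimate(samples, target_p, proposal_p, f):
    return sum(f(x) * target_p(x) / proposal_p(x) for x in samples) / len(samples)

random.seed(1)
Z = sum(1 / x for x in range(1, 11))     # normalizer of target p(x) ~ 1/x
samples = [random.randint(1, 10) for _ in range(100000)]   # uniform proposal
est = importance_estimate(samples,
                          target_p=lambda x: (1 / x) / Z,
                          proposal_p=lambda x: 0.1,
                          f=lambda x: x)
print(est)                               # roughly 10 / Z, about 3.41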

The talk will be self-contained.

Based on joint work with Ziv Bar-Yossef.

17/06/2009 EE: Meyer 861

ETNA architecture for Inter Domain Transport Network

David Berechya

Nokia Siemens

The Inter Domain Transport Network is an architecture for automatic inter-domain provisioning of transport services. ETNA emphasizes
and analyses the inter-carrier case, i.e., different domains operated by different carriers. This innovative method, developed
by ETNA, enables efficient and low-cost pan-European and even worldwide transport services. Today, provisioning of inter-carrier
transport involves time-consuming manual intervention, a process that lasts days or even weeks. Automatic inter-carrier
provisioning will shorten the provisioning period dramatically and will significantly decrease the OPEX involved. The ETNA
inter-domain transport concept works with any domain technology, such as IEEE-based transport (PBB, PBB-TE), MPLS, MPLS-TP, etc.;
it takes into account the fact that pan-European transport is composed of several transport technologies operated by different carriers.

10/06/2009 EE: Meyer 861

Lower bounds on the Complexity of Set-Agreement in Partially Synchronous Systems

Dr. Corentin Travers

Technion, Computer Science Department

Set agreement is a fundamental problem in distributed computing in which processes collectively
choose a small subset of values from a larger set of proposals. The impossibility of fault-tolerant
set agreement in asynchronous networks is one of the seminal results in distributed computing. In
synchronous networks, too, the complexity of set agreement has been a significant research challenge
that has now been resolved. Real systems, however, are neither purely synchronous nor purely asynchronous.
Rather, they tend to alternate between periods of synchrony and periods of asynchrony.
Nothing specific is known about the complexity of set agreement in such a “partially synchronous”
setting.

In this paper, we address this challenge, presenting the first (asymptotically) tight bound on the
complexity of set agreement in such systems. While doing so, we highlight close connections between
the basic models of distributed computation: synchronous, partially synchronous, and asynchronous.
We introduce a technique for simulating, in a fault-prone asynchronous shared memory, executions
of an asynchronous and failure-prone message-passing system in which some fragments appear synchronous
to some processes. We use this simulation technique to derive a lower bound on the round complexity of
set agreement in a partially synchronous system by a reduction from asynchronous wait-free set agreement.

Specifically, we show that every set agreement protocol requires at least floor(t/k) + 2 synchronous rounds
to decide. We present an (asymptotically) matching algorithm that relies on a distributed asynchrony
detection mechanism to decide as soon as possible during periods of synchrony, that is, in floor(t/k) + 4
synchronous rounds. From these two results, we derive the size of the minimal window of synchrony
needed to solve set agreement. By relating synchronous, asynchronous and partially synchronous
environments, our simulation technique is of independent interest. In particular, it allows us to
re-derive and extend existing lower bounds on the complexity of synchronous early-deciding set agreement.

Joint work with Dan Alistarh, Seth Gilbert and Rachid Guerraoui (EPFL).

27/05/2009 EE: Meyer 861

TCP Starvation in Routers

Alex Shpiner

Technion, Electrical Engineering Faculty

Starvation can have a dramatic impact on application performance. To avoid it, network-based
applications often rely on TCP, a reliable transport-layer protocol designed to adjust
itself to networks with varying characteristics, such as link capacities, propagation times and
buffer sizes. Many researchers have analyzed this protocol and proved its capabilities using
ideal router models in global networks. However, we find that both real routers and datacenter
networks may lead to starvation and unfairness between the TCP flows.

In this talk, we first analyze the interactions of user-based congestion control algorithms and
router-based switch scheduling algorithms. Previous papers neglected or mitigated these
interactions, and typically found that flow rates reach a fair equilibrium. On the contrary, we
show that these interactions can lead to extreme unfairness with temporary flow starvation, as
well as to large rate oscillations. For instance, we prove that this is the case for the MWM
switch scheduling algorithm, even with a single router output and basic TCP flows. In addition,
we fully characterize the resulting network dynamics under such switch scheduling algorithms.

In the second part we talk about using TCP in the network architecture of datacenters.
Datacenters are a challenging environment for TCP flows. They are characterized by very short
round-trip times and a large number of flows. Thus, their short links are not long enough to
contain packets from all the flows. As a result, we show that TCP-based datacenter networks can
suffer from high unfairness and starvation. We present a novel algorithm called HCF (Hashed
Credits Fair), which is simple to implement. We show by simulations that this algorithm
improves fairness and reduces starvation, yet keeps throughput as high as possible.

M.Sc. research under the supervision of Dr. Isaac Keslassy.

20/05/2009 Meyer 1061

Can IP Networks be merged with Optical Networks?

Dr. Ori Gerstel

CISCO

Core IP networks have been using dense wavelength division multiplexing (DWDM) technology in most networks across
the globe. As bandwidth needs continue to increase, the limits of electrical processing in routers are reached,
and the simple usage of optics by routers no longer suffices. As a result, more dynamic optical networking
must be deployed to exploit the vast scale that optics provides. However, such a technology comes with many
challenges that must be first overcome. This talk will cover the current evolution from static point-to-point
DWDM networks to a fully switched network that better serves the IP layer. We will then propose a possible next
phase in this evolution – from fixed usage of optical spectrum to dynamic usage of spectrum.

13/05/2009, EE: Meyer 861

O(log N / log log N) RMRs Randomized Mutual Exclusion

Dr. Danny Hendler

Ben-Gurion University, Department of Computer Science

Mutual exclusion is a fundamental distributed coordination problem. Shared-memory mutual exclusion research focuses on local-spin
algorithms and uses the remote memory references (RMRs) metric. An RMR is a shared memory access that cannot be served from a
process’ local cache.

A recent proof by Attiya, Hendler and Woelfel established an Omega(log N) lower bound on the number of RMRs incurred by processes
as they enter and exit the critical section, matching an upper bound by Yang and Anderson. Both bounds apply to N-process
algorithms that use only read and write operations.

The lower bound of Attiya et al. only holds for deterministic algorithms, however; the question of whether randomized mutual
exclusion algorithms, using reads and writes only, can achieve sub-logarithmic expected RMR complexity remained open. In this talk,
I will present recent work that answers this question in the affirmative.

We present two strong-adversary randomized local-spin mutual exclusion algorithms. In both algorithms, processes incur
O(log N / log log N) expected RMRs per critical section passage. Our first algorithm has sub-optimal worst-case RMR complexity of
O( (log N / log log N)^2 ). Our second algorithm is a variant of the first that can be combined with a deterministic algorithm,
such as that of Yang and Anderson, to obtain O(log N) worst-case RMR complexity. The combined algorithm thus achieves sub-logarithmic
expected RMR complexity while maintaining optimal worst-case RMR complexity.

The talk will be self-contained and no familiarity with distributed computing theory will be assumed.

This is joint work with Philipp Woelfel.

1/04/2009, EE: Meyer 861

Cryptography and security, from theory to practice

Dr. Benny Pinkas

Haifa University, Department of Computer Science

Research in cryptography and security has an incredible potential for fruitful interaction between theory and practice, as was
demonstrated by the very successful deployment of public key technology. On the other hand, there has been much less interaction
between theory and practice in other areas of cryptographic research. For instance, cryptographers have devised procedures for
performing many seemingly impossible tasks, using zero-knowledge proofs and protocols for secure multi-party computation, but
none of these procedures is in daily use.

This talk will describe two research efforts which aim to close the gap between the theory and practice of cryptography. The
first is in secure multi-party computation, a technology which enables a set of untrusting parties to compute any function of
their private inputs while revealing nothing but the result of the function. We will describe recent advances in implementing
secure computation, involving both theoretical research and implementation experiments.

The second area of research is an investigation of the security of the pseudo-random generators of operating systems, which is
crucial for almost any security application. We examined the pseudo-random number generators of the Linux and
Windows operating systems, whose details had never been revealed before our work. In both cases we presented weaknesses and attacks,
with the goal of motivating manufacturers to improve the security of these components.

25/03/2009, Meyer 1061

Dynamics of Peer-to-Peer Networks, or Who is Going to be The Next Pop Star?

Prof. Yuval Shavitt

Tel Aviv University, School of Electrical Engineering

P2P file-sharing applications are quickly being adopted by a wider and more
mainstream audience. There is much to be learned from keyword searches
users perform in order to retrieve content from these networks.

This talk presents an unprecedented large-scale (both in time and volume)
measurement study of search terms in the modern Gnutella network. We
analyzed both static snapshots of the Gnutella network and the dynamics
of the network over time. In particular, we show how analysis of the query strings
over time and space (geography) can help us detect emerging artists in the US
with a high success rate (over 20% on average and about 50% at peak).

The talk is based on two papers co-authored with Tomer Tankel, Noam
Koenigstein, and Adam Shaked-Gish.

The second paper recently received wide media exposure both in Israel
(Yediot Aharonot, glz, Channel 2) and abroad (Popular Science, The New
Scientist, Daily Telegraph, Seed Magazine, MSNBC and more).

18/03/2009, Meyer 1061

Approximate Shared-Memory Counting Despite a Strong Adversary

Keren Censor

Technion, Computer Science Department

Most randomized consensus algorithms embed a shared-coin algorithm that allows the processes to flip a common coin, guaranteeing
that, for each possible outcome, there is a constant probability that all processes agree on that outcome. To obtain good
guarantees from the shared coin, these implementations require the processes to count.

A new randomized asynchronous shared-memory data structure is given for implementing an approximate counter that can be incremented up to
n times. For any fixed epsilon, the counter achieves a relative error of delta with high probability, at the cost of
O(((1/delta) log n)^{O(1/epsilon)}) register operations per increment and O(n^{4/5+epsilon}((1/delta) log n)^{O(1/epsilon)}) register
operations per read. The counter combines randomized sampling for estimating large values with an expander for estimating small values.
This is the first sublinear solution to this problem that works despite a strong adversary scheduler that can observe internal states of
processes.
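
As a toy illustration of the "randomized sampling for estimating large values" ingredient (our sketch, not the paper's construction, and without its strong-adversary guarantees): increment a counter only with probability p and scale the reading by 1/p.

    import random

    class SampledCounter:
        # Toy sketch: estimate a count by sampling increments. The counter
        # from the talk is different (it also uses an expander for small
        # values and tolerates a strong adversary); this conveys the flavor.
        def __init__(self, p=0.01):
            self.p = p      # sampling probability
            self.hits = 0   # sampled increments observed

        def increment(self):
            if random.random() < self.p:
                self.hits += 1

        def read(self):
            return self.hits / self.p   # unbiased estimate of the true count

The estimate's relative error shrinks as the true count grows, which is why sampling handles large values well while a different mechanism is needed for small ones.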

An application of the improved counter is an improved protocol for solving randomized shared-memory consensus, which reduces the best
previously known individual work complexity from O(n log n) to an optimal O(n), resolving one of the last remaining open problems
concerning consensus in this model.

The talk is based on joint work with Hagit Attiya and James Aspnes [PODC 2008, SODA 2009].

11/2/2009, EE: Meyer 861

To Peer or not to Peer? That’s the question!

Prof. Fernando Kuipers

Delft University of Technology, Network Architectures and Services group

P2P technology has been a popular means of sharing files over the Internet. With the increase in popularity of video-based
services, the question emerges whether P2P will also become a key technology for delivering real-time video services.
These services can be delivered (1) via a dedicated network that is under the full control of a service provider, or (2) via a
peer-to-peer protocol over the Internet, which means less control but better scalability. Many operators are currently
struggling with the question of whether to adopt P2P technology. An important criterion in this decision process is the
quality of the services as experienced by their end-users. In this talk we will provide more insight into this matter by
studying the Quality of Experience of a P2PTV application called SopCast.

28/1/2009, EE: Meyer 861

Mitigating Congestion in High-Speed Interconnects for Computer Clusters

Vladimir Zdornov

The Technion, EE faculty

In large computer clusters, such as nearly all current supercomputers, congestion is an inherent problem that arises from oversubscription of communication resources. Moreover, the limited number of buffers results in rapid “congestion spreading”, whereby communication along paths that do not traverse congested links is also severely degraded.

Cluster interconnection networks constitute a managed environment, in which the behavior of all elements can be defined to implement new protocols without jeopardizing interoperability. This property, combined with very low delays and reliable communication, creates an opportunity to apply sophisticated control mechanisms aimed at avoiding the damage caused by congestion-related phenomena.

In this work, we propose novel mechanisms for adaptive routing and rate control, and study them under performance goals specific to cluster networks.

Our adaptive routing uses virtual circuits to guarantee in-order packet delivery. The path setup process is performed locally by switches; importantly, we avoid blocking and backtracking. The scheme is shown to reduce the maximum contention by up to 50% in a slightly enriched fat-tree topology, for random permutation traffic.

Unlike most current rate control schemes, such as those used in TCP and in InfiniBand, our proposed rate control schemes rely on explicit rate calculation. We formulate three different performance goals and present distributed algorithms that achieve them. Simulation results on synthetic benchmarks demonstrate that the algorithms have a significant impact on the relevant measures. In addition, we prove tight theoretical bounds that reflect the behavior of the algorithms under unsuitable performance goals (among the three defined).
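
To make "explicit rate calculation" concrete, here is a minimal centralized sketch of one classic performance goal, max-min fairness, computed by water-filling. The thesis presents distributed algorithms and its own three goals, so this is an illustration, not the proposed mechanism.

    def max_min_rates(links, flows):
        # Centralized water-filling for max-min fair explicit rates.
        # links: {link: capacity}; flows: {flow: set of links it traverses}.
        # Assumes every flow traverses at least one link.
        rates, remaining, active = {}, dict(links), set(flows)
        while active:
            # Bottleneck: the link offering the smallest equal share
            # to the active flows crossing it.
            share, bottleneck = min(
                (remaining[l] / sum(1 for f in active if l in flows[f]), l)
                for l in remaining
                if any(l in flows[f] for f in active)
            )
            for f in [f for f in active if bottleneck in flows[f]]:
                rates[f] = share
                for l in flows[f]:
                    remaining[l] -= share   # consume capacity along the path
                active.remove(f)
        return rates

For example, with two flows sharing a 10 Gb/s link, one of which also crosses a 3 Gb/s link, the first flow is frozen at 3 Gb/s (its bottleneck) and the other receives the remaining 7 Gb/s.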

M.Sc. research under the supervision of Prof. Yitzhak Birk

14/1/2009, EE: Meyer 861

Toward Task Completion Assistance in Web Information Retrieval

Dr. Ronny Lempel

Director of Yahoo! Research, Haifa

This talk will describe some of the challenges currently addressed by Web Information Retrieval systems, namely search engines. The first generation of search engines closely resembled classic IR systems in their goal to return relevant documents to a user’s query. Those systems relied mostly on on-page contents. The second generation of search engines added the links between Web pages to the mix, and differentiated between several types of information needs. Today’s engines tap huge amounts of user generated content, and focus on helping users to complete tasks. This involves both assisting users to break down their tasks into sub-tasks, as well as interpretation, aggregation and integration of diverse content that allows users to more readily digest complex information. The talk will explore some of the search assistance tools available today and will highlight some of the technical challenges in their development.

7/1/2009, EE: Meyer 861

Audio Video Services in Home Networks: Overview of the MoCA protocol and 802.1 AVB

Dr. Philippe Klein

Associate Technical Director, Broadcom

MoCA is the standard for home entertainment networks over coax and allows for the distribution of multiple HD video streams,
multi-room gaming and whole-home DVR services in the home. Broadcom is a board member of the MoCA Alliance and an active contributor to the MoCA specifications.

The IEEE 802.1 Audio/Video Bridging Task Group charter is to provide the specifications that will allow end-to-end QoS and time-synchronized low-latency streaming services through 802 (802.3, 802.11) and “Coordinated Shared” networks. Broadcom chairs the IEEE 802.1 AVB TG and is an active contributor to its specifications.

http://www.mocalliance.org
http://www.ieee802.org/1/pages/avbridges.html

31/12/2008, EE: Meyer 861

Startup – Israel’s National Sport

Ofir Zohar

VP XIV, IBM

“In the talk ‘Startup – Israel’s National Sport’ I will talk about the
principles of building a startup and making a successful exit while telling
the story of XIV and using analogies from the world of sports”.

Ofir Zohar is the former CEO of XIV, a startup that was acquired by IBM in December
2007 for $300 million. XIV develops high-end enterprise storage systems.

http://it.themarker.com/tmit/article/4498

24/12/2008 12:30, Meyer 815

FPGA Technology

Udi Landen

VP of Software and IP Engineering, Altera

  • Silicon Technology Evolution
  • Why Use an FPGA?
  • Altera Overview
  • Architecting FPGA Architectures and CAD Tools
  • Leveraging FPGA Technology for Teaching
  • Future Directions

24/12/2008 11:30, Meyer 1007

Rate and Delay Controlled Core Networks: A Theory and a WAN Testbed Demonstration

Dr. Zvi Rosberg

CSIRO ICT Centre, Australia

Growing demand for streaming, voice and interactive gaming applications emphasizes the importance of quality of service (QoS) provisioning in the Internet, particularly the need for guarantees on maximum end-to-end delay and jitter, as well as for packet policing at the network edge. Current methods of QoS provisioning either raise scalability concerns (IntServ) or cannot guarantee end-to-end delay and jitter with acceptable packet loss unless bandwidth is over-provisioned (DiffServ). Furthermore, they cannot adapt the traffic rate entering at the network edge so as to protect the QoS of running applications.

I will present three combined rate and end-to-end delay controls and discuss their stability properties in a fluid network model. I will also present the implementation in network routers of one of the controls and demonstrate its stability and convergence in our WAN testbed.

To further demonstrate that a combined rate and end-to-end delay control can address the current DiffServ issues, we use the control as a building block for several shell protocols around the DiffServ services. The advantage over native DiffServ in our WAN testbed will also be presented using real video/voice/data applications, demonstrating the viability of our solution.

17/12/2008, EE: Meyer 861

RFID and Swarm Security Schemes

Prof. Shlomi Dolev

The Computer Science Department, Ben Gurion University

Two recent works on security in the young domains of RFID and swarm computing will be described.

For RFIDs: We consider repeated communication sessions between an RFID (Radio Frequency Identification) Tag and an RFID Verifier, and propose a proactive information-theoretic security scheme. The scheme is based on the assumption that the information exchanged during at least one of every n successive communication sessions is not exposed to an adversary. The Tag and the Verifier maintain a vector of n entries that is repeatedly refreshed by pairwise XORing its entries with a new vector of n entries, randomly chosen by the Tag and sent to the Verifier as part of each communication session. A computationally secure scheme, built on top of the information-theoretically secure scheme, ensures that even if the adversary listens to all the information exchanges, the communication between the Tag and the Verifier remains secure.
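
A minimal sketch of the refresh step just described (message transport, authentication, and the computational layer are omitted; all names are our assumptions):

    import os

    def xor_bytes(a, b):
        return bytes(x ^ y for x, y in zip(a, b))

    def refresh(state, fresh):
        # Pairwise XOR the n stored entries with the n fresh entries.
        return [xor_bytes(e, f) for e, f in zip(state, fresh)]

    n, entry_len = 8, 16
    shared = [os.urandom(entry_len) for _ in range(n)]   # initial pairing
    tag_state, verifier_state = list(shared), list(shared)

    # One session: the Tag draws a fresh random vector and sends it; both
    # sides fold it into their state. If the adversary misses at least one
    # of every n successive sessions, the common state stays unpredictable.
    fresh = [os.urandom(entry_len) for _ in range(n)]
    tag_state = refresh(tag_state, fresh)
    verifier_state = refresh(verifier_state, fresh)
    assert tag_state == verifier_state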

For Swarms: Secret sharing is a fundamental cryptographic task. Motivated by the virtual automata abstraction and swarm computing, we investigate an extension of the k-secret sharing scheme in which the secret shares are changed on the fly, independently and without (internal) communication, as a reaction to a global external trigger. The changes are made while maintaining the requirement that k or more secret shares may reconstruct the secret and no set of k-1 or fewer shares can do so. The application considered is a swarm of mobile processes, each maintaining a share of the secret, which may change according to common outside inputs, e.g., inputs received by sensors attached to the process. The proposed schemes support addition and removal of processes from the swarm, as well as corruption of a small portion of the processes in the swarm.
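
And a hedged sketch of the swarm idea, using standard Shamir sharing (our illustration; the talk's schemes may differ): each process derives the same zero-constant polynomial from the public trigger, so all shares change in lock-step without communication while the secret itself is untouched.

    import hashlib, random

    P = 2**127 - 1   # illustrative prime modulus

    def shamir_share(secret, k, n):
        # Standard k-out-of-n Shamir sharing: f(0) = secret, degree k-1.
        coeffs = [secret] + [random.randrange(P) for _ in range(k - 1)]
        return {i: sum(c * pow(i, j, P) for j, c in enumerate(coeffs)) % P
                for i in range(1, n + 1)}

    def refresh_share(share_i, i, trigger, k):
        # Every process seeds a PRNG with the common external trigger, so
        # all derive the SAME polynomial with constant term 0; adding its
        # value at i changes the share but not the shared secret.
        rng = random.Random(hashlib.sha256(trigger).digest())
        zero_coeffs = [0] + [rng.randrange(P) for _ in range(k - 1)]
        delta = sum(c * pow(i, j, P) for j, c in enumerate(zero_coeffs)) % P
        return (share_i + delta) % P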

The talk is based on joint work with Marina Kopeetsky, Adi Shamir, Limor Lahiani and Moti Yung.

10/12/2008, Meyer 1061

A Secure Cryptographic Token Interface

Dr. Christian Cachin

IBM Zurich Research Laboratory

Cryptographic keys must be protected from exposure. In real-world
applications, they are often guarded by cryptographic tokens that
employ sophisticated hardware-security measures. Several logical
attacks on the key management operations of cryptographic tokens
have been reported in the past, which allowed keys to be exposed
merely by exploiting the token API in unexpected ways.

This talk presents a novel, provably secure, cryptographic token
interface that supports multiple users, implements symmetric
cryptosystems and public-key schemes, and provides operations for
key generation, encryption, authentication, and key wrapping. The
token interface allows only the most important operations found in
real-world token APIs; while flexible enough to be of practical use, it is
restricted enough that it does not expose any key to a user
without sufficient privileges. An emulation of the security policy
in the industry-standard PKCS #11 interface is demonstrated.

Based on joint work with Nishanth Chandran.

3/12/2008 14:30, EE: Meyer 861

Portably Preventing File Race Attacks with User-Mode Path Resolution

Dr. Dan Tsafrir

IBM T. J. Watson Research Center

The filesystem API of contemporary systems exposes programs to TOCTTOU (time-of-check-to-time-of-use) race-condition vulnerabilities, which occur between pairs of check/use system calls that involve the name of a file. Existing solutions either help programmers detect such races (by pinpointing their location) or prevent them altogether (by altering the operating system). But the latter alternative is not prevalent, and the former is just the first step: programmers must still address TOCTTOU flaws within the limits of the existing API, with which several important tasks cannot be safely accomplished in a portable, straightforward manner. The recent “filesystem maze” attack further worsens the problem by allowing adversaries to deterministically win races, refuting the common perception that the risk is small. In the face of this threat, we develop a new algorithm that allows programmers to effectively aggregate a vulnerable pair of distinct system calls into a single operation that is executed “atomically”. This is achieved by emulating one piece of kernel functionality in user mode: file path resolution. The surprisingly simple resulting algorithm constitutes a portable solution to a large class of TOCTTOU vulnerabilities, without requiring modifications to the underlying operating system.
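
As a rough illustration of user-mode path resolution (a minimal sketch under assumed Unix semantics, not the paper's algorithm): the path is walked one component at a time, never following symlinks, so the name-to-object binding that is checked is the one actually used.

    import os

    def open_nofollow(path):
        # Minimal sketch: resolve an absolute path component by component,
        # refusing to follow symlinks, and return a descriptor for the
        # final component. Error handling and symlink policy are simplified.
        parts = [p for p in path.split('/') if p]
        assert path.startswith('/') and parts, "absolute, non-root path"
        dirfd = os.open('/', os.O_RDONLY | os.O_DIRECTORY)
        try:
            for comp in parts[:-1]:
                nxt = os.open(comp, os.O_RDONLY | os.O_DIRECTORY | os.O_NOFOLLOW,
                              dir_fd=dirfd)
                os.close(dirfd)
                dirfd = nxt
            # The final open names the object relative to an already-resolved
            # directory, closing the check/use gap on the full path.
            return os.open(parts[-1], os.O_RDONLY | os.O_NOFOLLOW, dir_fd=dirfd)
        finally:
            os.close(dirfd)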

Joint work with Tomer Hertz (Microsoft Research), David Wagner (Berkeley), and Dilma Da Silva (IBM T.J. Watson Research Center). Based on
http://www.usenix.org/events/fast08/tech/tsafrir.html

USENIX File & Storage Technologies (FAST’08). Awarded best paper.

3/12/2008 11:30, EE: Meyer 861

Robust and efficient TCAM-based classification

Dr. David Hay

Department of Electronics, Politecnico di Torino

Ternary content-addressable memory (TCAM) devices are increasingly used
for performing high-speed packet classification.
A TCAM consists of an associative memory that compares a search key in
parallel against all entries and thus provides high throughput.

In this talk we tackle two challenges related to these devices.

In the first part, we will consider the representation of rules that
contain range fields (e.g., source or destination ports).
Prior-art algorithms typically represent each such rule by multiple TCAM
entries, resulting in a range expansion
that dramatically reduces TCAM utilization. We present a novel scheme for
constructing efficient representations
of range rules, making use of extra bits available in each TCAM
entry. By extensive comparative analysis on real-life
classification databases, we establish that our scheme reduces the
number of redundant TCAM entries
caused by range rules by more than 60% compared with the best
prior-art range encoding.
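
For intuition about range expansion, here is the classical prefix-expansion baseline (an illustration, not the talk's encoding scheme): a single range rule must be split into the binary prefixes that cover it, each consuming one TCAM entry.

    def range_to_prefixes(lo, hi, bits=16):
        # Split [lo, hi] into a minimal set of binary prefixes; each prefix
        # is returned as (value, prefix_length).
        prefixes = []
        while lo <= hi:
            # Largest aligned power-of-two block that starts at lo...
            size = lo & -lo or (1 << bits)
            # ...shrunk until it fits inside [lo, hi].
            while lo + size - 1 > hi:
                size >>= 1
            k = size.bit_length() - 1
            prefixes.append((lo >> k, bits - k))
            lo += size
        return prefixes

For a 3-bit field, range_to_prefixes(1, 6, bits=3) returns [(1, 3), (1, 2), (2, 2), (6, 3)], i.e., 001, 01*, 10*, 110: one range rule costs four entries, and in general a w-bit range can cost up to 2w-2 entries.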

In the second part, we present PEDS, a novel parallel error-detection
scheme that locates the erroneous entries in a TCAM device.
PEDS is based on applying an error-detection code to each TCAM entry
and utilizing the parallel capabilities of the TCAM by
simultaneously checking the correctness of multiple TCAM entries. A key
feature of PEDS is that the number of TCAM lookup
operations required to locate all errors depends on the number of
symbols per entry rather than the (orders-of-magnitude larger)
number of TCAM entries. PEDS allows flexible and dynamic selection of
trade-off points between robustness, space
complexity, and number of lookups.

These are joint works with Anat Bremler-Barr (IDC, Israel), Danny Hendler
(BGU, Israel) and Ronny Roth (Technion, Israel), and
are supported by a Cisco grant.

26/11/2008, EE: Meyer 861

Improved Distributed Approximate Matching

Prof. Boaz Patt-Shamir

Department of Electrical Engineering, Tel Aviv University

We present improved algorithms for finding approximately optimal matchings in both weighted and unweighted graphs. For unweighted graphs, we give an algorithm providing a (1-epsilon)-approximation in O(log n) time for any constant epsilon>0. This result improves on the classical 1/2-approximation due to Israeli and Itai. As a by-product, we also provide an improved algorithm for unweighted matchings in bipartite graphs. In the context of weighted graphs, we give an algorithm that provides a (1/2-epsilon)-approximation in general graphs in O(log n) time. The latter result improves on the known (1/4-epsilon)-approximation in O(log n) time, or (1/2-epsilon)-approximation in O(log^2 n) time.

Joint work with Zvi Lotker (Ben-Gurion U.) and Seth Pettie (U. Michigan).

19/11/2008, EE: Meyer 861

Establishing Multiple Survivable Connections

Michael Keslassy

The Technion, EE faculty

With the increasing use of telecommunication networks, the ability of a network to recover from failures becomes a crucial requirement. In particular, link failures can be catastrophic in networks that are capable of transmitting very large amounts of data on each link. Accordingly, various routing schemes have been proposed for supporting network survivability. Such schemes handle link failures by rerouting connections through alternative paths. Survivable networks have been studied extensively in the context of optical wavelength-division multiplexing (WDM) networks and multiprotocol label switching (MPLS) protection.

We consider in this talk the establishment of multiple connections in survivable networks, where each connection is allocated a working (primary) path and a protection (backup) path. While many optimal solutions are known for the establishment of a single connection, only heuristics have been proposed for the multi-connection case.

Our talk will start with the basic, yet already complex, case where two connections are to be established. A differentiation will be made between online and offline frameworks, and between full and partial reliability. We will then extend our analysis to a K-connection problem, where K>2 is a fixed number and each connection needs to be fully reliable. For the online and offline variants, optimal solutions of polynomial complexity will be presented. Finally, the intractability of some extensions of the problem will be shown.
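
To ground the working/protection notion (an illustrative sketch, not the thesis' algorithms): for a single fully reliable connection over directed links, two link-disjoint paths exist exactly when a unit-capacity max-flow of 2 exists, checkable with two augmentations.

    from collections import deque

    def two_disjoint_paths_exist(arcs, s, t):
        # Sketch: arcs is a list of directed links (u, v); we try two
        # unit-capacity flow augmentations from s to t.
        cap = {}
        for u, v in arcs:
            cap[(u, v)] = cap.get((u, v), 0) + 1
            cap.setdefault((v, u), 0)          # residual arc

        def augment():
            parent, q = {s: None}, deque([s])
            while q:
                u = q.popleft()
                for (x, v), c in cap.items():  # scan residual arcs from u
                    if x == u and c > 0 and v not in parent:
                        parent[v] = u
                        q.append(v)
            if t not in parent:
                return False
            v = t
            while parent[v] is not None:       # push one unit along the path
                u = parent[v]
                cap[(u, v)] -= 1
                cap[(v, u)] += 1
                v = u
            return True

        return augment() and augment()

The working and protection paths themselves can then be recovered by decomposing the resulting flow (omitted here for brevity).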

M.Sc. research under the supervision of Prof. Ariel Orda.

12/11/2008, EE: Meyer 861

Parallel Job Scheduling and Workloads

Prof. Dror Feitelson

School of Computer Science and Engineering, The Hebrew University of Jerusalem

Parallel job scheduling is concerned with the scheduling of parallel jobs — jobs composed of multiple interacting processes — on large-scale parallel machines or clusters. New scheduling schemes are typically evaluated using simulations. This has to be done using representative workloads that have similar characteristics to the workloads on real production parallel systems. Many studies therefore use actual workload traces from production systems to drive the simulations. However, using such traces may lead to unforeseen problems.

This talk will start with a very simple approach to parallel job scheduling, called backfilling, which is in wide use today. Backfilling requires users to provide runtime estimates for submitted jobs, so that the scheduler will be able to plan its actions. Surprisingly, the results of performance evaluations are very sensitive to models of how such estimates are generated.

We’ll then review the importance of feedback in workloads, where the submittal of additional jobs depends on the performance of the system when servicing previous jobs. Regrettably, feedback data is not included explicitly in workload traces, so we need models of user behavior to extract it. Modeling user behavior such as feedback and daily cycles of activity is important not only for evaluations, but also as an enabler of novel approaches. Finally, we’ll demonstrate the need to clean workload data before it is used in evaluations.
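
For readers unfamiliar with backfilling, here is a minimal sketch of the widely used EASY variant (our simplification with assumed names, not code from the talk): a waiting job may jump ahead only if, according to its user-supplied runtime estimate, it will finish before the reservation made for the job at the head of the queue.

    from dataclasses import dataclass

    @dataclass
    class Job:
        name: str
        nodes: int       # processors requested
        estimate: float  # user-supplied runtime estimate (seconds)

    def easy_backfill(queue, free_nodes, now, head_reservation):
        # One pass of EASY backfilling (simplified sketch). head_reservation
        # is the time at which the scheduler guarantees enough nodes for the
        # head job, computed elsewhere from running jobs' estimates.
        started = []
        for job in list(queue):
            is_head = queue and job is queue[0]
            fits = job.nodes <= free_nodes
            done_in_time = now + job.estimate <= head_reservation
            if fits and (is_head or done_in_time):
                queue.remove(job)
                free_nodes -= job.nodes
                started.append(job)
        return started, free_nodes

Because the decision hinges on job.estimate, over- or under-estimation directly changes which jobs backfill, which is why evaluation results are so sensitive to how estimates are modeled.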
